Emerson Network Power today announced that it is entering into a significant partnership with IBM to both integrate Emerson’s new Trellis DCIM suite into IBM’s ITSM products as well as to jointly sell Trellis to IBM customers. This partnership has the potential to reshape the DCIM market segment for several reasons:
Connection to enterprise IT — Emerson has sold a lot of chillers, UPS and PDU equipment and has tremendous cachet with facilities types, but they don’t have a lot of people who know how to talk IT. IBM has these people in spades.
IBM can use a DCIM offering — IBM, despite being a huge player in the IT infrastructure and data center space, does not have a DCIM product. Its Maximo product seems to be more of a dressed up asset management product, and this partnership is an acknowledgement of the fact that to build a full-fledged DCIM product would have been both expensive and time-consuming.
IBM adds sales bandwidth — My belief is that the development of the DCIM market has been delivery bandwidth constrained. Market leaders Nlyte, Emerson and Schneider do not have enough people to address the emerging total demand, and the host of smaller players are even further behind. IBM has the potential to massively multiply Emerson’s ability to deliver to the market.
This week, the New York Times ran a series of articles about data center power use (and abuse) “Power, Pollution and the Internet” (http://nyti.ms/Ojd9BV) and “Data Barns in a Farm Town, Gobbling Power and Flexing Muscle” (http://nyti.ms/RQDb0a). Among the claims made in the articles were that data centers were “only using 6 to 12 % of the energy powering their servers to deliver useful computation. Like a lot of media broadsides, the reality is more complex than the dramatic claims made in these articles. Technically they are correct in claiming that of the electricity going to a server, only a very small fraction is used to perform useful work, but this dramatic claim is not a fair representation of the overall efficiency picture. The Times analysis fails to take into consideration that not all of the power in the data center goes to servers, so the claim of 6% efficiency of the servers is not representative of the real operational efficiency of the complete data center.
On the other hand, while I think the Times chooses drama over even-keeled reporting, the actual picture for even a well-run data center is not as good as its proponents would claim. Consider:
A new data center with a PUE of 1.2 (very efficient), with 83% of the power going to IT workloads.
Then assume that 60% of the remaining power goes to servers (storage and network get the rest), for a net of almost 50% of the power going into servers. If the servers are running at an average utilization of 10%, then only 10% of 50%, or 5% of the power is actually going to real IT processing. Of course, the real "IT number" is the server + plus storage + network, so depending on how you account for them, the IT usage could be as high as 38% (.83*.4 + .05).
Only a few months since I authored Forrester’s "Market Overview: Data Center Infrastructure Management Solutions," significant changes merit some additional commentary.
The major vendor drama of the “season” is the continued evolution of Schneider and Emerson’s DCIM product rollout. Since Schneider’s worldwide analyst conference in Paris last week, we now have pretty good visibility into both major vendors' strategy and products. In a nutshell, we have two very large players, both with large installed bases of data center customers, and both selling a vision of an integrated modular DCIM framework. More importantly it appears that both vendors can deliver on this promise. That is the good news. The bad news is that their offerings are highly overlapped, and for most potential customers the choice will be a difficult one. My working theory is that whoever has the largest footprint of equipment will have an advantage, and that a lot depends on the relative execution of their field marketing and sales organizations as both companies rush to turn 1000s of salespeople and partners loose on the world with these products. This will be a classic market share play, with the smart strategy being to sacrifice margin for market share, since DCIM solutions have a high probability of pulling through services, and usually involve some annuity revenue stream from support and update fees.
This week AMD finally released their AMD 6200 and 4200 series CPUs. These are the long-awaited server-oriented Interlagos and Valencia CPUs, based on their new “Bulldozer” core, offering up to 16 x86 cores in a single socket. The announcement was targeted at (drum roll, one guess per customer only) … “The Cloud.” AMD appears to be positioning its new architectures as the platform of choice for cloud-oriented workloads, focusing on highly threaded throughput oriented benchmarks that take full advantage of its high core count and unique floating point architecture, along with what look like excellent throughput per Watt metrics.
At the same time it is pushing the now seemingly mandatory “cloud” message, AMD is not ignoring the meat-and-potatoes enterprise workloads that have been the mainstay of server CPUs sales –virtualization, database, and HPC, where the combination of many cores, excellent memory bandwidth and large memory configurations should yield excellent results. In its competitive comparisons, AMD targets Intel’s 5640 CPU, which it claims represents Intel’s most widely used Xeon CPU, and shows very favorable comparisons in regards to performance, price and power consumption. Among the features that AMD cites as contributing to these results are:
Advanced power and thermal management, including the ability to power off inactive cores contributing to an idle power of less than 4.4W per core. Interlagos offers a unique capability called TDP, which allows I&O groups to set the total power threshold of the CPU in 1W increments to allow fine-grained tailoring of power in the server racks.
Turbo CORE, which allows boosting the clock speed of cores by up to 1 GHz for half the cores or 500 MHz for all the cores, depending on workload.
Emerging ARM server Calxeda has been hinting for some time that they had a significant partnership announcement in the works, and while we didn’t necessarily not believe them, we hear a lot of claims from startups telling us to “stay tuned” for something big. Sometimes they pan out, sometimes they simply go away. But this morning Calxeda surpassed our expectations by unveiling just one major systems partner – but it just happens to be Hewlett Packard, which dominates the WW market for x86 servers.
At its core (unintended but not bad pun), the HP Hyperscale business unit Project Moonshot and Calxeda’s server technology are about improving the efficiency of web and cloud workloads, and promises improvements in excess of 90% in power efficiency and similar improvements in physical density compared with current x86 solutions. As I noted in my first post on ARM servers and other documents, even if these estimates turn out to be exaggerated, there is still a generous window within which to do much, much, better than current technologies. And workloads (such as memcache, Hadoop, static web servers) will be selected for their fit to this new platform, so the workloads that run on these new platforms will potentially come close to the cases quoted by HP and Calxeda.
I recently published an update on power and cooling in the data center (http://www.forrester.com/go?docid=60817), and as I review it online, I am struck by the combination of old and new. The old – the evolution of semiconductor technology, the increasingly elegant attempts to design systems and components that can be incrementally throttled, and the increasingly sophisticated construction of the actual data centers themselves, with increasing modularity and physical efficiency of power and cooling.
The new is the incredible momentum I see behind Data Center Infrastructure Management software. In a few short years, DCIM solutions have gone from simple aggregated viewing dashboards to complex software that understands tens of thousands of components, collects, filters and analyzes data from thousands of sensors in a data center (a single CRAC may have in excess of 20 sensors, a server over a dozen, etc.) and understands the relationships between components well enough to proactively raise alarms, model potential workload placement and make recommendations about prospective changes.
Of all the technologies reviewed in the document, DCIM offers one of the highest potentials for improving overall efficiency without sacrificing reliability or scalability of the enterprise data center. While the various DCIM suppliers are still experimenting with business models, I think that it is almost essential for any data center operations group that expects significant change, be it growth, shrinkage, migration or a major consolidation or cloud project, to invest in DCIM software. DCIM consumers can expect to see major competitive action among the current suppliers, and there is a strong potential for additional consolidation.
Calxeda, one of the most visible stealth mode startups in the industry, has finally given us an initial peek at the first iteration of its server plans, and they both meet our inflated expectations from this ARM server startup and validate some of the initial claims of ARM proponents.
While still holding their actual delivery dates and details of specifications close to their vest, Calxeda did reveal the following cards from their hand:
The first reference design, which will be provided to OEM partners as well as delivered directly to selected end users and developers, will be based on an ARM Cortex A9 quad-core SOC design.
The SOC, as Calxeda will demonstrate with one of its reference designs, will enable OEMs to design servers as dense as 120 ARM quad-core nodes (480 cores) in a 2U enclosure, with an average consumption of about 5 watts per node (1.25 watts per core) including DRAM.
While not forthcoming with details about the performance, topology or protocols, the SOC will contain an embedded fabric for the individual quad-core SOC servers to communicate with each other.
Most significantly for prospective users, Calxeda is claiming, and has some convincing models to back up these claims, that they will provide a performance advantage of 5X to 10X the performance/watt and (even higher when price is factored in for a metric of performance/watt/$) of any products they expect to see when they bring the product to market.
Intel, despite a popular tendency to associate a dominant market position with indifference to competitive threats, has not been sitting still waiting for the ARM server phenomenon to engulf them in a wave of ultra-low-power servers. Intel is fiercely competitive, and it would be silly for any new entrants to assume that Intel will ignore a threat to the heart of a high-growth segment.
In 2009, Intel released a microserver specification for compact low-power servers, and along with competitor AMD, it has been aggressive in driving down the power envelope of its mainstream multicore x86 server products. Recent momentum behind ARM-based servers has heated this potential competition up, however, and Intel has taken the fight deeper into the low-power realm with the recent introduction of the N570, a an existing embedded low-power processor, as a server CPU aimed squarely at emerging ultra-low-power and dense servers. The N570, a dual-core Atom processor, is being currently used by a single server partner, ultra-dense server manufacturer SeaMicro (see Little Servers For Big Applications At Intel Developer Forum), and will allow them to deliver their current 512 Atom cores with half the number of CPU components and some power savings.
Technically, the N570 is a dual-core Atom CPU with 64 bit arithmetic, a differentiator against ARM, and the same 32-bit (4 GB) physical memory limitations as current ARM designs, and it should have a power dissipation of between 8 and 10 watts.
From nothing more than an outlandish speculation, the prospects for a new entrant into the volume Linux and Windows server space have suddenly become much more concrete, culminating in an immense buzz at CES as numerous players, including NVIDIA and Microsoft, stoked the fires with innuendo, announcements, and demos.
Consumers of x86 servers are always on the lookout for faster, cheaper, and more power-efficient servers. In the event that they can’t get all three, the combination of cheaper and more energy-efficient seems to be attractive to a large enough chunk of the market to have motivated Intel, AMD, and all their system partners to develop low-power chips and servers designed for high density compute and web/cloud environments. Up until now the debate was Intel versus AMD, and low power meant a CPU with four cores and a power dissipation of 35 – 65 Watts.
The Promised Land
The performance trajectory of processors that were formerly purely mobile device processors, notably the ARM Cortex, has suddenly introduced a new potential option into the collective industry mindset. But is this even a reasonable proposition, and if so, what does it take for it to become a reality?
Our first item of business is to figure out whether or not it even makes sense to think about these CPUs as server processors. My quick take is yes, with some caveats. The latest ARM offering is the Cortex A9, with vendors offering dual core products at up to 1.2 GHz currently (the architecture claims scalability to four cores and 2 GHz). It draws approximately 2W, much less than any single core x86 CPU, and a multi-core version should be able to execute any reasonable web workload. Coupled with the promise of embedded GPUs, the notion of a server that consumes much less power than even the lowest power x86 begins to look attractive. But…
In a recent discussion with a group of infrastructure architects, power architecture, especially UPS engineering, was on the table as a topic. There was general agreement that UPS systems are a necessary evil, cumbersome and expensive beasts to put into a DC, and a lot of speculation on alternatives. There was general consensus that the goal was to develop a solution that would be more granular install and deploy and thus allow easier and ad-hoc decisions about which resources to protect, and agreement that battery technologies and current UPS architectures were not optimal for this kind of solution.
So what if someone were to suddenly expand battery technology R&D investment by a factor of maybe 100x of R&D and into battery technology, expand high-capacity battery production by a giant factor, and drive prices down precipitously? That’s a tall order for today’s UPS industry, but it’s happening now courtesy of the auto industry and the anticipated wave of plug-in hybrid cars. While batteries for cars and batteries for computers certainly have their differences in terms of depth and frequency of charge/discharge cycles, packaging, lifespan, etc, there is little doubt that investments in dense and powerful automotive batteries and power management technology will bleed through into the data center. Throw in recent developments in high-charge capacitors (referred to in the media as “super capacitors”), which add the impedance match between the requirements for spike demands and a chemical battery’s dislike of sudden state changes, and you have all the foundational ingredients for major transformation in the way we think about supplying backup power to our data center components.