This week AMD finally released their AMD 6200 and 4200 series CPUs. These are the long-awaited server-oriented Interlagos and Valencia CPUs, based on their new “Bulldozer” core, offering up to 16 x86 cores in a single socket. The announcement was targeted at (drum roll, one guess per customer only) … “The Cloud.” AMD appears to be positioning its new architectures as the platform of choice for cloud-oriented workloads, focusing on highly threaded throughput oriented benchmarks that take full advantage of its high core count and unique floating point architecture, along with what look like excellent throughput per Watt metrics.
At the same time it is pushing the now seemingly mandatory “cloud” message, AMD is not ignoring the meat-and-potatoes enterprise workloads that have been the mainstay of server CPUs sales –virtualization, database, and HPC, where the combination of many cores, excellent memory bandwidth and large memory configurations should yield excellent results. In its competitive comparisons, AMD targets Intel’s 5640 CPU, which it claims represents Intel’s most widely used Xeon CPU, and shows very favorable comparisons in regards to performance, price and power consumption. Among the features that AMD cites as contributing to these results are:
Advanced power and thermal management, including the ability to power off inactive cores contributing to an idle power of less than 4.4W per core. Interlagos offers a unique capability called TDP, which allows I&O groups to set the total power threshold of the CPU in 1W increments to allow fine-grained tailoring of power in the server racks.
Turbo CORE, which allows boosting the clock speed of cores by up to 1 GHz for half the cores or 500 MHz for all the cores, depending on workload.
Emerging ARM server Calxeda has been hinting for some time that they had a significant partnership announcement in the works, and while we didn’t necessarily not believe them, we hear a lot of claims from startups telling us to “stay tuned” for something big. Sometimes they pan out, sometimes they simply go away. But this morning Calxeda surpassed our expectations by unveiling just one major systems partner – but it just happens to be Hewlett Packard, which dominates the WW market for x86 servers.
At its core (unintended but not bad pun), the HP Hyperscale business unit Project Moonshot and Calxeda’s server technology are about improving the efficiency of web and cloud workloads, and promises improvements in excess of 90% in power efficiency and similar improvements in physical density compared with current x86 solutions. As I noted in my first post on ARM servers and other documents, even if these estimates turn out to be exaggerated, there is still a generous window within which to do much, much, better than current technologies. And workloads (such as memcache, Hadoop, static web servers) will be selected for their fit to this new platform, so the workloads that run on these new platforms will potentially come close to the cases quoted by HP and Calxeda.
There has been a lot of ill-considered press coverage about the “death” of UNIX and coverage of the wholesale migration of UNIX workloads to LINUX, some of which (the latter, not the former) I have contributed to. But to set the record straight, the extinction of UNIX is not going to happen in our lifetime.
While UNIX revenues are not growing at any major clip, it appears as if they have actually had a slight uptick over the past year, probably due to a surge by IBM, and seem to be nicely stuck around the $18 - 20B level annual range. But what is important is the “why,” not the exact dollar figure.
UNIX on proprietary RISC architectures will stay around for several reasons that primarily revolve around their being the only close alternative to mainframes in regards to specific high-end operational characteristics:
Performance – If you need the biggest single-system SMP OS image, UNIX is still the only realistic commercial alternative other than mainframes.
Isolated bulletproof partitionability – If you want to run workload on dynamically scalable and electrically isolated partitions with the option to move workloads between them while running, then UNIX is your answer.
Near-ultimate availability – If you are looking for the highest levels of reliability and availability ex mainframes and custom FT systems, UNIX is the answer. It still possesses slight availability advantages, especially if you factor in the more robust online maintenance capabilities of the leading UNIX OS variants.
I just spent several days at Dell World, and came away with the impression of a company that is really trying to change its image. Old Dell was boxes, discounts and low cost supply chain. New Dell is applications, solution, cloud (now there’s a surprise!) and investments in software and integration. OK, good image, but what’s the reality? All in all, I think they are telling the truth about their intentions, and their investments continue to be aligned with these intentions.
As I wrote about a year ago, Dell seems to be intent on climbing up the enterprise food chain. It’s investment in several major acquisitions, including Perot Systems for services and a string of advanced storage, network and virtual infrastructure solution providers has kept the momentum going, and the products have been following to market. At the same time I see solid signs of continued investment in underlying hardware, and their status as he #1 x86 server vendor in N. America and #2 World-Wide remains an indication of their ongoing success in their traditional niches. While Dell is not a household name in vertical solutions, they have competent offerings in health care, education and trading, and several of the initiatives I mentioned last year are definitely further along and more mature, including continued refinement of their VIS offerings and deep integration of their much-improved DRAC systems management software into mainstream management consoles from VMware and Microsoft.
OK, out of respect for your time, now that I’ve caught you with a title that promises some drama I’ll cut to the chase and tell you that I definitely lean toward the former. Having spent a couple of days here at Oracle Open World poking around the various flavors of Engineered Systems, including the established Exadata and Exalogic along with the new SPARC Super Cluster (all of a week old) and the newly announced Exalytic system for big data analytics, I am pretty convinced that they represent an intelligent and modular set of optimized platforms for specific workloads. In addition to being modular, they give me the strong impression of a “composable” architecture – the various elements of processing nodes, Oracle storage nodes, ZFS file nodes and other components can clearly be recombined over time as customer requirements dictate, either as standard products or as custom configurations.
Well actually I meant mobs of flash, but I couldn’t resist the word play. Although, come to think of it, flash mobs might be the right way to describe the density of flash memory system vendors here at Oracle Open World. Walking around the exhibits it seems as if every other booth is occupied by someone selling flash memory systems to accelerate Oracle’s database, and all of them claiming to be: 1) faster than anything that Oracle, who already integrates flash into its systems, offers, and 2) faster and/or cheaper than the other flash vendor two booths down the aisle.
All joking aside, the proliferation of flash memory suppliers is pretty amazing, although a venue devoted to the world’s most popular database would be exactly where you might expect to find them. In one sense flash is nothing new – RAM disks, arrays of RAM configured to mimic a disk, have been around since the 1970s but were small and really expensive, and never got on a cost and volume curve to drive them into a mass-market product. Flash, benefitting not only from the inherent economies of semiconductor technology but also from the drivers of consumer volumes, has the transition to a cost that makes it a reasonable alternative for some use case, with database acceleration being probably the most compelling. This explains why the flash vendors are gathered here in San Francisco this week to tout their wares – this is the richest collection of potential customers they will ever see in one place.
In the good old days, computer industry trade shows were bigger than life events – booths with barkers and actors, ice cream and espresso bars and games in the booth, magic acts and surging crowds gawking at technology. In recent years, they have for the most part become sad shadows of their former selves. The great SHOWS are gone, replaced with button-down vertical and regional events where you are lucky to get a pen or a miniature candy bar for your troubles.
Enter Oracle OpenWorld. Mix 45,000 people, hundreds of exhibitors, one of the world’s largest software and systems company looking to make an impression, and you have the new generation of technology extravaganza. The scale is extravagant, taking up the entire Moscone Center complex (N, S and W) along with a couple of hotel venues, closing off a block of a major San Francisco street for a week, and throwing a little evening party for 20 or 30 thousand people.
But mixed with the hoopla, which included wheel of fortune giveaways that had hundreds of people snaking around the already crowded exhibition floor in serpentine lines, mini golf and whack-a-mole-games in the exhibit booths along with the aforementioned espresso and ice cream stands, there was genuine content and the public face of some significant trends. So far, after 24 hours, some major messages come through loud and clear:
I recently published an update on power and cooling in the data center (http://www.forrester.com/go?docid=60817), and as I review it online, I am struck by the combination of old and new. The old – the evolution of semiconductor technology, the increasingly elegant attempts to design systems and components that can be incrementally throttled, and the increasingly sophisticated construction of the actual data centers themselves, with increasing modularity and physical efficiency of power and cooling.
The new is the incredible momentum I see behind Data Center Infrastructure Management software. In a few short years, DCIM solutions have gone from simple aggregated viewing dashboards to complex software that understands tens of thousands of components, collects, filters and analyzes data from thousands of sensors in a data center (a single CRAC may have in excess of 20 sensors, a server over a dozen, etc.) and understands the relationships between components well enough to proactively raise alarms, model potential workload placement and make recommendations about prospective changes.
Of all the technologies reviewed in the document, DCIM offers one of the highest potentials for improving overall efficiency without sacrificing reliability or scalability of the enterprise data center. While the various DCIM suppliers are still experimenting with business models, I think that it is almost essential for any data center operations group that expects significant change, be it growth, shrinkage, migration or a major consolidation or cloud project, to invest in DCIM software. DCIM consumers can expect to see major competitive action among the current suppliers, and there is a strong potential for additional consolidation.
Last year I wrote about Oracle’s new plans for SPARC, anchored by a new line of SPARC CPUs engineered in conjunction with Fujitsu (Does SPARC have a Future?), and commented that the first deliveries of this new technology would probably be in early 2012, and until we saw this tangible evidence of Oracle’s actual execution of this road map we could not predict with any confidence the future viability of SPARC.
The T4 CPU
Fast forward a year and Oracle has delivered the first of the new CPUs, ahead of schedule and with impressive gains in performance that make it look like SPARC will remain a viable platform for years. Specifically, Oracle has introduced the T4 CPU and systems based on them. The T4, an evolution of Oracle’s highly threaded T-Series architecture, is implemented with an entirely new core that will form the basis, with variations in number of threads versus cores and cache designs, of the future M and T series systems. The M series will have fewer threads and more performance per thread, while the T CPUs will, like their predecessors, emphasize throughput for highly threaded workloads. The new T4 will have 8 cores, and each core will have 8 threads. While the T4 emphasizes highly threaded workload performance, it is important to note that Oracles has radically improved single-thread performance over its predecessors, with Oracle claiming performance per thread improvements of 5X over its predecessors, greatly improving its utility as a CPU to power less thread-intensive workloads as well.
At the Hot Chips conference last week, Intel disclosed additional details about the upcoming Poulson Itanium CPU due for shipment early next year. For Itanium loyalists (essentially committed HP-UX customers) the disclosures are a ray of sunshine among the gloomy news that has been the lot of Itanium devotees recently.
Poulson will bring several significant improvements to Itanium in both performance and reliability. On the performance side, we have significant improvements on several fronts:
Process – Poulson will be manufactured with the same 32 nm semiconductor process that will (at least for a while) be driving the high-end Xeon processors. This is goodness all around – performance will improve and Intel now can load its latest production lines more efficiently.
More cores and parallelism – Poulson will be an 8-core processor with a whopping 54 MB of on-chip cache, and Intel has doubled the width of the multi-issue instruction pipeline, from 6 to 12 instructions. Combined with improved hyperthreading, the combination of 2X cores and 2X the total number of potential instructions executed per clock cycle by each core hints at impressive performance gains.
Architecture and instruction tweaks – Intel has added additional instructions based on analysis of workloads. This kind of tuning of processor architectures seldom results in major gains in performance, but every small increment helps.