On April 23, IBM rolled out the long-awaited POWER8 CPU, the successor to POWER7+. Given the extensive pre-announcement speculation, the hardware itself was no big surprise (the details are fascinating, but not suitable for this venue): it offers an estimated 30% to 50% improvement in application performance over the latest POWER7+, with potential for order-of-magnitude improvements on selected big data and analytics workloads. While the technology is interesting, we are pretty numb to the “bigger, better, faster” messaging that inevitably accompanies new hardware announcements, and the real impact of this announcement lies in its utility for current AIX users and IBM’s increased focus on Linux and its support of the OpenPOWER initiative.
OK, so we’re numb, but it’s still interesting. POWER8 is an entirely new processor generation implemented in 22 nm CMOS (the same geometry as Intel’s high-end CPUs). The processor features up to 12 cores, each with up to 8 threads, and a focus on not only throughput but high performance per thread and per core for low-thread-count applications. Added to the mix is up to 1 TB of memory per socket, massive PCIe 3 I/O connectivity and Coherent Accelerator Processor Interface (CAPI), IBM’s technology to deliver memory-controller-based access for accelerators and flash memory in POWER systems. CAPI figures prominently in IBM’s positioning of POWER as the ultimate analytics engine, with the announcement profiling the performance of a configuration using 40 TB of CAPI-attached flash for huge in-memory analytics at a fraction of the cost of a non-CAPI configuration.[i]
A Slam-dunk for AIX users and a new play for Linux
This week, IBM announced its new line of x86 servers, and included among the usual incremental product improvements is a performance game-changer called eXFlash. eXFlash is the first commercially available implementation of the MCS architecture announced last year by Diablo Technologies. The MCS architecture, and IBM’s eXFlash offering in particular, allows flash memory to be embedded on the system as close to the CPU as main memory, with latencies substantially lower than any other available flash options, offering better performance at a lower solution cost than other embedded flash solutions. Key aspects of the announcement include:
■ Flash DIMMs offer scalable high performance. Write latency (a critical metric) for IBM eXFlash will be in the 5 to 10 microsecond range, whereas best-of-breed competing mezzanine card and PCIe flash can only offer 15 to 20 microseconds (and external flash storage is slower still). Additionally, since the DIMMs are directly attached to the memory controller, flash I/O does not compete with other I/O on the system I/O hub and PCIe subsystem, improving overall system performance for heavily loaded systems. Additional benefits include linear performance scalability as the number of DIMMs increases and optional built-in hardware mirroring of DIMM pairs.
■ eXFlash DIMMs are compatible with current software. Part of the magic of MCS flash is that it appears to the OS as a standard block-mode device, so all existing block-mode software will work, including applications, caching and tiering or general storage management software. For IBM users, compatibility with IBM’s storage management and FlashCache Storage Accelerator solutions is guaranteed. Other vendors will face zero to low effort in qualifying their solutions.
My Forrester colleagues Ted Schadler and John McCarthy have written about the differences between Systems of Record (SoR) and Systems of Engagement (SoE) in the context of customer-facing systems and mobility, but after further conversations with some very smart people at IBM, I think there are also important reasons for infrastructure architects to understand this dichotomy. Scalable and flexible systems of engagement, built with the latest in dynamic web technology, and the back-end systems of record, highly stateful and usually transactional systems designed to keep track of the “true” state of corporate assets, are very different animals from an infrastructure standpoint in two fundamental areas:
Suitability to cloud (private or public) deployment – SoE environments, by their nature, are generally constructed using horizontally scalable technologies, generally based on some level of standards including web standards, Linux or Windows OS, and some scalable middleware that hides the messy details of horizontally scaling a complex application. In addition, the workloads are generally highly parallel, with each individual interaction being of low value. This characteristic leads to very different requirements for consistency and resiliency.
When I returned to Forrester in mid-2010, one of the first blog posts I wrote was about Oracle’s new roadmap for SPARC and Solaris, catalyzed by numerous client inquiries and other interactions in which Oracle’s real level of commitment to future SPARC hardware was the topic of discussion. In most cases I could describe the customer mood as skeptical at best, and panicked and committed to migration off of SPARC and Solaris at worst. Nonetheless, after some time spent with Oracle management, I expressed my improved confidence in the new hardware team that Oracle had assembled and their new roadmap for SPARC processors after the successive debacles of the UltraSPARC-5 and Rock processors under Sun’s stewardship.
Two and a half years later, it is obvious that Oracle has delivered on its commitments regarding SPARC and is continuing its investments in SPARC CPU and system design as well as its Solaris OS technology. The latest evolution of SPARC technology, the SPARC T5 and the soon-to-be-announced M5, continues the design practices set forth by Oracle’s Rick Hetherington in 2010: incremental evolution of a common set of SPARC cores, differentiation by variation of core count, threads and cache as opposed to fundamental architecture, and a reliable multi-year performance progression of cores and system scalability.
If you have dismissed Microsoft as a cloud platform player up to now, you might want to rethink that notion. With the latest release of Windows Azure here at Build, Microsoft’s premier developer shindig, this cloud service has become a serious contender for the top spot in cloud platforms. And all the old excuses that may have kept you away are quickly being eliminated.
In typical Microsoft fashion, the Redmond, Washington giant is attacking the cloud platform market with a competitive furor that can only be described as faster follower. In 2008, Microsoft quickly saw the disruptive change that Amazon Web Services (AWS) represented and accelerated its own lab project centered around delivering Windows as a cloud platform. Version 1.0 of Azure was decidedly different and immature and thus struggled to establish its place in the market. But with each iteration, Microsoft has expanded Azure’s applicability, appeal, and maturity. And the pace of change for Windows Azure has accelerated dramatically under the new leadership of Satya Nadella. He came over from the consumer Internet services side of Microsoft, where new features and capabilities are normally released every two weeks — not every two years, as had been the norm in the server and tools business prior to his arrival.
Nathan Bedford Forrest, a Confederate general of despicable ideology and consummate tactics, spoke of “keepin up the skeer,” applying continued pressure to opponents to prevent them from regrouping and counterattacking. POWER7+, the most recent version of IBM’s POWER architecture, anticipated as a follow-up to the POWER7 for almost a year, was finally announced this week, and appears to be “keepin up the skeer” in terms of its competitive potential for IBM POWER-based systems. In short, it is a hot piece of technology that will keep existing IBM users happy and should help IBM maintain its impressive momentum in the Unix systems segment.
For the chip heads, the CPU is implemented in a 32 nm process, the same as Intel’s upcoming Poulson, and embodies some interesting evolutions in high-end chip design, including:
Use of DRAM instead of SRAM — IBM has pioneered the use of embedded DRAM (eDRAM) as embedded L3 cache instead of the more standard and faster SRAM. In exchange for the loss of speed, eDRAM requires fewer transistors and lower power, allowing IBM to pack a total of 80 MB (a lot) of shared L3 cache, far more than any other product has ever sported.
Every culture has its coming of age rituals — Confirmation, Bar Mitzvah, being hunted by tribal elders, surviving in the wilderness, driving at high speed while texting — all of which mark the progress from childhood to adulthood. In the high-tech world, one of the rituals marking the maturation of a company is the user group. When a company has a strategy it wants to communicate, a critical mass of customers, and prospects bright enough that it wants to highlight them rather than obscure them, it is time for a user group meeting.
This year, a year after the acquisition of Novell by Attachmate and its subsequent instantiation as a standalone division, and in its 20th anniversary year, SUSE had its first user group meeting. All in all, the portents were good, and SUSE got its core messages across to an audience of about 500 of its users as well as a cadre of the more sophisticated (IMHO) industry analysts.
Among My Key Takeaways:
SUSE is a stable company with rational management – With profitable revenues of over $200M and a publicly stated plan to hit $234M for the next fiscal year, SUSE is a reasonably sized company (technically a division of $1.3B Attachmate, but it looks and acts like an independent company), with growth rates that look to be a couple of points higher than its segment.
SUSE’s management has done an excellent job of focusing the company – SUSE, acknowledging its size disadvantage relative to competitor Red Hat, has chosen to focus heavily on enterprise Linux, publicly disavowing desktop and mobile device directions. SUSE’s claim is that its market share relative to Red Hat is larger in the core enterprise segment than in the market overall. This is a hard number to even begin to tease out, but it feels like a reasonable claim.
I said last year that this would happen sometime in the first half of this year, but for some reason my colleagues and clients have kept asking me exactly when we would see a real ARM server running a real OS. How about now?
To copy from Calxeda’s most recent blog post:
“This week, Calxeda is showing a live Calxeda cluster running Ubuntu 12.04 LTS on real EnergyCore hardware at the Ubuntu Developer and Cloud Summit events in Oakland, CA. … This is the real deal; quad-core, w/ 4MB cache, secure management engine, and Calxeda’s fabric all up and running.”
This is a significant milestone for many reasons. It proves that Calxeda can indeed deliver a working server based on its scalable fabric architecture. Having HP sign up as a partner meant that this was essentially a non-issue, but still, proof is good. It also establishes that at least one Linux distribution provider, in this case Ubuntu, is willing to provide a real supported distribution. My guess is that Red Hat and CentOS will jump on the bus fairly soon as well.
Most importantly, we can get on with the important work of characterizing real benchmarks on real systems with real OS support. HP’s discovery centers will certainly play a part in this process as well, and I am willing to bet that by the end of the summer we will have some compelling data on whether the ARM server will deliver on its performance and energy efficiency promises. It’s not a slam dunk guaranteed win – Intel has been steadily ratcheting up its energy efficiency, and the latest generation of x86 servers from HP, IBM, Dell, and others shows promise of much better throughput per watt than its predecessors. Add to that the demonstration of a Xeon-based system by SeaMicro (ironically now owned by AMD) that delivered Xeon CPUs at a 10 W per CPU power overhead, an unheard-of efficiency.
In the latest evolution of its Linux push, IBM has added to its non-x86 Linux server line with the introduction of new dedicated Power 7 rack and blade servers that only run Linux. “Hah!” you say. “Power already runs Linux, and quite well according to IBM.” This is indeed true, but when you look at the price/performance of Linux on standard Power, the picture is not quite as advantageous, with the higher cost of Power servers compared to x86 servers offsetting much if not all of the performance advantage.
Enter the new Flex System p24L (Linux) Compute Node blade for the new PureFlex system and the IBM PowerLinux 7R2 rack server. Both are dedicated Linux-only systems with two Power 7 processors (6 or 8 cores each, 4 threads per core), and are shipped with unlimited licenses for IBM’s PowerVM hypervisor. Most importantly, these systems, in exchange for the limitation that they will run only Linux, are priced competitively with similarly configured x86 systems from major competitors, and IBM is betting on the improvement in performance, shown by IBM-supplied benchmarks, to overcome any resistance to running Linux on a non-x86 system. Note that this is a different proposition than Linux running on an IFL in a zSeries, since the mainframe is usually not the entry point for the customer: IBM typically sells to customers with existing mainframes, whereas with Power Linux it will also be attempting to sell to net new customers as well as established accounts.
Today HP announced a new set of technology programs and future products designed to move x86 server technology for both Windows and Linux more fully into the realm of truly mission-critical computing. My interpretation of these moves is that they are a combined defensive and proactive offensive action on HP’s part, one that will protect HP as its Itanium/HP-UX portfolio slowly declines and will offer attractive and potentially unique options for current and future customers who want to deploy increasingly critical services on x86 platforms.
Bearing in mind that the earliest of these elements will not be in place until approximately mid-2012, the key elements that HP is currently disclosing are:
ServiceGuard for Linux – This is a big win for Linux users on HP, and removes a major operational and architectural hurdle for HP-UX migrations. ServiceGuard is a highly regarded clustering and HA facility on HP-UX, and includes many features for local and geographically distributed HA. The lack of ServiceGuard is often cited as a risk in HP-UX migrations. The availability of ServiceGuard by mid-2012 will remove yet another barrier to smooth migration from HP-UX to Linux, and will help make sure that HP retains the business as it migrates from HP-UX.
Analysis engine for x86 – Analysis engine is internal software that provides system diagnostics, predictive failure analysis and self-repair on HP-UX systems. With an uncommitted delivery date, HP will port this to selected x86 servers. My guess is that since the analysis engine probably requires some level of hardware assist, the analysis engine will be paired with the next item on the list…