The Background – Linux as a Fast Follower and the Need for Hot Patching
No doubt about it, Linux has made impressive strides in the last 15 years, gaining many features previously associated with high-end proprietary Unix as it made the transition from small system plaything to core enterprise processing resource and the engine of the extended web as we know it. Along the way it gained reliable and highly scalable schedulers, a multiplicity of efficient and scalable file systems, advanced RAS features, its own embedded virtualization and efficient thread support.
As Linux grew, so did supporting hardware, particularly the capabilities of the ubiquitous x86 CPU upon which the vast majority of Linux runs today. But the debate has always been about how close Linux could get to "the real OS", the core proprietary Unix variants that for two decades defined the limits of non-mainframe scalability and reliability. But "the times they are a changing", and the new narrative may be "when will Unix catch up to Linux on critical RAS features like hot patching".
Hot patching, the ability to apply updates to the OS kernel while it is running, is a long sought-after but elusive feature of a production OS. Long sought after because both developers and operations teams recognize that bringing down an OS instance that is doing critical high-volume work is at best disruptive and worst a logistical nightmare, and elusive because it is incredibly difficult. There have been several failed attempts, and several implementations that "almost worked" but were so fraught with exceptions that they were not really useful in production.[i]
When I returned to Forrester in mid-2010, one of the first blog posts I wrote was about Oracle’s new roadmap for SPARC and Solaris, catalyzed by numerous client inquiries and other interactions in which Oracle’s real level of commitment to future SPARC hardware was the topic of discussion. In most cases I could describe the customer mood as skeptical at best, and panicked and committed to migration off of SPARC and Solaris at worst. Nonetheless, after some time spent with Oracle management, I expressed my improved confidence in the new hardware team that Oracle had assembled and their new roadmap for SPARC processors after the successive debacles of the UltraSPARC-5 and Rock processors under Sun’s stewardship.
Two and a half years later, it is obvious that Oracle has delivered on its commitments regarding SPARC and is continuing its investments in SPARC CPU and system design as well as its Solaris OS technology. The latest evolution of SPARC technology, the SPARC T5 and the soon-to-be-announced M5, continue the evolution and design practices set forth by Oracle’s Rick Hetherington in 2010 — incremental evolution of a common set of SPARC cores, differentiation by variation of core count, threads and cache as opposed to fundamental architecture, and a reliable multi-year performance progression of cores and system scalability.
On Tuesday November 8, after more than a year of pre-announcement disclosures that eventually left very little to the imagination, Intel finally announced the Itanium 9500, formerly known as Poulson. Added to this was the big surprise of HP announcing a refresh of its current line of Integrity servers, from blades to the large Superdome servers, with the new Itanium 9500.
As noted in an earlier post, the Itanium 9500 offers considerable performance improvements over its predecessors, and instantiated in HP’s new Integrity line it is positioned as delivering between 2X and 3X the performance per socket as previous Itanium 9300 (Tukwilla) systems at approximately the same price. For those remaining committed to Itanium and its attendant OS platforms, notably HP-UX, this is unmitigated good news. The fly in the ointment (I have never seen a fly in any ointment, but it does sound gross), of course, is HP’s dispute with Oracle. Despite the initial judgment in HP’s favor, the trial is a) not over yet, and b) Oracle has already filed for an early appeal of the initial verdict, which would ordinarily have to wait until the second phase of the trial, scheduled for next year, to finish. The net takeaway is that Oracle’s future availability on Itanium and HP-UX is not yet assured, so we really cannot advise the large number of Oracle users who will require Oracle 12 and later versions to relax yet.
Nathan Bedford Forrest, a Confederate general of despicable ideology and consummate tactics, spoke of “keepin up the skeer,” applying continued pressure to opponents to prevent them from regrouping and counterattacking. POWER7+, the most recent version of IBM’s POWER architecture, anticipated as a follow-up to the POWER7 for almost a year, was finally announced this week, and appears to be “keepin up the skeer” in terms of its competitive potential for IBM POWER-based systems. In short, it is a hot piece of technology that will keep existing IBM users happy and should help IBM maintain its impressive momentum in the Unix systems segment.
For the chip heads, the CPU is implemented in a 32 NM process, the same as Intel’s upcoming Poulson, and embodies some interesting evolutions in high-end chip design, including:
Use of DRAM instead of SRAM — IBM has pioneered the use of embedded DRAM (eDRAM) as embedded L3 cache instead of the more standard and faster SRAM. In exchange for the loss of speed, eDRAM requires fewer transistors and lower power, allowing IBM to pack a total of 80 MB (a lot) of shared L3 cache, far more than any other product has ever sported.
This week the California courts handed down a nice present for HP — a verdict confirming that Oracle was required to continue to deliver its software on HP’s Itanium-based Integrity servers. This was a major victory for HP, on the face of it giving them the prize they sought — continued availability of Oracle’s eponymous database on their high-end systems.
However, HP’s customers should not immediately assume that everything has returned to a “status quo ante.” Once Humpty Dumpty has fallen off the wall it is very difficult to put the pieces together again. As I see it, there are still three major elephants in the room that HP users must acknowledge before they make any decisions:
Oracle will appeal, and there is no guarantee of the outcome. The verdict could be upheld or it could be reversed. If it is upheld, then that represents a further delay in the start date from which Oracle will be measured for its compliance with the court ordered development. Oracle will also continue to press its counterclaims against HP, but those do not directly relate to the continued development or Oracle software on Itanium.
Itanium is still nearing the end of its road map. A reasonable interpretation of the road map tea leaves that have been exposed puts the final Itanium release at about 2015 unless Intel decides to artificially split Kittson into two separate releases. Integrity customers must take this into account as they buy into the architecture in the last few years of Itanium’s life, although HP can be depended on to offer high-quality support for a decade after the last Itanium CPU rolls off Intel’s fab lines. HP has declared its intention to produce Integrity-level x86 systems, but OS support intentions are currently stated as Linux and Windows, not HP-UX.
OK, it’s time to stretch the 2012 writing muscles, and what better way to do it than with the time honored “retrospective” format. But rather than try and itemize all the news and come up with a list of maybe a dozen or more interesting things, I decided instead to pick the best and the worst – events and developments that show the amazing range of the technology business, its potentials and its daily frustrations. So, drum roll, please. My personal nomination for the best and worst of the year (along with a special extra bonus category) are:
The Best – IBM Watson stomps the world’s best human players in Jeopardy. In early 2011, IBM put its latest deep computing project, Watson, up against some of the best players in the world in a game of Jeopardy. Watson, consisting of hundreds of IBM Power CPUs, gazillions of bytes of memory and storage, and arguably the most sophisticated rules engine and natural language recognition capability ever developed, won hands down. If you haven’t seen the videos of this event, you should – seeing the IBM system fluidly answer very tricky questions is amazing. There is no sense that it is parsing the question and then sorting through 200 – 300 million pages of data per second in the background as it assembles its answers. This is truly the computer industry at its best. IBM lived up to its brand image as the oldest and strongest technology company and showed us a potential for integrating computers into untapped new potential solutions. Since the Jeopardy event, IBM has been working on commercializing Watson with an eye toward delivering domain-specific expert advisors. I recently listened to a presentation by a doctor participating in the trials of a Watson medical assistant, and the results were startling in terms of the potential to assist medical professionals in diagnostic procedures.
There has been a lot of ill-considered press coverage about the “death” of UNIX and coverage of the wholesale migration of UNIX workloads to LINUX, some of which (the latter, not the former) I have contributed to. But to set the record straight, the extinction of UNIX is not going to happen in our lifetime.
While UNIX revenues are not growing at any major clip, it appears as if they have actually had a slight uptick over the past year, probably due to a surge by IBM, and seem to be nicely stuck around the $18 - 20B level annual range. But what is important is the “why,” not the exact dollar figure.
UNIX on proprietary RISC architectures will stay around for several reasons that primarily revolve around their being the only close alternative to mainframes in regards to specific high-end operational characteristics:
Performance – If you need the biggest single-system SMP OS image, UNIX is still the only realistic commercial alternative other than mainframes.
Isolated bulletproof partitionability – If you want to run workload on dynamically scalable and electrically isolated partitions with the option to move workloads between them while running, then UNIX is your answer.
Near-ultimate availability – If you are looking for the highest levels of reliability and availability ex mainframes and custom FT systems, UNIX is the answer. It still possesses slight availability advantages, especially if you factor in the more robust online maintenance capabilities of the leading UNIX OS variants.
Last year I wrote about Oracle’s new plans for SPARC, anchored by a new line of SPARC CPUs engineered in conjunction with Fujitsu (Does SPARC have a Future?), and commented that the first deliveries of this new technology would probably be in early 2012, and until we saw this tangible evidence of Oracle’s actual execution of this road map we could not predict with any confidence the future viability of SPARC.
The T4 CPU
Fast forward a year and Oracle has delivered the first of the new CPUs, ahead of schedule and with impressive gains in performance that make it look like SPARC will remain a viable platform for years. Specifically, Oracle has introduced the T4 CPU and systems based on them. The T4, an evolution of Oracle’s highly threaded T-Series architecture, is implemented with an entirely new core that will form the basis, with variations in number of threads versus cores and cache designs, of the future M and T series systems. The M series will have fewer threads and more performance per thread, while the T CPUs will, like their predecessors, emphasize throughput for highly threaded workloads. The new T4 will have 8 cores, and each core will have 8 threads. While the T4 emphasizes highly threaded workload performance, it is important to note that Oracles has radically improved single-thread performance over its predecessors, with Oracle claiming performance per thread improvements of 5X over its predecessors, greatly improving its utility as a CPU to power less thread-intensive workloads as well.
At the Hot Chips conference last week, Intel disclosed additional details about the upcoming Poulson Itanium CPU due for shipment early next year. For Itanium loyalists (essentially committed HP-UX customers) the disclosures are a ray of sunshine among the gloomy news that has been the lot of Itanium devotees recently.
Poulson will bring several significant improvements to Itanium in both performance and reliability. On the performance side, we have significant improvements on several fronts:
Process – Poulson will be manufactured with the same 32 nm semiconductor process that will (at least for a while) be driving the high-end Xeon processors. This is goodness all around – performance will improve and Intel now can load its latest production lines more efficiently.
More cores and parallelism – Poulson will be an 8-core processor with a whopping 54 MB of on-chip cache, and Intel has doubled the width of the multi-issue instruction pipeline, from 6 to 12 instructions. Combined with improved hyperthreading, the combination of 2X cores and 2X the total number of potential instructions executed per clock cycle by each core hints at impressive performance gains.
Architecture and instruction tweaks – Intel has added additional instructions based on analysis of workloads. This kind of tuning of processor architectures seldom results in major gains in performance, but every small increment helps.
Oracle announced today that it is going to cease development for Itanium across its product line, stating that itbelieved, after consultation with Intel management, that x86 was Intel’s strategic platform. Intel of course responded with a press release that specifically stated that there were at least two additional Itanium products in active development – Poulsen (which has seen its initial specifications, if not availability, announced), and Kittson, of which little is known.
This is a huge move, and one that seems like a kick carefully aimed at the you know what’s of HP’s Itanium-based server business, which competes directly with Oracle’s SPARC-based Unix servers. If Oracle stays the course in the face of what will certainly be immense pressure from HP, mild censure from Intel, and consternation on the part of many large customers, the consequences are pretty obvious:
Intel loses prestige, credibility for Itanium, and a potential drop-off of business from its only large Itanium customer. Nonetheless, the majority of Intel’s server business is x86, and it will, in the end, suffer only a token loss of revenue. Intel’s response to this move by Oracle will be muted – public defense of Itanium, but no fireworks.