The Background – Linux as a Fast Follower and the Need for Hot Patching
No doubt about it, Linux has made impressive strides in the last 15 years, gaining many features previously associated with high-end proprietary Unix as it made the transition from small system plaything to core enterprise processing resource and the engine of the extended web as we know it. Along the way it gained reliable and highly scalable schedulers, a multiplicity of efficient and scalable file systems, advanced RAS features, its own embedded virtualization and efficient thread support.
As Linux grew, so did supporting hardware, particularly the capabilities of the ubiquitous x86 CPU upon which the vast majority of Linux runs today. But the debate has always been about how close Linux could get to "the real OS", the core proprietary Unix variants that for two decades defined the limits of non-mainframe scalability and reliability. But "the times they are a changing", and the new narrative may be "when will Unix catch up to Linux on critical RAS features like hot patching".
Hot patching, the ability to apply updates to the OS kernel while it is running, is a long sought-after but elusive feature of a production OS. Long sought after because both developers and operations teams recognize that bringing down an OS instance that is doing critical high-volume work is at best disruptive and worst a logistical nightmare, and elusive because it is incredibly difficult. There have been several failed attempts, and several implementations that "almost worked" but were so fraught with exceptions that they were not really useful in production.[i]
[Apologies to all who have just read this post with a sense of deja-vue. I saw a typo, corrected it and then republished the blog, and it reset the publication date. This post was originally published several months ago.]
Having been away from the Linux scene for a while, I recently took a look at a newer version of Linux, SUSE Enterprise Linux Version 11.3, which is representative of the latest feature sets from the Linux 3.0 et seq kernel available to the entre Linux community, including SUSE, Red Hat, Canonical and others. It is apparent, both from the details on SUSE 11.3 and from perusing the documentation on other distribution providers, that Linux has continued to mature nicely as both a foundation for large scale-out clouds as well as a strong contender for the kind of enterprise workloads that previously were only comfortable on either RISC/UNIX systems or large Microsoft Server systems. In effect, Linux has continued its maturation to the point where its feature set and scalability begin to look like a top-tier UNIX from only a couple of years ago.
Among the enterprise technology that caught my eye:
Scalability – The Linux kernel now scales to 4096 x86 CPUs and up to 16 TB of memory, well into high-end UNIX server territory, and will support the largest x86 servers currently shipping.
I/O – The Linux kernel now includes btrfs (a geeky contraction of “Better File System), an open source file system that promises much of the scalability and feature set of Oracle’s popular ZFS file system including checksums, CoW, snapshotting, advanced logical volume management including thin provisioning and others. The latest releases also include advanced features like geoclustering and remote data replication to support advanced HA topologies.
On April 23, IBM rolled out the long-awaited POWER8 CPU, the successor to POWER7+, and given the extensive pre-announcement speculation, the hardware itself was no big surprise (the details are fascinating, but not suitable for this venue), offering an estimated 30 - 50% improvement in application performance over the latest POWER7+, with potential for order of magnitude improvements with selected big data and analytics workloads. While the technology is interesting, we are pretty numb to the “bigger, better, faster” messaging that inevitably accompanies new hardware announcements, and the real impact of this announcement lies in its utility for current AIX users and IBM’s increased focus on Linux and its support of the OpenPOWER initiative.
OK, so we’re numb, but it’s still interesting. POWER8 is an entirely new processor generation implemented in 22 nm CMOS (the same geometry as Intel’s high-end CPUs). The processor features up to 12 cores, each with up to 8 threads, and a focus on not only throughput but high performance per thread and per core for low-thread-count applications. Added to the mix is up to 1 TB of memory per socket, massive PCIe 3 I/O connectivity and Coherent Accelerator Processor Interface (CAPI), IBM’s technology to deliver memory-controller-based access for accelerators and flash memory in POWER systems. CAPI figures prominently in IBM’s positioning of POWER as the ultimate analytics engine, with the announcement profiling the performance of a configuration using 40 TB of CAPI-attached flash for huge in-memory analytics at a fraction of the cost of a non-CAPI configuration.[i]
A Slam-dunk for AIX users and a new play for Linux
As the new year looms, thoughts turn once again to our annual reading of the tea leaves, in this case focused on what I see coming in server land. We’ve just published the full report, Predictions for 2014: Servers & Data Centers, but as teaser, here are a few of the major highlights from the report:
1. Increasing choices in form factor and packaging – I&O pros will have to cope with a proliferation of new form factors, some optimized for dense low-power cloud workloads, some for general purpose legacy IT, and some for horizontal VM clusters (or internal cloud if you prefer). These will continue to appear in an increasing number of variants.
2. ARM – Make or break time is coming, depending on the success of coming 64-bit ARM CPU/SOC designs with full server feature sets including VM support.
3. The beat goes on – Major turn of the great wheel coming for server CPUs in early 2014.
4. Huge potential disruption in flash architecture – Introduction of flash in main memory DIMM slots has the potential to completely disrupt how flash is used in storage tiers, and potentially can break the current storage tiering model, initially physically with the potential to ripple through memory architectures.
When I returned to Forrester in mid-2010, one of the first blog posts I wrote was about Oracle’s new roadmap for SPARC and Solaris, catalyzed by numerous client inquiries and other interactions in which Oracle’s real level of commitment to future SPARC hardware was the topic of discussion. In most cases I could describe the customer mood as skeptical at best, and panicked and committed to migration off of SPARC and Solaris at worst. Nonetheless, after some time spent with Oracle management, I expressed my improved confidence in the new hardware team that Oracle had assembled and their new roadmap for SPARC processors after the successive debacles of the UltraSPARC-5 and Rock processors under Sun’s stewardship.
Two and a half years later, it is obvious that Oracle has delivered on its commitments regarding SPARC and is continuing its investments in SPARC CPU and system design as well as its Solaris OS technology. The latest evolution of SPARC technology, the SPARC T5 and the soon-to-be-announced M5, continue the evolution and design practices set forth by Oracle’s Rick Hetherington in 2010 — incremental evolution of a common set of SPARC cores, differentiation by variation of core count, threads and cache as opposed to fundamental architecture, and a reliable multi-year performance progression of cores and system scalability.
Nathan Bedford Forrest, a Confederate general of despicable ideology and consummate tactics, spoke of “keepin up the skeer,” applying continued pressure to opponents to prevent them from regrouping and counterattacking. POWER7+, the most recent version of IBM’s POWER architecture, anticipated as a follow-up to the POWER7 for almost a year, was finally announced this week, and appears to be “keepin up the skeer” in terms of its competitive potential for IBM POWER-based systems. In short, it is a hot piece of technology that will keep existing IBM users happy and should help IBM maintain its impressive momentum in the Unix systems segment.
For the chip heads, the CPU is implemented in a 32 NM process, the same as Intel’s upcoming Poulson, and embodies some interesting evolutions in high-end chip design, including:
Use of DRAM instead of SRAM — IBM has pioneered the use of embedded DRAM (eDRAM) as embedded L3 cache instead of the more standard and faster SRAM. In exchange for the loss of speed, eDRAM requires fewer transistors and lower power, allowing IBM to pack a total of 80 MB (a lot) of shared L3 cache, far more than any other product has ever sported.
Over the last couple of years, IBM, despite having a rich internal technology ecosystem and a number of competitive blade and CI offerings, has not had a comprehensive integrated offering to challenge HP’s CloudSystem Matrix and Cisco’s UCS. This past week IBM effectively silenced its critics and jumped to the head of the CI queue with the announcement of two products, PureFlex and PureApplication, the results of a massive multi-year engineering investment in blade hardware, systems management, networking, and storage integration. Based on a new modular blade architecture and new management architecture, the two products are really more of a continuum of a product defined by the level of software rather than two separate technology offerings.
PureFlex is the base product, consisting of the new hardware (which despite having the same number of blades as the existing HS blade products, is in fact a totally new piece of hardware), which integrates both BNT-based networking as well as a new object-based management architecture which can manage up to four chassis and provide a powerful setoff optimization, installation, and self-diagnostic functions for the hardware and software stack up to and including the OS images and VMs. In addition IBM appears to have integrated the complete suite of Open Fabric Manager and Virtual Fabric for remapping MAC/WWN UIDs and managing VM networking connections, and storage integration via the embedded V7000 storage unit, which serves as both a storage pool and an aggregation point for virtualizing external storage. The laundry list of features and functions is too long to itemize here, but PureFlex, especially with its hypervisor-neutrality and IBM’s Cloud FastStart option, is a complete platform for an enterprise private cloud or a horizontal VM compute farm, however you choose to label a shared VM utility.
There has been a lot of ill-considered press coverage about the “death” of UNIX and coverage of the wholesale migration of UNIX workloads to LINUX, some of which (the latter, not the former) I have contributed to. But to set the record straight, the extinction of UNIX is not going to happen in our lifetime.
While UNIX revenues are not growing at any major clip, it appears as if they have actually had a slight uptick over the past year, probably due to a surge by IBM, and seem to be nicely stuck around the $18 - 20B level annual range. But what is important is the “why,” not the exact dollar figure.
UNIX on proprietary RISC architectures will stay around for several reasons that primarily revolve around their being the only close alternative to mainframes in regards to specific high-end operational characteristics:
Performance – If you need the biggest single-system SMP OS image, UNIX is still the only realistic commercial alternative other than mainframes.
Isolated bulletproof partitionability – If you want to run workload on dynamically scalable and electrically isolated partitions with the option to move workloads between them while running, then UNIX is your answer.
Near-ultimate availability – If you are looking for the highest levels of reliability and availability ex mainframes and custom FT systems, UNIX is the answer. It still possesses slight availability advantages, especially if you factor in the more robust online maintenance capabilities of the leading UNIX OS variants.
Last year I wrote about Oracle’s new plans for SPARC, anchored by a new line of SPARC CPUs engineered in conjunction with Fujitsu (Does SPARC have a Future?), and commented that the first deliveries of this new technology would probably be in early 2012, and until we saw this tangible evidence of Oracle’s actual execution of this road map we could not predict with any confidence the future viability of SPARC.
The T4 CPU
Fast forward a year and Oracle has delivered the first of the new CPUs, ahead of schedule and with impressive gains in performance that make it look like SPARC will remain a viable platform for years. Specifically, Oracle has introduced the T4 CPU and systems based on them. The T4, an evolution of Oracle’s highly threaded T-Series architecture, is implemented with an entirely new core that will form the basis, with variations in number of threads versus cores and cache designs, of the future M and T series systems. The M series will have fewer threads and more performance per thread, while the T CPUs will, like their predecessors, emphasize throughput for highly threaded workloads. The new T4 will have 8 cores, and each core will have 8 threads. While the T4 emphasizes highly threaded workload performance, it is important to note that Oracles has radically improved single-thread performance over its predecessors, with Oracle claiming performance per thread improvements of 5X over its predecessors, greatly improving its utility as a CPU to power less thread-intensive workloads as well.
On June 15, HP announced that it had filed suit against Oracle, saying in a statement:
“HP is seeking the court’s assistance to compel Oracle to:
Reverse its decision to discontinue all software development on the Itanium platform
Reaffirm its commitment to offer its product suite on HP platforms, including Itanium;
Immediately reset the Itanium core processor licensing factor consistent with the model prior to December 1, 2010 for RISC/EPIC systems
HP also seeks:
Injunctive relief, including an order prohibiting Oracle from making false and misleading statements regarding the Itanium microprocessor or HP’s Itanium-based servers and remedying the harm caused by Oracle’s conduct.
Damages and fees and other standard remedies available in cases of this nature.”