I’ve recently had the opportunity to talk with a small sample of SLES 11 and RH 6 Linux users, all developing their own applications. All were long-time Linux users, and two of them, one in travel services and one in financial services, had applications that can be described as both large and mission-critical.
The overall message is encouraging for Linux advocates, both the calm rational type as well as those who approach it with near-religious fervor. The latest releases from SUSE and Red Hat, both based on the 2.6.32 Linux kernel, show significant improvements in scalability and modest improvements in iso-configuration performance. One user reported that an application that previously had maxed out at 24 cores with SLES 10 was now nearing production certification with 48 cores under SLES 11. Performance scalability was reported as “not linear, but worth doing the upgrade.”
Overall memory scalability under Linux is still a question mark, since the widely available x86 platforms do not exceed 3 TB of memory, but initial reports from a user familiar with HP’s DL 980 verify that the new Linux Kernel can reliably manage at least 2TB of RAM under heavy load.
File system options continue to expand as well. The older Linux FS standard, ETX4, which can scale to “only” 16 TB, has been joined by additional options such as XFS (contributed by SGI), which has been implemented in several installations with file systems in excess of 100 TB, relieving a limitation that may have been more psychological than practical for most users.
On Dec. 2, Oracle announced the next move in its program to integrate its hardware and software assets, with the introduction of Oracle Private Cloud Architecture, an integrated infrastructure stack with Infiniband and/or 10G Ethernet fabric, integrated virtualization, management and servers along with software content, both Oracle’s and customer-supplied. Oracle has rolled out the architecture as a general platform for a variety of cloud environments, along with three specific implementations, Exadata, Exalogic and the new Sunrise Supercluster, as proof points for the architecture.
Exadata has been dealt with extensively in other venues, both inside Forrester and externally, and appears to deliver the goods for I&O groups who require efficient consolidation and maximum performance from an Oracle database environment.
Exalogic is a middleware-targeted companion to the Exadata hardware architecture (or another instantiation of Oracle’s private cloud architecture, depending on how you look at it), presenting an integrated infrastructure stack ready to run either Oracle or third-party apps, although Oracle is positioning it as a Java middleware platform. It consists of the following major components integrated into a single rack:
Oracle x86 or T3-based servers and storage.
Oracle Quad-rate Infiniband switches and the Oracle Solaris gateway, which makes the Infiniband network look like an extension of the enterprise 10G Ethernet environment.
Oracle Linux or Solaris.
Oracle Enterprise Manager Ops Center for management.
Oracle recently announced the availability of Solaris 11 Express, the first iteration of its Solaris 11 product cycle. The feature set of this release is along the lines promised by Oracle at their August analyst event this year, including:
Scalability enhancements to set it up for future systems with higher core counts and requirements to schedule large numbers of threads.
Improvements to zFS, Oracle’s highly scalable file system.
Reduction of boot times to the range of 10 seconds — a truly impressive accomplishment.
Optimizations to support Oracle Exadata and Exalogic integrated solutions. While some of these changes may be very specific to Oracle’s stack, most of them are almost certain to improve any application that requires some combination of high thread counts, large memory and low-latency communications with either 10G Ethernet or Infiniband.
Improvements in availability due to reductions on the number of reboot scenarios, improvements in patching and improved error recovery. This is hard to measure, but Oracle claims they are close to an OS which does not need to come down for normal maintenance, a goal of all of the major UNIX vendors and long a signature of mainframe environments.
I have been working on a research document, to be published this quarter, on the impact of 8-socket x86 servers based on Intel’s new Xeon 7500 CPU. In a nutshell, these systems have the performance of the best-of-breed RISC/UNIX systems of three years ago, at a substantially better price, and their overall performance improvement trajectory has been steeper than competing technologies for the past decade.
This is probably not shocking news and is not the subject of this current post, although I would encourage you to read it when it is finally published. During the course of researching this document I spent time trying to prove or disprove my thesis that x86 system performance solidly overlapped that of RISC/UNIX with available benchmark results. The process highlighted for me the limitations of using standardized benchmarks for performance comparisons. There are now so many benchmarks available that system vendors are only performing each benchmark on selected subsets of their product lines, if at all. Additionally, most benchmarks suffer from several common flaws:
They are results from high-end configurations, in many cases far beyond the norm for any normal use cases, but results cannot be interpolated to smaller, more realistic configurations.
They are often the result of teams of very smart experts tuning the system configurations, application and system software parameters for optimal results. For a large benchmark such as SAP or TPC, it is probably reasonable to assume that there are over 1,000 variables involved in the tuning effort. This makes the results very much like EPA mileage figures — the consumer is guaranteed not to exceed these numbers.
There has been a lot of press about IBM’s acquisition of BNT (Blade Network Technologies) focusing on the economics and market share of BNT as a competitor to Cisco and HP’s ProCurve/3Com franchise. But at its heart the acquisition is more about defending and expanding a position in the emerging converged server, networking, and storage infrastructure segment than it is about raw switch port market share. It is also a powerful vindication of the proposition that infrastructure convergence is driving major realignment in the vendor industry.
Starting with HP’s success with its c-Class blade servers and Virtual Connect technology, and escalating with Cisco’s entrance into the server market, IBM continued its investment in its Virtual Fabric and Open Fabric Manager technology, heavily leveraging BNT’s switch platforms. At some point it became clear that BNT was a critical element of IBM’s convergence strategy, with IBM’s plans now heavily dependent on a vendor with whom they had an excellent, but non-exclusive relationship, and one whose acquisition by another player could severely compromise their product plans. Hence the acquisition. Now that it owns BNT, IBM can capitalize on its excellent edge network technology for further development of its converged infrastructure strategy without hesitation about further leveraging BNT’s technology.
I recently spent a day with IBM’s x86 team, primarily to get back up to speed on their entire x86 product line, and partially to realign my views of them after spending almost five years as a direct competitor. All in all, time well spent, with some key takeaways:
IBM has fixed some major structural problems with the entire x86 program and it perception in the company – As recently as two years ago, it appeared to the outside world that IBM was not really serious about x86 servers. Between licensing its low-end server designs to Lenovo (although IBM continued to sell its own versions) and an apparent retreat to the upper-end of the segment, it appeared that IBM was not serious about x86 severs. New management, new alignment with sales, and a higher internal profile for x86 seems to have moved the division back into IBM’s mainstream.
Increased investment – It looks like IBM significantly ramped up investments in x86 products about three years ago. The result has been a relatively steady flow of new products into the marketplace, some of which, such as the HS22 blade, significantly reversed deficits versus equivalent HP products. Others followed in high-end servers, virtualization and systems management, and increased velocity of innovation in low-end systems.
Established leadership in new niches such as dense modular server deployments – IBM’s iDataplex, while representing a small footprint in terms of their total volume, gave them immediate visibility as an innovator in the rapidly growing niche for hyper scale dense deployments. Along the way, IBM has also apparently become the leader in GPU deployments as well, another low-volume but high-visibility niche.