Recently we’ve had a chance to look again at two very conflicting views from HP and Facebook on how to do web-scale and cloud computing, both announced at the recent OCP annual event in California.
From HP come its new CloudLine systems, the public face of their joint venture with Foxcon. Early details released by HP show a line of cost-optimized servers descended from a conventional engineering lineage and incorporating selected bits of OCP technology to reduce costs. These are minimalist rack servers designed, after stripping away all the announcement verbiage, to compete with white-box vendors such as Quanta, SuperMicro and a host of others. Available in five models ranging from the minimally-featured CL1100 up through larger nodes designed for high I/O, big data and compute-intensive workloads, these systems will allow large installations to install capacity at costs ranging from 10 – 25% less than the equivalent capacity in their standard ProLiant product line. While the strategic implications of HP having to share IP and market presence with Foxcon are still unclear, it is a measure of HP’s adaptability that they were willing to execute on this arrangement to protect against inroads from emerging competition in the most rapidly growing segment of the server market, and one where they have probably been under immense margin pressure.
Intel has made no secret of its development of the Xeon D, an SOC product designed to take Xeon processing close to power levels and product niches currently occupied by its lower-power and lower performance Atom line, and where emerging competition from ARM is more viable.
The new Xeon D-1500 is clear evidence that Intel “gets it” as far as platforms for hyperscale computing and other throughput per Watt and density-sensitive workloads, both in the enterprise and in the cloud are concerned. The D1500 breaks new ground in several areas:
It is the first Xeon SOC, combining 4 or 8 Xeon cores with embedded I/O including SATA, PCIe and multiple 10 nd 1 Gb Ethernet ports.
It is the first of Intel’s 14 nm server chips expected to be introduced this year. This expected process shrink will also deliver a further performance and performance per Watt across the entire line of entry through mid-range server parts this year.
Why is this significant?
With the D-1500, Intel effectively draws a very deep line in the sand for emerging ARM technology as well as for AMD. The D1500, with 20W – 45W power, delivers the lower end of Xeon performance at power and density levels previously associated with Atom, and close enough to what is expected from the newer generation of higher performance ARM chips to once again call into question the viability of ARM on a pure performance and efficiency basis. While ARM implementations with embedded accelerators such as DSPs may still be attractive in selected workloads, the availability of a mainstream x86 option at these power levels may blunt the pace of ARM design wins both for general-purpose servers as well as embedded designs, notably for storage systems.
We have been watching many variants on efficient packaging of servers for highly scalable workloads for years, including blades, modular servers, and dense HPC rack offerings from multiple vendors, most of the highly effective, and all highly proprietary. With the advent of Facebook’s Open Compute Project, the table was set for a wave of standardized rack servers and the prospect of very cost-effective rack-scale deployments of very standardized servers. But the IP for intelligently shared and managed power and cooling at a rack level needed a serious R&D effort that the OCP community, by and large, was unwilling to make. Into this opportunity stepped Intel, which has been quietly working on its internal Rack Scale Architecture (RSA) program for the last couple of years, and whose first product wave was officially outed recently as part of an announcement by Intel and Ericsson.
While not officially announcing Intel’s product nomenclature, Ericsson announced their “HDS 8000” based on Intel’s RSA, and Intel representatives then went on to explain the fundamental of RSA, including a view of the enhancements coming this year.
RSA is a combination of very standardized x86 servers, a specialized rack enclosure with shared Ethernet switching and power/cooling, and layers of firmware to accomplish a set of tasks common to managing a rack of servers, including:
· Asset discovery
· Switch setup and management
· Power and cooling management across the servers with the rack
Last year I published a reasonably well-received research document on Hadoop infrastructure, “Building the Foundations for Customer Insight: Hadoop Infrastructure Architecture”. Now, less than a year later it’s looking obsolete, not so much because it was wrong for traditional (and yes, it does seem funny to use a word like “traditional” to describe a technology that itself is still rapidly evolving and only in mainstream use for a handful of years) Hadoop, but because the universe of analytics technology and tools has been evolving at light-speed.
If your analytics are anchored by Hadoop and its underlying map reduce processing, then the mainstream architecture described in the document, that of clusters of servers each with their own compute and storage, may still be appropriate. On the other hand, if, like many enterprises, you are adding additional analysis tools such as NoSQL databases, SQL on Hadoop (Impala, Stinger, Vertica) and particularly Spark, an in-memory-based analytics technology that is well suited for real-time and streaming data, it may be necessary to begin reassessing the supporting infrastructure in order to build something that can continue to support Hadoop as well as cater to the differing access patterns of other tools sets. This need to rethink the underlying analytics plumbing was brought home by a recent demonstration by HP of a reference architecture for analytics, publicly referred to as the HP Big Data Reference Architecture.
There is always a tendency to regard the major players in large markets as being a static background against which the froth of smaller companies and the rapid dance of customer innovation plays out. But if we turn our lens toward the major server vendors (who are now also storage and networking as well as software vendors), we see that the relatively flat industry revenues hide almost continuous churn. Turn back the clock slightly more than five years ago, and the market was dominated by three vendors, HP, Dell and IBM. In slightly more than five years, IBM has divested itself of highest velocity portion of its server business, Dell is no longer a public company, Lenovo is now a major player in servers, Cisco has come out of nowhere to mount a serious challenge in the x86 server segment, and HP has announced that it intends to split itself into two companies.
And it hasn’t stopped. Two recent events, the fracturing of the VCE consortium and the formerly unthinkable hook-up of IBM and Cisco illustrate the urgency with which existing players are seeking differential advantage, and reinforce our contention that the whole segment of converged and integrated infrastructure remains one of the active and profitable segments of the industry.
EMC’s recent acquisition of Cisco’s interest in VCE effectively acknowledged what most customers have been telling us for a long time – that VCE had become essentially an EMC-driven sales vehicle to sell storage, supported by VMware (owned by EMC) and Cisco as a systems platform. EMC’s purchase of Cisco’s interest also tacitly acknowledges two underlying tensions in the converged infrastructure space:
Dell today announced its new FX system architecture, and I am decidedly impressed.
Dell FX is a 2U flexible infrastructure building block that allows infrastructure architects to compose an application-appropriate server and storage infrastructure out of the following set of resources:
Multiple choices of server nodes, ranging from multi-core Atom to new Xeon E5 V3 servers. With configurations ranging from 2 to 16 server nodes per enclosure, there is pretty much a configuration point for most mainstream applications.
A novel flexible method of mapping disks from up to three optional disk modules, each with 16 drives - the mapping, controlled by the onboard management, allows each server to appear as if the disk is locally attached DASD, so no changes are needed in any software that thinks it is accessing local storage. A very slick evolution in storage provisioning.
A set of I/O aggregators for consolidating Ethernet and FC I/O from the enclosure.
All in all, an attractive and flexible packaging scheme for infrastructure that needs to be tailored to specific combinations of server, storage and network configurations. Probably an ideal platform to support the Nutanix software suite that Dell is reselling as well. My guess is that other system design groups are thinking along these lines, but this is now a pretty unique package, and merits attention from infrastructure architects.
On October 20 at TechEd, Microsoft quietly slipped in what looks like a potential game-changing announcement in the private/hybrid cloud world when they rolled out Microsoft Cloud Platform System (CPS), an integrated hardware/software system that combines an Azure-consistent on premise cloud with an optimized hardware stack from Dell.
While the timing of the event comes as a surprise, the fact that IBM has decided to unload its technically excellent but unprofitable semiconductor manufacturing operation does not, nor does its choice of Globalfoundries, with whom it has had a longstanding relationship.
Very much in the shadows of all the press coverage and hysteria attendant on emerging cloud architectures and customer-facing systems of engagement are the nitty-gritty operational details that lurk like monsters in the swamp of legacy infrastructure, and some of them have teeth. And sometimes these teeth can really take a bite out of the posterior of an unprepared organization.
One of those toothy animals that I&O groups are increasingly encountering in their landscapes is the problem of what to do with Windows Server 2003 (WS2003). It turns out there are still approximately 11 million WS2003 systems running today, with another 10+ million instances running as VM guests. Overall, possibly more than 22 million OS images and a ton of hardware that will need replacing and upgrading. And increasing numbers of organizations have finally begun to take seriously the fact that Microsoft is really going to end support and updates as of July 2015.
Based on the conversations I have been having with our clients, the typical I&O group that is now scrambling to come up with a plan has not been willfully negligent, nor are they stupid. Usually WS2003 servers are legacy servers, quietly running some mature piece of code, often in satellite locations or in the shops of acquired companies. The workloads are a mix of ISV and bespoke code, but it is often a LOB-specific application, with the run-of-the-mill collaboration, infrastructure servers and, etc. having long since migrated to newer platforms. A surprising number of clients have told me that they have identified the servers, but not always the applications or the business owners – often a complex task for an old resource in a large company.
[Apologies to all who have just read this post with a sense of deja-vue. I saw a typo, corrected it and then republished the blog, and it reset the publication date. This post was originally published several months ago.]
Having been away from the Linux scene for a while, I recently took a look at a newer version of Linux, SUSE Enterprise Linux Version 11.3, which is representative of the latest feature sets from the Linux 3.0 et seq kernel available to the entre Linux community, including SUSE, Red Hat, Canonical and others. It is apparent, both from the details on SUSE 11.3 and from perusing the documentation on other distribution providers, that Linux has continued to mature nicely as both a foundation for large scale-out clouds as well as a strong contender for the kind of enterprise workloads that previously were only comfortable on either RISC/UNIX systems or large Microsoft Server systems. In effect, Linux has continued its maturation to the point where its feature set and scalability begin to look like a top-tier UNIX from only a couple of years ago.
Among the enterprise technology that caught my eye:
Scalability – The Linux kernel now scales to 4096 x86 CPUs and up to 16 TB of memory, well into high-end UNIX server territory, and will support the largest x86 servers currently shipping.
I/O – The Linux kernel now includes btrfs (a geeky contraction of “Better File System), an open source file system that promises much of the scalability and feature set of Oracle’s popular ZFS file system including checksums, CoW, snapshotting, advanced logical volume management including thin provisioning and others. The latest releases also include advanced features like geoclustering and remote data replication to support advanced HA topologies.