This week at ISSCC, Intel made its first detailed public disclosures about its upcoming “Poulson” next-generation Itanium CPU. While not in any sense complete, the details they did disclose paint a picture of a competent product that will continue to keep the heat on in the high-end UNIX systems market. Highlights include:
Process — Poulson will be produced in a 32 nm process, skipping the intermediate 45 nm step that many observers expected to see as a step down from the current 65 nm Itanium process. This is a plus for Itanium consumers, since it allows for denser circuits and cheaper chips. With an industry record 3.1 billion transistors, Poulson needs all the help it can get keeping size and power down. The new process also promises major improvements in power efficiency.
Cores and cache — Poulson will have 8 cores and 54 MB of on-chip cache, a huge amount, even for a cache-sensitive architecture like Itanium. Poulson will have a 12-issue pipeline instead of the current 6-issue pipeline, promising to extract more performance from existing code without any recompilation.
Compatibility — Poulson is socket- and pin-compatible with the current Itanium 9300 CPU, which will mean that HP can move more quickly into production shipments when it's available.
Since its introduction of its Core 2 architecture, Intel reversed much of the damage done to it by AMD in the server space, with attendant publicity. AMD, however, has been quietly reclaiming some ground with its 12-core 6100 series CPUs, showing strength in benchmarks that emphasize high throughput in process-rich environments as opposed to maximum performance per core. Several AMD-based system products have also been cited by their manufacturers to us as enjoying very strong customer acceptance due to the throughput of the 12-core CPUs combined with their attractive pricing. As a fillip to this success, AMD this past week announced speed bumps for the 6100-series products to give a slight performance boost as they continue to compete with Intel’s Xeon 5600 and 7500 products (Intel’s Sandy Bridge server products have not yet been announced).
But the real news last week was the quiet subtext that the anticipated 16-core Interlagos products based on the new Bulldozer core appear to be on schedule for Q2 ’11 shipments system partners, who should probably be able to ship systems during Q3, and that AMD is still certifying them as compatible with the current sockets used for the 12-core 6000 CPUs. This implies that system partners will be able to quickly deliver products based on the new parts very rapidly.
Actual performance of these systems will obviously be dependent on the workloads being run, but our gut feeling is that while they will not rival the per-core performance of the Intel Xeon 7500 CPUs, for large throughput-oriented environments with high numbers of processes, a description that fits a large number of web and middleware environments, these CPUs, each with up to a 50% performance advantage per core over the current AMD CPUs, may deliver some impressive benchmarks and keep the competition in the server space at a boil, which in the end is always helpful to customers.
One evening in 1972 I was hanging out in the computer science department at UC Berkeley with a couple of equally socially backward friends waiting for our batch programs to run, and to kill some time we dropped in on a nearby physics lab that was analyzing photographs of particle tracks from one of the various accelerators that littered the Lawrence Radiation Laboratory. Analyzing these tracks was real scut work – the overworked grad student had to measure angles between tracks, length of tracks, and apply a number of calculations to them to determine if they were of interest. To our surprise, this lab had something we had never seen before – a computer-assisted screening device that scanned the photos and in a matter of seconds determined it had any formations that were of interest. It had a big light table, a fancy scanner, whirring arms and levers and gears, and off in the corner, the computer, “a PDP from Digital Equipment.” It was a 19” rack mount box with an impressive array of lights and switches on the front. As a programmer of the immense 1 MFLOP CDC 6400 in the Rad Lab computer center, I was properly dismissive…
This was a snapshot of the dawn of the personal computer era, almost a decade before IBM Introduced the PC and blew it wide open. The PDP (Programmable Data Processor) systems from MIT Professor Ken Olsen were the beginning of the fundamental change in the relationship between man and computer, putting a person in the computing loop instead of keeping them standing outside the temple.
From nothing more than an outlandish speculation, the prospects for a new entrant into the volume Linux and Windows server space have suddenly become much more concrete, culminating in an immense buzz at CES as numerous players, including NVIDIA and Microsoft, stoked the fires with innuendo, announcements, and demos.
Consumers of x86 servers are always on the lookout for faster, cheaper, and more power-efficient servers. In the event that they can’t get all three, the combination of cheaper and more energy-efficient seems to be attractive to a large enough chunk of the market to have motivated Intel, AMD, and all their system partners to develop low-power chips and servers designed for high density compute and web/cloud environments. Up until now the debate was Intel versus AMD, and low power meant a CPU with four cores and a power dissipation of 35 – 65 Watts.
The Promised Land
The performance trajectory of processors that were formerly purely mobile device processors, notably the ARM Cortex, has suddenly introduced a new potential option into the collective industry mindset. But is this even a reasonable proposition, and if so, what does it take for it to become a reality?
Our first item of business is to figure out whether or not it even makes sense to think about these CPUs as server processors. My quick take is yes, with some caveats. The latest ARM offering is the Cortex A9, with vendors offering dual core products at up to 1.2 GHz currently (the architecture claims scalability to four cores and 2 GHz). It draws approximately 2W, much less than any single core x86 CPU, and a multi-core version should be able to execute any reasonable web workload. Coupled with the promise of embedded GPUs, the notion of a server that consumes much less power than even the lowest power x86 begins to look attractive. But…
I’ve recently had the opportunity to talk with a small sample of SLES 11 and RH 6 Linux users, all developing their own applications. All were long-time Linux users, and two of them, one in travel services and one in financial services, had applications that can be described as both large and mission-critical.
The overall message is encouraging for Linux advocates, both the calm rational type as well as those who approach it with near-religious fervor. The latest releases from SUSE and Red Hat, both based on the 2.6.32 Linux kernel, show significant improvements in scalability and modest improvements in iso-configuration performance. One user reported that an application that previously had maxed out at 24 cores with SLES 10 was now nearing production certification with 48 cores under SLES 11. Performance scalability was reported as “not linear, but worth doing the upgrade.”
Overall memory scalability under Linux is still a question mark, since the widely available x86 platforms do not exceed 3 TB of memory, but initial reports from a user familiar with HP’s DL 980 verify that the new Linux Kernel can reliably manage at least 2TB of RAM under heavy load.
File system options continue to expand as well. The older Linux FS standard, ETX4, which can scale to “only” 16 TB, has been joined by additional options such as XFS (contributed by SGI), which has been implemented in several installations with file systems in excess of 100 TB, relieving a limitation that may have been more psychological than practical for most users.
In October, with great fanfare, the Open Data Center Alliance unfurled its banners. The ODCA is a consortium of approximately 50 large IT consumers, including large manufacturing, hosting and telecomm providers, with the avowed intent of developing standards for interoperable cloud computing. In addition to the roster of users, the announcement highlighted Intel with an ambiguous role as a technology advisor to the group. The ODCA believes that it will achieve some weight in the industry due to its estimated $50 billion per year of cumulative IT purchasing power, and the trade press was full of praises for influential users driving technology as opposed to allowing rapacious vendors such as HP and IBM to drive users down proprietary paths that lead to vendor lock-in.
Now that we’ve had a month or more to allow the purple prose to settle a bit, let’s look at the underlying claims, potential impact of the ODCA and the shifting roles of vendors and consumers of technology. And let’s not forget about the role of Intel.
First, let me state unambiguously that one of the core intentions of the ODCA, the desire to develop common use case models that will in turn drive vendors to develop products that comply with the models based on the economic clout of the ODCA members (and hopefully there will be a correlation between ODCA member requirements and those of a wider set of consumers), is a good idea. Vendors spend a lot of time talking to users and trying to understand their requirements, and having the ODCA as a proxy for the requirements of a lot of very influential customers will be a benefit to all concerned.
As an immediate reaction to the recent announcement of Attachmate’s intention to acquire Novell, covered in depth by my colleagues and synthesized by Chris Voce in his recent blog post, I have received a string of inquiries about the probable fate of SUSE LINUX. Should we continue to invest? Will Attachmate kill it? Will it be sold?
Reduced to its essentials the answer is that we cannot predict the eventual ownership of SUSE Linux, but it is almost certain to remain a viable and widely available Linux distribution. SUSE is one of the crown jewels of Novell’s portfolio, with steady growth, gaining market share, generating increasing revenues, and from the outside at least, a profitable business.
Attachmate has two choices with SUSE – retain it as a profitable growth engine and attachment point for other Attachmate software and services, or package it up for sale. In either case they have to continue to invest in the product and its marketing. If Attachmate chooses to keep it, SUSE Linux will behave as it did with Novell. If they sell it, its acquirer will be foolish to do anything else. Speculation about potential acquirers has included HP, IBM, Cisco and Oracle, all of whom could make use of a Linux distribution as an internal product component in addition to the software and service revenues it could engender. But aside from an internal platform, for SUSE to have value as an industry alternative to Red Hat, it would have to remain vendor agnostic and widely available.
With the inescapable caveat that this is a developing situation, my current take on SUSE Linux is that there is no reason to back away from it or to fear that it will disappear into the maw of some giant IT company.
I met recently with Cisco’s UCS group in San Jose to get a quick update on sales and maybe some hints about future development. The overall picture is one of rapid growth decoupled from whatever pressures Cisco management has cautioned about in other areas of the business.
Overall, according to recent disclosure by Cisco CEO John Chambers, Cisco’s UCS revenue is growing at a 550% Y/Y growth rate, with the most recent quarterly revenues indicating a $500M run rate (we make that out as about $125M quarterly revenue). This figure does not seem to include the over 4,000 blades used by Cisco IT, nor does it include units being consumed internally by Cisco and subsequently shipped to customers as part of appliances or other Cisco products. Also of note is the fact that it is fiscal Q1 for Cisco, traditionally its weakest quarter, although with an annual growth rate in excess of 500% we would expect that UCS sequential quarters will be marching to a totally different drummer than the overall company numbers.
I have been working on a research document, to be published this quarter, on the impact of 8-socket x86 servers based on Intel’s new Xeon 7500 CPU. In a nutshell, these systems have the performance of the best-of-breed RISC/UNIX systems of three years ago, at a substantially better price, and their overall performance improvement trajectory has been steeper than competing technologies for the past decade.
This is probably not shocking news and is not the subject of this current post, although I would encourage you to read it when it is finally published. During the course of researching this document I spent time trying to prove or disprove my thesis that x86 system performance solidly overlapped that of RISC/UNIX with available benchmark results. The process highlighted for me the limitations of using standardized benchmarks for performance comparisons. There are now so many benchmarks available that system vendors are only performing each benchmark on selected subsets of their product lines, if at all. Additionally, most benchmarks suffer from several common flaws:
They are results from high-end configurations, in many cases far beyond the norm for any normal use cases, but results cannot be interpolated to smaller, more realistic configurations.
They are often the result of teams of very smart experts tuning the system configurations, application and system software parameters for optimal results. For a large benchmark such as SAP or TPC, it is probably reasonable to assume that there are over 1,000 variables involved in the tuning effort. This makes the results very much like EPA mileage figures — the consumer is guaranteed not to exceed these numbers.
Fujitsu? Who? I recently attended Fujitsu’s global analyst conference in Boston, which gave me an opportunity to check in with the best kept secret in the North American market. Even Fujitsu execs admit that many people in this largest of IT markets think that Fujitsu has something to do with film, and few of us have ever seen a Fujitsu system installed in the US unless it was a POS system.
So what is the management of this global $50 Billion information and communications technology company, with a competitive portfolio of client, server and storage products and a global service and integration capability, going to do about its lack of presence in the world’s largest IT market? In a word, invest. Fujitsu’s management, judging from their history and what they have disclosed of their plans, intends to invest in the US over the next three to four years to consolidate their estimated $3 Billion in N. American business into a more manageable (simpler) set of operating companies, and to double down on hiring and selling into the N. American market. The fact that they have given themselves multiple years to do so is very indicative of what I have always thought of as Fujitsu’s greatest strength and one of their major weaknesses – they operate on Japanese time, so to speak. For an American company to undertake to build a presence over multiple years with seeming disregard for quarterly earnings would be almost unheard of, so Fujitsu’s management gets major kudos for that. On the other hand, years of observing them from a distance also leads me to believe that their approach to solving problems inherently lacks the sense of urgency of some of their competitors.