Well actually I meant mobs of flash, but I couldn’t resist the word play. Although, come to think of it, flash mobs might be the right way to describe the density of flash memory system vendors here at Oracle Open World. Walking around the exhibits it seems as if every other booth is occupied by someone selling flash memory systems to accelerate Oracle’s database, and all of them claiming to be: 1) faster than anything that Oracle, who already integrates flash into its systems, offers, and 2) faster and/or cheaper than the other flash vendor two booths down the aisle.
All joking aside, the proliferation of flash memory suppliers is pretty amazing, although a venue devoted to the world’s most popular database would be exactly where you might expect to find them. In one sense flash is nothing new – RAM disks, arrays of RAM configured to mimic a disk, have been around since the 1970s but were small and really expensive, and never got on a cost and volume curve to drive them into a mass-market product. Flash, benefitting not only from the inherent economies of semiconductor technology but also from the drivers of consumer volumes, has the transition to a cost that makes it a reasonable alternative for some use case, with database acceleration being probably the most compelling. This explains why the flash vendors are gathered here in San Francisco this week to tout their wares – this is the richest collection of potential customers they will ever see in one place.
In the good old days, computer industry trade shows were bigger than life events – booths with barkers and actors, ice cream and espresso bars and games in the booth, magic acts and surging crowds gawking at technology. In recent years, they have for the most part become sad shadows of their former selves. The great SHOWS are gone, replaced with button-down vertical and regional events where you are lucky to get a pen or a miniature candy bar for your troubles.
Enter Oracle OpenWorld. Mix 45,000 people, hundreds of exhibitors, one of the world’s largest software and systems company looking to make an impression, and you have the new generation of technology extravaganza. The scale is extravagant, taking up the entire Moscone Center complex (N, S and W) along with a couple of hotel venues, closing off a block of a major San Francisco street for a week, and throwing a little evening party for 20 or 30 thousand people.
But mixed with the hoopla, which included wheel of fortune giveaways that had hundreds of people snaking around the already crowded exhibition floor in serpentine lines, mini golf and whack-a-mole-games in the exhibit booths along with the aforementioned espresso and ice cream stands, there was genuine content and the public face of some significant trends. So far, after 24 hours, some major messages come through loud and clear:
I recently published an update on power and cooling in the data center (http://www.forrester.com/go?docid=60817), and as I review it online, I am struck by the combination of old and new. The old – the evolution of semiconductor technology, the increasingly elegant attempts to design systems and components that can be incrementally throttled, and the increasingly sophisticated construction of the actual data centers themselves, with increasing modularity and physical efficiency of power and cooling.
The new is the incredible momentum I see behind Data Center Infrastructure Management software. In a few short years, DCIM solutions have gone from simple aggregated viewing dashboards to complex software that understands tens of thousands of components, collects, filters and analyzes data from thousands of sensors in a data center (a single CRAC may have in excess of 20 sensors, a server over a dozen, etc.) and understands the relationships between components well enough to proactively raise alarms, model potential workload placement and make recommendations about prospective changes.
Of all the technologies reviewed in the document, DCIM offers one of the highest potentials for improving overall efficiency without sacrificing reliability or scalability of the enterprise data center. While the various DCIM suppliers are still experimenting with business models, I think that it is almost essential for any data center operations group that expects significant change, be it growth, shrinkage, migration or a major consolidation or cloud project, to invest in DCIM software. DCIM consumers can expect to see major competitive action among the current suppliers, and there is a strong potential for additional consolidation.
Last year I wrote about Oracle’s new plans for SPARC, anchored by a new line of SPARC CPUs engineered in conjunction with Fujitsu (Does SPARC have a Future?), and commented that the first deliveries of this new technology would probably be in early 2012, and until we saw this tangible evidence of Oracle’s actual execution of this road map we could not predict with any confidence the future viability of SPARC.
The T4 CPU
Fast forward a year and Oracle has delivered the first of the new CPUs, ahead of schedule and with impressive gains in performance that make it look like SPARC will remain a viable platform for years. Specifically, Oracle has introduced the T4 CPU and systems based on them. The T4, an evolution of Oracle’s highly threaded T-Series architecture, is implemented with an entirely new core that will form the basis, with variations in number of threads versus cores and cache designs, of the future M and T series systems. The M series will have fewer threads and more performance per thread, while the T CPUs will, like their predecessors, emphasize throughput for highly threaded workloads. The new T4 will have 8 cores, and each core will have 8 threads. While the T4 emphasizes highly threaded workload performance, it is important to note that Oracles has radically improved single-thread performance over its predecessors, with Oracle claiming performance per thread improvements of 5X over its predecessors, greatly improving its utility as a CPU to power less thread-intensive workloads as well.
At the Hot Chips conference last week, Intel disclosed additional details about the upcoming Poulson Itanium CPU due for shipment early next year. For Itanium loyalists (essentially committed HP-UX customers) the disclosures are a ray of sunshine among the gloomy news that has been the lot of Itanium devotees recently.
Poulson will bring several significant improvements to Itanium in both performance and reliability. On the performance side, we have significant improvements on several fronts:
Process – Poulson will be manufactured with the same 32 nm semiconductor process that will (at least for a while) be driving the high-end Xeon processors. This is goodness all around – performance will improve and Intel now can load its latest production lines more efficiently.
More cores and parallelism – Poulson will be an 8-core processor with a whopping 54 MB of on-chip cache, and Intel has doubled the width of the multi-issue instruction pipeline, from 6 to 12 instructions. Combined with improved hyperthreading, the combination of 2X cores and 2X the total number of potential instructions executed per clock cycle by each core hints at impressive performance gains.
Architecture and instruction tweaks – Intel has added additional instructions based on analysis of workloads. This kind of tuning of processor architectures seldom results in major gains in performance, but every small increment helps.
We have been repeatedly reminded that the requirements of hyper-scale cloud properties are different from those of the mainstream enterprise, but I am now beginning to suspect that the top strata of the traditional enterprise may be leaning in the same direction. This suspicion has been triggered by the combination of a recent day in NY visiting I&O groups in a handful of very large companies and a number of unrelated client interactions.
The pattern that I see developing is one of “haves” versus “have nots” in terms of their ability to execute on their technology vision with internal resources. The “haves” are the traditional large sophisticated corporations, with a high concentration in financial services. They have sophisticated IT groups, are capable fo writing extremely complex systems management and operations software, and typically own and manage 10,000 servers or more. The have nots are the ones with more modest skills and abilities, who may own 1000s of servers, but tend to be less advanced than the core FSI companies in terms of their ability to integrate and optimize their infrastructure.
The divergence in requirements comes from what they expect and want from their primary system vendors. The have nots are companies who understand their limitations and are looking for help form their vendors in the form of converged infrastructures, new virtualization management tools, and deeper integration of management software to automate operational tasks, These are people who buy HP c-Class, Cisco UCS, for example, and then add vendor-supplied and ISV management and automation tools on top of them in an attempt to control complexity and costs. They are willing to accept deeper vendor lock-in in exchange for the benefits of the advanced capabilities.
I recently had an opportunity to spend some time with SUSE management, including President and General Manager Nils Brauckmann, and came away with what I think is a reasonably clear picture of The Attachmate Group’s (TAG) intentions and of SUSE’s overall condition these days. Overall, impressions were positive, with some key takeaways:
TAG has clarified its intentions regarding SUSE. TAG has organized its computer holdings as four independent business units, Novell, NetIQ, Attachmate and SUSE, each one with its own independent sales, development, marketing, etc. resources. The advantages and disadvantages of this approach are pretty straightforward, with the lack of opportunity to share resources aiming the business units for R&D and marketing/sales being balanced off by crystal clear accountability and the attendant focus it brings. SUSE management agrees that it has undercommunicated in the past, and says that now that the corporate structure has been nailed down it will be very aggressive in communicating its new structure and goals.
SUSE’s market presence has shifted to a more balanced posture. Over the last several years SUSE has shifted to a somewhat less European-centric focus, with 50% of revenues coming from North America, less than 50% from EMEA, and claims to be the No. 1 Linux vendor in China, where it has expanded its development staffing. SUSE claims to have gained market share overall, laying claim to approximately 30% of WW Linux market share by revenue.
Focus on enterprise and cloud. Given its modest revenues of under $200 million, SUSE realizes that it cannot be all things to all people, and states that it will be focusing heavily on enterprise business servers and cloud technology, with less emphasis on desktops and projects that do not have strong financial returns, such as its investment in Mono, which it has partnered with Xamarin to continue development,.
Over the past months server vendors have been announcing benchmark results for systems incorporating Intel’s high-end x86 CPU, the E7, with HP trumping all existing benchmarks with their recently announced numbers (although, as noted in x86 Servers Hit The High Notes, the results are clustered within a few percent each other). HP recently announced new performance numbers for their ProLiant DL980, their high-end 8-socket x86 server using the newest Intel E7 processors. With up to 10 cores, these new processors can bring up to 80 cores to bear on large problems such as database, ERP and other enterprise applications.
The performance results on the SAP SD 2-Tier benchmark, for example, at 25160 SD users, show a performance improvement of 35% over the previous high-water mark of 18635. The results seem to scale almost exactly with the product of core count x clock speed, indicating that both the system hardware and the supporting OS, in this case Windows Server 2008, are not at their scalability limits. This gives us confidence that subsequent spins of the CPU will in turn yield further performance increases before hitting system of OS limitations. Results from other benchmarks show similar patterns as well.
Key takeaways for I&O professionals include:
Expect to see at least 25% to 35% throughput improvements in many workloads with systems based on the latest the high-performance PCUs from Intel. In situations where data center space and cooling resources are constrained this can be a significant boost for a same-footprint upgrade of a high-end system.
For Unix to Linux migrations, target platform scalability continues become less of an issue.
Not to be left out of the announcement fever that has gripped vendors recently, Cisco today announced several updates to their UCS product line aimed at easing potential system bottlenecks by improving the whole I/O chain between the network and the servers, and improving management, including:
Improved Fabric Interconnect (FI) – The FI is the top of the UCS hardware hierarchy, a thinly disguised Nexus 5xxx series switch that connects the UCS hierarchy to the enterprise network and runs the UCS Manager (UCSM) software. Previously the highest end FI had 40 ports, each of which had to be specifically configured as Ethernet, FCoE, or FC. The new FI, the model 6248UP has 48 ports, each one of which can be flexibly assigned as up toa 10G port for any of the supported protocols. In addition to modestly raising the bandwidth, the 6248UP brings increased flexibility and a claimed 40% reduction in latency.
New Fabric Extender (FEX) – The FEXC connects the individual UCS chassis with the FI. With the new 2208 FEX, Cisco doubles the bandwidth between the chassis and the FI.
VIC1280 Virtual Interface Card (VIC) – At the bottom of the management hierarchy the new VIC1280 quadruples the bandwidth to each individual server to a total of 80 GB. The 80 GB can be presented as up to 8 10 GB physical NICs or teamed into a pair fo 40 Gb NICS, with up to 256 virtual devices (vNIC, vHBA, etc presented to the software running on the servers.
While NVIDIA and to a lesser extent AMD (via its ATI branded product line) have effectively monopolized the rapidly growing and hyperbole-generating market for GPGPUs, highly parallel application accelerators, Intel has teased the industry for several years, starting with its 80-core Polaris Research Processor demonstration in 2008. Intel’s strategy was pretty transparent – it had nothing in this space, and needed to serve notice that it was actively pursuing it without showing its hand prematurely. This situation of deliberate ambiguity came to an end last month when Intel finally disclosed more details on its line of Many Independent Core (MIC) accelerators.
Intel’s approach to attached parallel processing is radically different than its competitors and appears to make excellent use of its core IP assets – fabrication and expertise and the x86 instruction set. While competing products from NVIDIA and AMD are based on graphics processing architectures, employing 100s of parallel non-x86 cores, Intel’s products will feature a smaller (32 – 64 in the disclosed products) number of simplified x86 cores on the theory that developers will be able to harvest large portions of code that already runs on 4 – 10 core x86 CPUs and easily port them to these new parallel engines.