NVIDIA recently shared a case study involving risk calculations at JP Morgan Chase that I think is significant both for the extreme levels of acceleration gained by integrating GPUs with conventional CPUs and as an illustration of a mainstream financial application of GPU technology.
JP Morgan Chase’s Equity Derivatives Group began evaluating GPUs as computational accelerators in 2009, and now runs over half of their risk calculations on hybrid systems containing x86 CPUs and NVIDIA Tesla GPUs, and claims a 40x improvement in calculation times combined with a 75% cost savings. The cost savings appear to be derived from a combination of lower capital costs to deliver an equivalent throughput of calculations along with improved energy efficiency per calculation.
Implicit in the 40x speedup, from multiple hours to several minutes, is that these calculations can become part of a near real-time business-critical analysis process instead of an overnight or daily batch process. Given the intensely competitive nature of derivatives trading, it is highly likely that JPMC will expand their use of GPUs as traders demand an ever-increasing number of these calculations. And of course, their competition has been using the same technology as well, based on numerous conversations I have had with Wall Street infrastructure architects over the past year.
My net take on this is that we will see a succession of similar announcements as GPUs become a fully mainstream acceleration technology as opposed to an experimental fringe. If you are an I&O professional whose users are demanding extreme computational performance on a constrained space, power and capital budget, you owe it to yourself and your company to evaluate the newest accelerator technology. Your competitors are almost certainly doing so.
The Nebula appliance announced today jumps right into this space and provides a standardized hardware configuration for OpenStack implementations. It offers scaled-out compute power based on commoditized x86 CPUs and standardizes a configuration of switches and other components to glue a large number of these CPUs together. The new VC-backed startup will thus compete head to head with EMC’s Vblock and Microsoft’s Azure appliance; neither of these is based on open source, and the latter isn’t really on the market yet.
But Nebula is more than just a hardware deliverable. Its mission is to transparently standardize the cloud hardware stack. Basically, it’s nothing more than the complex specification Microsoft worked out with its hardware partners (Dell, Fujitsu, and HP) to deliver the Azure appliance to local cloud providers and large-scale private clouds. However, Nebula’s openness is the differentiator; it reminds me a bit of IBM’s approach around the original personal computer back in the early 1980s. Sure, it enabled hardware competitors to produce compatible PCs, but it also brought mass adoption of the PC, outperforming Apple for three decades.
If Nebula delivers a compelling price point, it has an appealing approach that could gain significant share in the growing cloud hardware market. If the new company aims to spur a revolution similar to that of the PC, its founders need to tweak their strategy soon:
Over the past months server vendors have been announcing benchmark results for systems incorporating Intel’s high-end x86 CPU, the E7, with HP trumping all existing benchmarks with their recently announced numbers (although, as noted in x86 Servers Hit The High Notes, the results are clustered within a few percent of each other). HP recently announced new performance numbers for their ProLiant DL980, their high-end 8-socket x86 server using the newest Intel E7 processors. With up to 10 cores each, these new processors can bring up to 80 cores to bear on large problems such as database, ERP and other enterprise applications.
The performance results on the SAP SD 2-Tier benchmark, for example, at 25160 SD users, show a performance improvement of 35% over the previous high-water mark of 18635. The results seem to scale almost exactly with the product of core count x clock speed, indicating that both the system hardware and the supporting OS, in this case Windows Server 2008, are not at their scalability limits. This gives us confidence that subsequent spins of the CPU will in turn yield further performance increases before hitting system or OS limitations. Results from other benchmarks show similar patterns as well.
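The 35% figure can be checked with a quick calculation. The short Python sketch below uses only the two SD user counts quoted above; the function name is illustrative and not part of any SAP benchmark tooling.

```python
def percent_improvement(new: float, old: float) -> float:
    """Return the relative improvement of `new` over `old`, in percent."""
    return (new / old - 1.0) * 100.0


# SAP SD 2-Tier results quoted in the text (number of SD users).
current_sd_users = 25160   # HP ProLiant DL980 result
previous_sd_users = 18635  # previous high-water mark

gain = percent_improvement(current_sd_users, previous_sd_users)
print(f"Improvement: {gain:.1f}%")  # prints "Improvement: 35.0%"
```

The same helper could be applied to the other benchmark results if their before and after figures are available.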
Key takeaways for I&O professionals include:
Expect to see at least 25% to 35% throughput improvements in many workloads with systems based on the latest high-performance CPUs from Intel. In situations where data center space and cooling resources are constrained this can be a significant boost for a same-footprint upgrade of a high-end system.
For Unix to Linux migrations, target platform scalability continues to become less of an issue.
On July 5th, First Trust launched an exchange traded fund (ETF) designed to help investors capitalize on the growing market for cloud computing. I'd be excited about this sign of maturity for the market if the fund let you invest in the companies that are truly driving cloud computing, but most of them aren't publicly traded. Now don't get me wrong, there are clearly some cloud leaders in the ISE Cloud Index, such as Amazon, salesforce.com and Netflix, but many of the stocks in this fund are traditional infrastructure players who get a fraction (at most) of their revenues from cloud computing, such as Polycom, Teradata and Iron Mountain. The fund is a mix of cloud leaders, arms dealers and companies who are directionally heading toward the cloud - dare I say "cloudwashing" their traditional revenue streams.
The bigger question, though, is whether anyone should invest in this fund. Ignore the name, and why not? Many of these stocks are market leaders in their respective areas, so if you are looking for a good technology fund, this is probably as good as any.
Mark this date. While it isn't an anniversary of anything significant in the past, it is a day where our beloved cloud computing market showed significant signs of maturing. Major announcements by VMware, Citrix, and Microsoft all signaled significant progress in making cloud platforms (infrastructure-as-a-service [IaaS] and platform-as-a-service [PaaS]) more enterprise ready and consumable by I&O professionals.
* VMware updates its cloud stack. The server virtualization leader announced version 5 of its venerable hypervisor and version 1.5 of vCloud Director, its IaaS platform atop vSphere. Key enhancements to vCloud include more hardening of its security and resource allocation policy capabilities that address secure multitenancy concerns and elimination of the "noisy neighbor" problem, respectively. It also doubled the total number of VMs service providers can put in a single cloud to 20,000. VMware also resurrected a key feature from its now defunct Lab Manager: linked clones. This key capability for driving operational efficiency lets you deploy new VMs from the image library while the system maintains the relationship between the golden image and the deployed VM. This does two things: first, it minimizes the storage footprint of the VM, much as similar technology does in virtual desktops; second, it uses the link to ensure clones maintain the patch level and integrity of the golden master. This alone is reason enough to consider vCloud Director.
After considerable speculation and anticipation, VMware has finally announced vSphere 5 as part of a major cloud infrastructure launch, including vCloud Director 1.5, SRM 5 and vShield 5. From our first impressions, it is both well worth the wait and merits immediate serious consideration as an enterprise virtualization platform, particularly for existing VMware customers.
The list of features is voluminous, with at least 100 improvements, large and small, but several stand out as particularly significant as I&O professionals continue their efforts to virtualize the data center. These primarily deal with support for both larger VMs and larger physical host systems, and also with improved manageability of storage and improvements to Site Recovery Manager (SRM), the remote-site HA component:
Replication improvements for Site Recovery Manager, allowing replication without SANs
Distributed Resource Scheduling (DRS) for Storage
Support for up to 1 TB of memory per VM
Support for 32 vCPUs per VM
Support for up to 160 logical CPUs and 2 TB of RAM per host
New GUI to configure multicore vCPUs
Profile-driven storage delivery based on the VMware-Aware Storage APIs
Improved version of the Cluster File System, VMFS5
Storage APIs – Array Integration: Thin Provisioning enabling reclaiming blocks of a thin provisioned LUN on the array when a virtual disk is deleted
Swap to SSD
2TB+ LUN support
Storage vMotion snapshot support
vNetwork Distributed Switch improvements providing improved visibility in VM traffic
vCenter Server Appliance
vCenter Solutions Manager, providing a consistent interface to configure and monitor vCenter-integrated solutions developed by VMware and third parties
Revamped VMware High Availability (HA) with Fault Domain Manager
While NVIDIA and to a lesser extent AMD (via its ATI branded product line) have effectively monopolized the rapidly growing and hyperbole-generating market for GPGPUs, highly parallel application accelerators, Intel has teased the industry for several years, starting with its 80-core Polaris Research Processor demonstration in 2008. Intel’s strategy was pretty transparent – it had nothing in this space, and needed to serve notice that it was actively pursuing it without showing its hand prematurely. This situation of deliberate ambiguity came to an end last month when Intel finally disclosed more details on its line of Many Integrated Core (MIC) accelerators.
Intel’s approach to attached parallel processing is radically different from that of its competitors and appears to make excellent use of its core IP assets – fabrication expertise and the x86 instruction set. While competing products from NVIDIA and AMD are based on graphics processing architectures, employing 100s of parallel non-x86 cores, Intel’s products will feature a smaller number (32 – 64 in the disclosed products) of simplified x86 cores, on the theory that developers will be able to harvest large portions of code that already run on 4 – 10 core x86 CPUs and easily port them to these new parallel engines.
Back during the dot.com boom years, existing telcos and dozens of new network operators, especially in western Europe and North America, laid vast amounts of fiber optic networks in anticipation of rapidly rising Internet usage and traffic. When the expected volumes of Internet usage failed to materialize, they did not turn on or “light up” most (some estimate 80% and even 90% on many routes) of this fiber network capacity. This unused capacity was called “dark fiber,” and it has only been in recent years that this dark fiber has been put to use.
I am seeing early signs of something similar in the build-out of infrastructure-as-a-service (IaaS) cloud offerings. Of course, the data centers of servers, storage devices, and networks that IaaS vendors need can scale up in a more linear fashion (add another rack of blade servers as needed to support a new client) than the all-or-nothing build-out of fiber optic networks, so the magnitude of “dark cloud” will never reach the magnitude of “dark fiber.” Nonetheless, if current trends continue and accelerate, there is a real potential for IaaS wannabes to create a glut of “dark cloud” capacity that exceeds actual demand, with resulting downward pressure on prices and shakeouts of unsuccessful IaaS providers.
Recent Forrester inquiries from enterprise infrastructure and operations (I&O) professionals show that there's still significant confusion between infrastructure-as-a-service (IaaS) private clouds and server virtualization environments. As a result, there are a lot of misperceptions about what it takes to get your private cloud investments right and drive adoption by your developers. The answers may surprise you; they may even be the opposite of what you're thinking.
From speaking with Forrester clients who have deployed successful private clouds, we've found that your cloud should be smaller than you think, priced cheaper than the ROI math would justify and actively marketed internally - no, private clouds are not a Field of Dreams. Our latest report, "Q&A: How to Get Private Cloud Right," details this unconventional thinking, and you may find that internal clouds are much easier than you think.
First and foremost, if you think the way you operate your server virtualization environment today is good enough to call a cloud, you are probably lying to yourself. Per the Forrester definition of cloud computing, your internal cloud must be:
Highly standardized - meaning that the key operational procedures of your internal IaaS environment (provisioning, placement, patching, migration, parking and destroying) should all be documented and conducted the same way every time.
Highly automated - and to make sure the above standardized procedures are done the same way every time, you need to take these tasks out of human hands, where errors creep in, and hand them over to automation software.
What is one of the most important decisions infrastructure & operations (I&O) professionals face today? It's not whether to leverage the cloud or build a private cloud or even which cloud to use. The more important decision is which applications to place in the cloud, and sadly this decision isn't often made objectively. Application development & delivery professionals often decide on their own by bypassing IT. Even when the decision is made in the open, with all parts of IT and the business invited to collaborate, emotion and bravado often rule the day. "SAP's a total pain and a bloated beast, let's move that to the cloud," one CIO said to his staff recently. His belief was that if they could do that in the cloud, it would prove to the organization that they could move anything to the cloud. Sadly, while a big bang certainly would garner a lot of attention, the likelihood that this transition would be successful is extremely low, and a big bang effort that becomes a big disaster could sour your organization on the cloud and destroy IT's credibility. Instead, organizations should start with low-risk applications that let them learn safely how to best leverage the cloud, whether public or private.