I recently had the opportunity to spend a day in three separate meetings with infrastructure & operations professionals from three of the top six financial services firms in the country, discussing topics ranging from long-term business and infrastructure strategy to specific likes and dislikes regarding their Tier-1 vendors and their challengers. The day's meetings were neither classic consulting nor classic briefings, but rather free-form discussions, guided only loosely by an agenda and, despite possible Federal regulations to the contrary, completely devoid of PowerPoint presentations. As in the past, these in-depth meetings provided a wealth of food for thought, along with interesting and sometimes contradictory indicators from the three groups. There was a lot of material to ponder, but I'll try to summarize some of the high-level takeaways in this post.
Servers and Vendors
Between them, these companies own in the neighborhood of 180,000 servers and probably purchase 30,000 - 50,000 servers per year in various procurement cycles. In short, these are heavyweight users. One thing that struck me in the course of the conversations was their Machiavellian view of their Tier-1 server vendors. While they view these vendors as key partners, the majority of this group of users also devotes a substantial amount of time to keeping those vendors at arm's length through aggressive vendor management techniques, such as deliberately splitting procurements between competitors. They understand their suppliers' margins and cost structures well, and they are committed to driving hardware supplier margins "as close to zero as we can," in the words of one participant.
Product strategists should check out this article in today’s New York Times about online borrowing. Think of it as a Web-empowered peer-to-peer product rental program. The article describes how Web sites like SnapGoods allow private owners of products to rent them out for temporary periods of time to consumers who want to use – but do not (or cannot) own – those same products. It’s a product rental marketplace, smaller than but resembling a product sales marketplace (like eBay).
This peer-to-peer product rental approach to sharing complements another sharing technique that has been around for a while: timesharing. Vacationers who own 1/8 of a condominium in the Bahamas get to use it part of the time, as do their fellow timeshare partners. More recently, the Web enabled Zipcar to grow to over 275,000 users by 2009. Zipcar users make reservations to use vehicles in their neighborhoods on an hourly basis.
There has been turmoil and angst in the open source community of late over Oracle's decision to cancel OpenSolaris. Since this community can be expected to react violently anytime something is taken out of open source, the real question is whether this action has any impact on real-world IT and operations professionals. The short answer is no.
Enterprise Solaris users, be they small, medium, or large, are using it to run critical applications, and as far as we can tell, the uptake of OpenSolaris, as opposed to Solaris supplied and sold by Sun, was very low in commercial accounts, other than possibly a surge in test and dev environments. The decision to take Solaris into the open source arena was, in my opinion, fundamentally flawed, and Oracle's subsequent decision to reverse course is eminently rational. Oracle's customers almost certainly are not going to run their companies on an OS that is built and maintained by an open source community (even the vast majority of corporate Linux use is via a distribution supported by a major vendor under a paid subscription model), and Oracle cannot continue to develop Solaris unless it has absolute control over it, just as is the case with every other enterprise OS. In the same vein, unless Oracle can also expect to be compensated for its investments in future Solaris development, there is little motivation for it to continue investing heavily in Solaris.
Yesterday, I participated in one of the regular content planning sessions for us analysts on Forrester's IT Infrastructure & Operations Research team. Much like investment managers with their portfolios of stocks or bonds, we spent time making buy/hold/sell decisions on what we will research more, continue to research, or stop researching. Among the many criteria we use to make these decisions, such as client readership, inquiries, and consulting, strategic relevance to IT is an important factor. And there was some heated debate around research themes we may phase out down the road…
Enter the discussion on IT asset disposition – the process of reselling, donating, or recycling end-of-life IT equipment. While every organization eventually has to dispose of its end-of-life IT equipment, disposal has long been an afterthought, and the data backs this up. Forrester finds that 80% of organizations globally use their OEM, third parties, or a combination of the two for IT asset disposition. But when asked how important IT asset disposition is relative to other IT asset management processes, respondents rank it far and away the least important. As an indicator, I recently surveyed over 300 European IT professionals, 77% of whom ranked IT asset disposition "less important" or "least important."
This raises the question: Is disposing of end-of-life IT equipment really strategic?
Historically, Dell has been positioned as something of an also-ran versus its two major competitors for high-value enterprise business, particularly where that business involves complex services and the ability to deliver deeply integrated infrastructure and management stacks. Competitors looked at Dell as a price spoiler and a channel for standard storage and networking offerings from its partners, not as a potential threat to the high ground of delivering complex integrated infrastructure solutions.
This comforting image of Dell as a glorified box pusher appears to be coming to an end. When my colleague Andrew Reichman recently wrote about Dell's attempted acquisition of 3PAR, it made me take another look at Dell's recent pattern of investments and the series of announcements they have made around delivering integrated infrastructure, with a message and solution offering that looks like it is aimed squarely at HP and IBM's Virtual Fabric.
Events are, and have been for quite some time, the fundamental elements of real-time IT infrastructure monitoring. Any change in status, threshold crossed in device usage, or step performed in a process generates an event that needs to be reported, analyzed, and acted upon by IT operations.
Historically, the lower layers of the IT infrastructure (i.e., network components and hardware platforms) have been regarded as the most prone to hardware and software failures and have therefore been the object of all attention and of most management software investments. In reality, today's failures are much more likely to come from applications and from the management of platform and application updates than from the hardware platforms themselves. This increased infrastructure complexity has resulted in a multiplication of events reported on IT management consoles.
Over the years, several solutions have been developed to extract the truth from the clutter of event messages. Network management pioneered solutions such as rule engines and codebook correlation, the idea being to determine, among a group of related events, the original straw that broke the camel's back. We then moved on to more sophisticated statistical and pattern analysis: using historical data, we can determine what is normal at any given time for a group of parameters. This not only reduces the number of events; it also eliminates false alerts and provides predictive analysis based on how parameter values evolve over time.
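To make the statistical approach concrete, here is a minimal sketch (my own illustration, not any particular vendor's algorithm) of a baseline that learns what "normal" looks like for a metric at a given hour of the day and only raises an event when a new sample deviates sharply from that history. The class name, thresholds, and sample data are all invented for the example.

```python
# Minimal baseline sketch: learn per-hour "normal" values for a metric and
# flag only large deviations, suppressing routine events. Illustrative only.
from collections import defaultdict
from statistics import mean, stdev

class HourlyBaseline:
    def __init__(self, threshold_sigmas=3.0):
        self.history = defaultdict(list)    # hour of day -> observed values
        self.threshold = threshold_sigmas

    def observe(self, hour, value):
        """Record a sample; return True if it deviates sharply from history."""
        samples = self.history[hour]
        anomalous = False
        if len(samples) >= 10:               # need some history before judging
            mu, sigma = mean(samples), stdev(samples)
            if sigma > 0 and abs(value - mu) > self.threshold * sigma:
                anomalous = True
        samples.append(value)
        return anomalous

# Invented usage: CPU utilization samples keyed by hour of day.
baseline = HourlyBaseline()
for hour, cpu in [(9, 40), (9, 42), (9, 41)] * 5 + [(9, 95)]:
    if baseline.observe(hour, cpu):
        print(f"alert: cpu={cpu}% is unusual for hour {hour}:00")
```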
The next step, which has been used in industrial process control and in business activities and is now finding its way into IT management solutions, is complex event processing (CEP).
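For readers who have not run into CEP before, the sketch below shows the core idea in a few lines of Python: low-level events are correlated inside a sliding time window, and only when a pattern completes does the engine emit a single higher-level event. The event names, window size, and pattern are hypothetical; real CEP engines offer far richer pattern languages than this toy.

```python
# Toy CEP-style correlation: emit one derived event when a pattern of
# low-level events completes inside a sliding time window.
from collections import deque

WINDOW_SECONDS = 60   # invented correlation window

def correlate(events):
    """events: iterable of (timestamp, name, host) tuples, in time order.
    Yields derived, higher-level events."""
    recent = deque()                        # low-level events still in the window
    for ts, name, host in events:
        recent.append((ts, name, host))
        while recent and ts - recent[0][0] > WINDOW_SECONDS:
            recent.popleft()
        # Pattern: three or more disk errors on a host, then that host goes down.
        if name == "node_down":
            disk_errors = sum(1 for _, n, h in recent
                              if n == "disk_error" and h == host)
            if disk_errors >= 3:
                yield (ts, "probable_disk_failure", host)

stream = [(0, "disk_error", "db01"), (10, "disk_error", "db01"),
          (20, "disk_error", "db01"), (45, "node_down", "db01")]
for derived in correlate(stream):
    print(derived)   # -> (45, 'probable_disk_failure', 'db01')
```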
In a recent discussion with a group of infrastructure architects, power architecture, especially UPS engineering, was on the table as a topic. There was general agreement that UPS systems are a necessary evil, cumbersome and expensive beasts to put into a data center, and there was a lot of speculation about alternatives. There was also general consensus that the goal was to develop a solution that would be more granular to install and deploy, and thus allow easier and ad-hoc decisions about which resources to protect, and agreement that battery technologies and current UPS architectures were not optimal for this kind of solution.
So what if someone were to suddenly expand battery technology R&D investment by a factor of maybe 100x, expand high-capacity battery production by a giant factor, and drive prices down precipitously? That's a tall order for today's UPS industry, but it's happening now courtesy of the auto industry and the anticipated wave of plug-in hybrid cars. While batteries for cars and batteries for computers certainly have their differences in terms of depth and frequency of charge/discharge cycles, packaging, lifespan, etc., there is little doubt that investments in dense and powerful automotive batteries and power management technology will bleed through into the data center. Throw in recent developments in high-charge capacitors (referred to in the media as "supercapacitors"), which provide the impedance match between the requirements of spike demands and a chemical battery's dislike of sudden state changes, and you have all the foundational ingredients for a major transformation in the way we think about supplying backup power to our data center components.
Over the past several months, I've been receiving a lot of questions about replication for continuity and recovery. One thing I've noticed, however, is that there is a lot of confusion around replication and its uses. To combat this, my colleague Stephanie Balaouras and I recently put out a research report called "The Past, Present, And Future Of Replication" where we outlined the different types of replication and their use cases. In addition to that, I thought it would be good to get some of the misconceptions about replication cleared up:
Myth: Replication is the same as high availability.
Reality: Replication can help to enable high availability and disaster recovery, but it is not a solution in and of itself. In the case of an outage, simply having another copy of the data at an alternate site isn't going to help if you don't have a failover strategy or solution. That said, some host-based replication products do come with integrated failover and failback capabilities.
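As a rough illustration of why the failover piece matters, here is a minimal sketch (entirely my own, with invented names and thresholds) of the decision a failover mechanism still has to make once the data has been replicated: is the primary actually down, and is any replica current enough to promote? Replication alone answers neither question.

```python
# Hypothetical sketch: replication gives you a copy of the data, but something
# still has to decide when to fail over and which copy is safe to promote.
MAX_REPLICA_LAG_SECONDS = 30   # invented recovery point tolerance

def choose_failover_target(primary_alive, replicas):
    """replicas: list of dicts like {'name': ..., 'lag_seconds': ...}."""
    if primary_alive:
        return None                       # primary is healthy, nothing to do
    # Only promote a replica whose copy of the data is recent enough.
    candidates = [r for r in replicas
                  if r["lag_seconds"] <= MAX_REPLICA_LAG_SECONDS]
    if not candidates:
        raise RuntimeError("replicas exist, but none is current enough to promote")
    return min(candidates, key=lambda r: r["lag_seconds"])["name"]

# Example: primary is down, the DR copy is 12 seconds behind -> promote it.
print(choose_failover_target(False, [{"name": "dr-site", "lag_seconds": 12}]))
```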
Myth: Replication is too expensive.
Reality: It's true that array-based replication has traditionally been expensive, because it requires like-to-like storage and additional licensing fees. However, two factors have mitigated this expense: 1) several storage vendors no longer charge an extra licensing fee for replication; and 2) there are several alternatives to array-based replication that allow you to use heterogeneous storage and come at a significantly lower acquisition cost. Replication products fall into one of four categories (roughly from most to least expensive):
In a dramatic move this morning, Dell announced its intention to purchase the innovative enterprise storage vendor 3PAR for $1.15 billion in cash. If there were any doubts remaining about Dell's commitment to be a force in the storage market alongside EMC, IBM, HDS, et al., this deal should put them to rest. Dell acquired iSCSI storage array vendor EqualLogic in November 2007 and clustered NAS vendor Exanet in February of this year, bought data deduplication vendor Ocarina this past July, and has also put together a partnership with object storage vendor Caringo. Clearly this is a significant list of deals, but the strategy was incomplete without an enterprise-class primary storage system of Dell's own. 3PAR, whose products generally compete with high-end systems in terms of performance and availability, will give Dell the ammunition it needs to go head-to-head with the big guys.
Dell has cultivated a relatively successful partnership with EMC for mid-range and enterprise storage for some years, but in spite of Dell's claim to be invested in that relationship going forward, this deal clearly puts pressure on it. Initially, there is a gap between the SMB-focused EqualLogic products and the high-end offerings from 3PAR, which will be filled by the CLARiiON products from EMC, but in the long run, Dell is likely to be motivated to move out of EMC's shadow and build its storage brand on proprietary products based on these acquisitions.
This deal ends a good deal of speculation about who might buy 3PAR, with HP having been the main alternative suitor. HP now faces a build-or-buy decision as it continues to try to redefine itself in storage amidst a patchwork of the aging EVA platform, partner technology from Hitachi on the high end, and acquisitions in iSCSI and clustered file storage, but no clearly defined long-term vision or anchor technology.
It's probably fair to say that the computer community is obsessed with speed. After all, people buy computers to solve problems, and generally the faster the computer, the faster the problem gets solved. The earliest benchmark I have seen was published in "High-Speed Computing Devices" (Engineering Research Associates, McGraw-Hill, 1950). It cites the Marchant desktop calculator as achieving a best-in-class result of 1,350 digits per minute for addition, and the threshold problems then were figuring out how to break down Newton-Raphson equation solvers for maximum computational efficiency. And so the race begins…
Not much has changed since 1950. While our appetites are now expressed in GFLOPS per CPU and TFLOPS per system, users continue to push for escalating performance on numerically intensive problems. Just as we settled into a relatively predictable performance model, with standard CPUs and cores glued into servers and aggregated into distributed computing architectures of various flavors, along came the notion of attached processors. First appearing in the 1960s and 1970s as attached mainframe vector processors and attached floating-point array processors for minicomputers, attached processors have always had a devoted and vocal minority of supporters within the industry. My own brush with them was as a developer using a Floating Point Systems array processor attached to a 32-bit minicomputer to speed up a nuclear reactor core power monitoring application. When all was said and done, the 50X performance advantage of the FPS box had decreased to about 3.5X for the total application. Not bad, but a defeat of expectations. Subsequent brushes with attempts to integrate DSPs with workstations left me a bit jaundiced about the future of attached processors as general-purpose accelerators.
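That erosion from 50X to roughly 3.5X is a textbook Amdahl's law effect, and a quick back-of-the-envelope check (my own arithmetic, not figures from the original project) shows why: if only part of an application's run time can be offloaded to the accelerator, the remainder caps the overall gain.

```python
# Amdahl's law: overall speedup when a fraction f of the run time is
# accelerated by a factor "accel" and the rest runs at its original speed.
def overall_speedup(f, accel=50.0):
    return 1.0 / ((1.0 - f) + f / accel)

# With a 50X attached processor, an observed total speedup of ~3.5X implies
# that only about 73% of the run time was actually accelerable.
for f in (0.50, 0.73, 0.90, 0.99):
    print(f"f = {f:.2f} -> {overall_speedup(f):.1f}X overall")
```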