Hadoop: Is It Soup Yet?

James Kobielus

Most Hadoop-related inquiries from Forrester clients come to me. These have moved well beyond the “What exactly is Hadoop?” phase to the stage where the dominant query is “Which vendors offer robust Hadoop solutions?”

What I tell Forrester clients is that, yes, Hadoop is real, but that it’s still quite immature. On the “real” side, Hadoop has already been adopted by many companies for extremely scalable analytics in the cloud. On the “immature” side, Hadoop is not ready for broader deployment in enterprise data analytics environments until the following things happen:

  • More enterprise data warehousing (EDW) vendors adopt Hadoop. Of the vendors in my recent Forrester Wave™ for EDW platforms, only IBM and EMC Greenplum have incorporated Hadoop into the core of their solution portfolios. Other leading EDW vendors interface with Hadoop only partially and only at arm’s length. We strongly believe that Hadoop is the nucleus of the next-generation cloud EDW, but that promise is still three to five years from fruition. It’s likely that most EDW vendors will embrace Hadoop more fully in the coming year, with strategic acquisitions the likely route.
  • Early implementers converge on a core Hadoop stack. The companies I’ve interviewed as case studies indicate that the only common element in Hadoop deployments is the use of MapReduce as the modeling abstraction layer. We can’t say Hadoop is ready-to-serve soup until we all agree to swirl some common ingredients into the bubbling broth of every deployment. And the industry should clarify the reference framework within which new Hadoop specs are developed.

HP Acquiring Vertica, Ramping Steadily Back Into Enterprise Data Warehousing Market

James Kobielus

So you thought Neoview’s demise signaled HP’s exit from the enterprise data warehousing (EDW) market? You could not have been more mistaken. 

Yes, HP recently abandoned that slow-motion train wreck after several years of anemic customer adoption and directionless product management. But you should regard the unlamented Neoview as little more than a blip in HP’s long-running campaign to deepen its presence in all things EDW. And you should consider today’s announcement that it is acquiring Vertica Systems as a key building block in HP’s emerging new strategy in the EDW market. It remains to be seen what that strategy is, inasmuch as new HP CEO Leo Apotheker has not yet articulated a clear vision. Perhaps he’s playing it close to the vest so as not to call attention to further acquisitions in adjacent segments, such as business intelligence (BI) and data integration (DI).

Waves of Innovation Continue to Transform Enterprise Data Warehousing Market

James Kobielus

Every Forrester Wave is an in-depth snapshot of an entire vendor market segment at a particular point in time. 

Yesterday was that point in time for the latest update to the Forrester Wave for Enterprise Data Warehousing (EDW) Platforms. We just published this update after a grueling 8-month process of revisiting the criteria, scales, and weights associated with the most differentiating features of EDW vendors’ complex solution value propositions. Clearly, the EDW market has evolved considerably in the 2 years since we published the first installment of the Wave. 

At the highest level of analysis, what’s new? For starters, more vendors met the crucial inclusion criterion of having at least 100 customers with in-production EDW deployments. The field this time around included all seven vendors in the 2009 Wave, plus Greenplum (now part of EMC) and Vertica. Clearly, big-brand EDW mergers and acquisitions consolidated the original seven down to five, as IBM acquired Netezza and SAP purchased Sybase. Please bear in mind that, as these acquisitions were reasonably recent, we chose to evaluate the acquired vendors’ offerings separately from their new parents’ pre-acquisition EDW offerings in this latest Wave. It’s worth noting that neither IBM nor SAP has the slightest intention of discontinuing its pre-acquisition EDW portfolio; if anything, both are now evolving those product families even more aggressively post-acquisition.

Predictions And Plans For Business Analytics In 2011

James Kobielus

I love reporters. As someone with an M.A. in journalism who then evolved into an analyst, I recognize that both professions occupy approximately the same tier in the industry food chain. In fact, many IT industry analysts were trade press reporters at one point in their careers, and it’s not uncommon for analysts to go back into media institutions later on.

When great longtime IT reporters, such as Computerworld’s Jaikumar Vijayan, call me up to get my thoughts, I’m just as interested in their take on what’s important. Jai recently published an excellent article with my predictions, plus those of another analyst, on the year ahead in analytics. To the jaded reader, these sorts of year-end look-ahead articles may feel like perfunctory rehashes of stuff we’ve been telling them for quite some time, perhaps with a trendy new buzzword thrown in to keep it remotely glance-worthy.

I try not to repeat myself too much. Rather than regurgitate the statements I made in the phone interview with Jai, I’ll highlight how I’m addressing the principal business-analytics trends that I discussed with him (self-service, pervasive, social, scalable, cloud, and real-time) in our 2011 Forrester research agenda:

Key Data Analytics Projects For Building Your Process Optimization Program

James Kobielus

Rome was not reinvented in a day. Your enterprise business processes won’t turn around overnight either. You’ll need to re-engineer processes while you continue to run an ongoing business concern — albeit one with many buried layers, some splendid ruins, and many construction projects that cause never-ending traffic snarls.

Business process optimization is not a project you can deliver in a fortnight, nor is it a specific architecture or business model. Rather, it’s an ongoing program under which you implement various transformative technical projects in order to enable greater agility, efficiency, and effectiveness throughout key processes.

What are the key components of a business process optimization program? Forrester recommends that you establish an ongoing initiative that involves all business stakeholders, at all levels in the organization. Just as important, you will need to establish tight collaboration between business stakeholders and the myriad change agents, business architects, process architects, business analysts, data stewards, and analytics professionals upon which the success of your optimization efforts depends.

Enterprises should establish cross-functional programs under which to prioritize business process optimization projects around the following key pillars:

What’s Next On The MDM Horizon?

Rob Karel

Many large organizations have finally “seen the light” and are trying to figure out the best way to treat their critical data as the trusted asset it should be. As a result, master data management (MDM) strategies, and the enabling architectures, organizational and governance models, methodologies, and technologies that support the delivery of MDM capabilities, are…in a word…HOT! But the concept of MDM - and the homegrown or vendor-enabled technologies that attempt to deliver that elusive “single version of truth”, “golden record”, or “360-degree view” - has been around for decades in one form or another (data warehousing, BI, data quality, EII, CRM, ERP, and others have all at one time or another promised to deliver that single version of truth).

The current market view of MDM has matured significantly over the past 5 years, and today many organizations are on their way to successfully delivering multi-domain/multi-form master data solutions across various physical and federated architectural approaches. But the long-term evolution of the MDM concept is far from over. There remains a tremendous gap between the limited business value most MDM efforts deliver today and what MDM and data management evangelists feel MDM is capable of delivering in terms of business optimization, risk mitigation, and competitive differentiation.

What will the next evolution of the MDM concept look like in the next 3, 5 and 10 years? Will the next breakthrough be one that’s focused on technology enablement? How about information architecture? Data governance and stewardship? Alignment with other enterprise IT and business strategies?

Is EMC Moving To Tackle Data Management Market?

Rob Karel

As I’m sure everyone has heard by now, EMC acquired data warehouse appliance vendor Greenplum. I don’t cover the data warehousing and analytics space; I leave that to my colleague Jim Kobielus, who discussed this acquisition on his blog. While many data warehousing and analytics thought leaders will debate the likelihood that this acquisition will spark a wave of consolidation in the DW/analytics space, I’d like to focus on what’s going through the mind of the acquirer: EMC. I was intrigued by this acquisition because EMC has been on my very short list of potential new entrants into the data management software space, especially concerning master data management (MDM) and data integration.

So to cut to the chase, in this blog post I’m going to recommend to EMC that they acquire both Informatica and TIBCO and challenge IBM for world domination of the information management market.  Here’s my thought process:

EMC hinting at data management interest for a while now

EMC, a $14+ billion information management powerhouse, has a product portfolio very focused on the unstructured content side of the information landscape. According to its latest 10-K, filed earlier this year, most of its business comes from its hardware and software Information Storage solutions ($10.7 billion), but significant business is also contributed by its VMware Virtual Infrastructure ($2+ billion), Enterprise Content Management and Archiving ($740 million), and RSA Information Security solutions ($606 million).

EMC Acquiring Greenplum, Paving the Way to Emergence of Virtualized Cloud Data Warehousing

James Kobielus

Suddenly but not surprisingly, EMC has jumped headlong into the data warehousing (DW) market via strategic acquisition. Now that the deal is in the works, it’s clear that EMC perceived a “low-hanging plum” in one of its established DW partners, and simply made the right offer at the right time.

The EMC/Greenplum deal signals that the DW market is probably moving into a new round of consolidations. Just as Oracle acquired Sun in part to offer fully one-stop integrated DW appliances (i.e., Oracle Exadata over Oracle Sun hardware), EMC comes from the other end of the telescope: acquiring a DW software vendor to layer on top of its hardware and possibly leverage its other software technologies at a later date. In this way, EMC now becomes one of the few DW appliance vendors that can provide a reasonably full stack of hardware and software from its own portfolio.

When I say “reasonably full” in this context, I’m referring to the hardware (EMC supplies storage, but will still rely on third parties to provide the servers and interconnects), database (Greenplum has extended/customized BizGreSQL), and query planning/optimization, data loading, and workload management (Greenplum has built its own technology in those three areas). In this regard, EMC/Greenplum will be one of several “integrated one-stop hardware/software stack” DW appliance vendors on the market: IBM, Oracle/Sun, and HP are others. Interestingly, all of these vendors also supply hardware to rival “hardware-less” DW appliance vendors, and it’s likely that EMC/Greenplum, like these companies, will offer its own optimized DW-appliance stack while offering an equivalent degree of hardware optimization for partners.

Commercializing Enterprise-Grade Hadoop: Tools For Harnessing Petabyte Analytics

James Kobielus

Hadoop is riding the hype wave right now. You’ll find many IT professionals who know just enough about Hadoop to be dangerous in a cocktail party setting, but not enough to respond comfortably to grilling from the chief technology officer or the geekier business executives.

If you’re slightly bewildered by all the buzz over this new technology with the funny-sounding moniker, you’re not alone. The official story is that Hadoop was the name of the inventor’s kid’s stuffed elephant. However, for most IT professionals, it could easily be an acronym for "Heck, Another Darn Obscure Open-Source Project." The fact that Hadoop, managed by Apache, includes subprojects with similarly opaque names — such as Pig, Hive, Chukwa, and ZooKeeper — contributes to the queasy feeling that this is an untamed menagerie of squealing beasties.

And if you’ve pegged Hadoop as an advanced analytics initiative to mine petabytes of unstructured information, prepare for further bewilderment. The Apache Hadoop project states that it develops open-source software for “reliable, scalable, distributed computing.” Yes, that’s true, but the better-informed among you may be puzzling over the linkages that people often draw between Hadoop, in-database analytics, and MapReduce.
