As Forrester’s lead analyst on data warehousing (DW), my core job often involves diagnosing enterprise analytics practitioners’ DW aches and pains. I try to cultivate a reassuring bedside manner and give them something for both immediate and long-term relief from their problems.
I receive all Forrester customer inquiries on DW matters, many of which are from IT practitioners who have hit a wall of intractable technical, operational, or vendor-related issues. Those sessions usually involve me probing for the source of the IT practitioner’s DW-relevant woes. If all of these issues could be isolated to the DW itself, my life would be much easier. But customers’ DW concerns are often tangled into stubborn knots of business intelligence (BI), master data management (MDM), data integration, data governance, business process management, IT service management, and other critical infrastructure, operations, and application issues. Often a seemingly DW-based problem, such as poorly performing queries, turns out to have its root cause somewhere else entirely, and the DW itself is the least of their problems.
As I’m sure everyone has heard by now, EMC acquired the data warehouse appliance vendor Greenplum. I don’t cover the data warehousing and analytics space; I leave that to my colleague Jim Kobielus, who discussed this acquisition on his blog. While many data warehousing and analytics thought leaders will debate the likelihood that this acquisition will spark a wave of consolidation in the DW/analytics space, I’d like to focus on what’s going through the mind of the acquirer: EMC. I was intrigued by this acquisition because EMC has been on my very short list of potential new entrants into the data management software space, especially concerning master data management (MDM) and data integration.
So to cut to the chase, in this blog post I’m going to recommend to EMC that they acquire both Informatica and TIBCO and challenge IBM for world domination of the information management market. Here’s my thought process:
EMC hinting at data management interest for a while now
EMC, a $14+ billion information management powerhouse, has a product portfolio very focused on the unstructured content side of the information landscape. According to its latest 10-K, filed earlier this year, most of its business comes from its hardware and software Information Storage solutions ($10.7 billion), but significant business is also contributed by its VMware Virtual Infrastructure ($2+ billion), Enterprise Content Management and Archiving ($740 million), and RSA Information Security solutions ($606 million).
As an industry analyst, I’m part of the professional class that delights in defining standard marketplace terminology. More than that, many of us spend our working lives coaxing industry to march under marketing banners aligned with our pet definitions.
Yes, indeed, each analyst likes to feel that his or her marketecture terminology should rule the school. Last month I did a Forrester podcast on a topic that’s extremely hot right now: leveraging the power of social media and social networks to manage your brand, drive marketing and sales campaigns, and manage ongoing customer relationships. In that session, I discussed the role of analytics in social media for multichannel customer relationship management (CRM).
My initial impetus for the podcast was to spell out the chief distinctions between two terms that, at first glance, appear almost synonymous: social media analytics and social network analysis. During the podcast I also trucked in another closely related term, social media monitoring, and even alluded to social intelligence and other phrases that have gained currency.
What follows, for those of you who don’t listen to podcasts, or can’t find them, is the gist of what I said on this topic:
We all know that the war against the proliferation of spreadsheets (as BI or as any other application) in enterprises has been fought and lost. Gone are the days when BI and performance management vendors’ web sites carried a “let us come in and help you get rid of your spreadsheets” message in big bold letters on their front pages. In my personal experience – implementing hundreds of BI platforms and solutions – the more BI apps you deliver, the more spreadsheets you end up with. Rolling out a BI application often just means an easier way for someone to access and export data to a spreadsheet. Even though some of the in-memory analytics tools are beginning to chip away at the main reasons why spreadsheets in BI are so ubiquitous (self-service BI with no modeling or analysis constraints, and little to no reliance on IT), spreadsheets for BI are here to stay for a long, long, long time.
With that in mind, let me offer a few best practices for controlling and managing (not getting rid of!) spreadsheets as a BI tool:
Create a spreadsheet governance policy. Make it flexible – if it’s not, people will fight it. Here are a few examples of such policies:
- Spreadsheets can be used for reporting and analysis that support processes that do not go beyond individuals or small work groups, as opposed to cross-functional, cross-enterprise processes
- Spreadsheets can be used for reporting and analysis that are not part of mission-critical processes
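The two example policies above can be read as a simple decision rule. Here is a minimal sketch of that rule in Python, assuming you classify each reporting use by its process scope and by whether it is mission critical. The class name, field names, and scope labels are all hypothetical, invented for illustration; a real policy would need its own categories and exceptions.

```python
from dataclasses import dataclass


@dataclass
class ReportingUse:
    """Hypothetical description of one reporting or analysis use case."""
    name: str
    scope: str  # assumed labels: "individual", "workgroup", "cross_functional", "enterprise"
    mission_critical: bool


# Scopes that the example policies treat as acceptable for spreadsheets.
SPREADSHEET_SCOPES = {"individual", "workgroup"}


def spreadsheet_allowed(use: ReportingUse) -> bool:
    """Apply both example policies: small scope AND not mission critical."""
    return use.scope in SPREADSHEET_SCOPES and not use.mission_critical


if __name__ == "__main__":
    uses = [
        ReportingUse("team budget tracker", "workgroup", False),
        ReportingUse("regulatory filing", "enterprise", True),
    ]
    for use in uses:
        verdict = "spreadsheet OK" if spreadsheet_allowed(use) else "needs a governed BI app"
        print(f"{use.name}: {verdict}")
```

Keeping the rule this small is deliberate: a flexible policy is easier to defend when anyone can see exactly which two questions decide the outcome.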
Whoever said the BI market is commoditizing, consolidating, and getting very mature? Nothing could be further from the truth. On the buy side, Forrester still sees tons of less-than-successful BI environments, applications, and implementations, as demonstrated by Forrester's recent BI Maturity survey. On the vendor/sell side, Forrester also sees a flurry of activity, with startups, small vendors, and large, leading BI vendors constantly leapfrogging each other with every major and minor release.
In terms of the amount of BI activity that Forrester sees from our clients (from inquiries, advisories, and consulting), there’s no question that SAP BusinessObjects and IBM Cognos continue to dominate client interest. Over the past couple of years Microsoft has typically taken third place, SAS fourth, and Oracle a distant fifth. But ever since the Siebel and Hyperion acquisitions, the landscape has been changing, and we now often see Oracle jumping into third place, sometimes leapfrogging even Microsoft in the levels of monthly interest from Forrester clients.
Hadoop is riding the hype wave right now. You’ll find many IT professionals who know just enough about Hadoop to be dangerous in a cocktail party setting, but not enough for their own comfort to respond to grilling from the chief technology officer or the geekier business executives.
If you’re slightly bewildered by all the buzz over this new technology with the funny-sounding moniker, you’re not alone. The official story is that Hadoop was the name of the inventor’s kid’s stuffed elephant. However, for most IT professionals, it could easily be an acronym for "Heck, Another Darn Obscure Open-Source Project." The fact that Hadoop, managed by Apache, includes subprojects with similarly opaque names — such as Pig, Hive, Chukwa, and ZooKeeper — contributes to the queasy feeling that this is an untamed menagerie of squealing beasties.
And if you’ve pegged Hadoop as an advanced analytics initiative to mine petabytes of unstructured information, prepare for further bewilderment. The Apache Hadoop project states that it develops open-source software for “reliable, scalable, distributed computing.” Yes, that’s true, but the better-informed among you may be puzzling over the linkages that people often draw between Hadoop, in-database analytics, and MapReduce.