Posted by James Kobielus on January 30, 2009
Welcome to the life of a data warehousing (DW) industry analyst. I’m often asked by Information and Knowledge Management (I&KM) professionals to address the perennial question of which commercial DW solution is fastest or most scalable. Vendors ask me too, of course, as they attempt to suss out rivals’ limitations and identify their own competitive advantages.
It’s always difficult for me to provide I&KM pros and their vendors with simple answers to such queries. Benchmarking is the blackest of black arts in the DW arena. It’s intensely sensitive to myriad variables, many of which may not be entirely transparent to all parties involved in the evaluation. It’s intensely political, because the answer to that question can influence millions of dollars of investments in DW solutions. And it’s a slippery slope of overqualified assertions that may leave no one confident that they’re making the right decisions. Yes, I’m as geeky as the next analyst, but I myself feel queasy when a sentence coming out of my mouth starts to run on with an unending string of conditional clauses.
If we industry analysts offer any value-add in the DW arena’s commentary cloud, it is that we can at least clarify the complexities. Here is how I frame the benchmarking issues that should drive I&KM pros’ discussions with DW vendors:
As an entirely separate issue, it does no good, competitively, for a DW vendor to assert performance enhancements that are only relative to a prior configuration of a prior version of its own product or technology. The customer has no easy assurance that the vendor is comparing its current solution against a well-configured, well-engineered example of the prior solution. The vendor’s assertion of order-of-magnitude improvement over a prior version of its own product may be impressive, but only as a statement of how much the vendor has improved its own technology, not how it fares against the competition. And such “past-self-comparisons” can easily backfire on the vendor, as customers and competitors may use them to insinuate that there were significant flaws or limitations in the vendor’s legacy products.
Here’s my bottom-line advice to all DW vendors on positioning your performance assertions. Frame them in the context of the architectural advantages of your specific DW technical approach. Publish your full benchmark numbers with test configurations, scenarios, and cases explicitly spelled out. To the extent that you can aggregate hundreds of terabytes of data, serve thousands of concurrent users and queries, process complex mixtures of queries, joins, and aggregations, ensure subsecond ingest-to-query latencies, and support continuous, high-volume, multiterabyte batch data loading, call all of that out in your benchmarks. To the extent that any or all of that is in your roadmap, call that out, too.
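For the geekier readers, here is one rough way to picture "explicitly spelled out." It is only a sketch in Python, not any standard or vendor format: the `BenchmarkDisclosure` record, its field names, and the `undisclosed` helper are hypothetical names invented for this illustration, and the sample figures are made up. The idea is simply that every dimension named above either carries a published value or shows up as a gap.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class BenchmarkDisclosure:
    """Hypothetical checklist of what a published DW benchmark spells out.

    A field left as None means the vendor did not disclose that dimension.
    """
    test_configuration: Optional[str] = None        # hardware, software, tuning
    data_volume_tb: Optional[float] = None           # aggregate data size
    concurrent_users: Optional[int] = None           # simultaneous users/queries
    query_mix: Optional[str] = None                  # queries, joins, aggregations
    ingest_to_query_latency_s: Optional[float] = None
    batch_load_tb: Optional[float] = None            # batch load volume


def undisclosed(report: BenchmarkDisclosure) -> List[str]:
    """Return the names of the dimensions the report leaves unstated."""
    return [f.name for f in fields(report) if getattr(report, f.name) is None]


# Illustrative values only; they are not drawn from any real benchmark.
report = BenchmarkDisclosure(data_volume_tb=200.0, concurrent_users=1000)
print(undisclosed(report))
```

A reader handed a claim like this one would see at a glance that the test configuration, query mix, latency, and load figures are all missing, which is exactly where the follow-up questions to the vendor should start.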
Here’s my bottom-line advice to I&KM pros: Don’t expect easy answers. Think critically about all vendor-reported DW benchmarks. And recognize that no one DW platform can possibly be configured optimally for all requirements, transactions, and applications.
If there were any such DW platforms, I’d be aware of them. Know any?