Analytic Databases Power BI Boom

By James Kobielus

Analytic databases are the principal engines driving business intelligence (BI), delivering operational data into reports, dashboards, and ad-hoc queries.

Essential as they may be, analytic databases have been largely overlooked in the BI industry’s recent consolidation spree. Sitting at the core of data warehouses (DWs) everywhere, these data stores have been treated as mere plumbing rather than as differentiating platform components. Instead, most recent BI mergers have been driven by vendors’ desire to beef up their financial analytic applications, or add more sophisticated visualization, search, and other access-oriented features to their BI platforms.

Though often taken for granted, analytic databases will almost certainly become a key BI solution differentiator over the next several years. With the trend toward commoditization of core BI features, more vendors will distinguish their offerings through the speed, scalability, throughput, and mixed-workload support that only a well-tuned analytic database can provide. Every self-respecting BI vendor will boast that their analytic database can handle more concurrent users, process more complex multidimensional queries, load bulk data more rapidly, execute more compute-intensive transforms, and manage more massive data sets than the competition. Just as important, they’ll brag that they can do all this more cheaply than the next guy.

In an increasingly commoditized BI market, analytic price-performance is becoming the principal buying criterion. This trend is fueling the industry’s growing focus on analytic appliances, which are also called BI appliances or data warehousing (DW) appliances. Indeed, most of the leading BI vendors--SAP/Business Objects, IBM/Cognos, Oracle, Microsoft, and SAS Institute--either provide their own analytic appliances now or are developing appliance-based offerings, on their own or with partners. Though these vendors will continue to deliver BI/DW solutions as packaged software offerings, they all see the appeal of appliances as turnkey solutions for many customer requirements. Midmarket customers, in particular, are taking a keen interest in appliances, which provide them with quick-deployment, pre-optimized solutions and thereby relieve the burden on their limited technical staffs.

As analytic appliances become central to enterprises’ BI strategies, DW appliances will evolve into full-fledged BI platforms in their own right. Appliance vendors such as Teradata, HP, Netezza, Greenplum, DATAllegro, Dataupia, and ParAccel will expand their ability to run “in-database analytics” and other applications developed in-house, or by partners and customers. Appliance vendors will outdo each other in tuning database features--such as indexing, partitioning, in-memory caching, compression, cubing, tokenization, and query-plan optimization--that are geared for managing myriad analytic workloads. And every appliance vendor will beef up their hardware’s scalability through massively parallel processing, clustering, workload management, and other ongoing enhancements.

In addition, every vendor of column-oriented databases--which are exquisitely well-suited to data-intensive query processing--will soon either realign its go-to-market strategy around appliances or get out of the analytics market altogether. The performance advantages of a hardware-optimized column-oriented database over software-only rivals will be too pronounced for the latter to hold onto their market share. And though most appliance vendors currently eschew column-oriented approaches, preferring to tweak traditional row-oriented RDBMSs for multidimensional online analytical processing (OLAP), many will explore this alternative technique in order to eke out further performance improvements.
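For readers who want a feel for why column orientation suits data-intensive queries, here is a toy Python sketch. It is only an illustration under stated assumptions: the sales records, field names, and function names are all hypothetical, and no vendor's engine is implemented this way. It shows that an aggregate over one field has to walk every full record in a row layout, but reads a single list in a column layout.

```python
# Hypothetical toy data: three sales records with three fields each.
rows = [
    {"region": "east", "units": 3, "price": 10.0},
    {"region": "west", "units": 5, "price": 12.0},
    {"region": "east", "units": 2, "price": 10.0},
]

# Column-oriented layout: one list per field instead of one record per row.
columns = {name: [record[name] for record in rows] for name in rows[0]}


def total_units_row_store(table):
    """Sum 'units' by visiting every record, including unused fields."""
    return sum(record["units"] for record in table)


def total_units_column_store(table):
    """Sum 'units' by reading only the one column the query needs."""
    return sum(table["units"])


if __name__ == "__main__":
    print(total_units_row_store(rows))
    print(total_units_column_store(columns))
```

Both functions return the same total. The difference is in how much data each must touch, and that gap is what grows as tables get wider and longer.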

The growing demand for cheap analytic horsepower will also foster the development of subscription-based DW services, also known as “DW 2.0,” “Database 2.0,” “cloud databases,” and “on-demand databases.” Though not the first entrant in this new arena, Microsoft is the most prominent, having recently rolled out a limited beta of its hosted SQL Server Data Services (SSDS), which is slated for full production release in 2009. Under SSDS, Microsoft hosts a subset of SQL Server’s relational database management system (RDBMS) functionality in support of analytics as well as transactional applications. Though it has not yet specifically optimized SSDS for analytics, Microsoft has stated that it plans to evolve the service in that direction.

As it becomes available from many service providers, DW 2.0 will offer an ever-expanding supply of cheap, plentiful analytic horsepower. Over the coming decade, software-as-a-service (SaaS) providers will begin to offer feature-complete, subscription-based BI/DW services for high-performance, high-volume, complex analytics. These clouds will leverage the full virtualized, distributed, scalable, grid-computing fabric that Microsoft, Google, and other SaaS behemoths can bring to bear on data mining, performance optimization, and other compute- and data-intensive tasks.

Over time, we’ll come to take DW 2.0 for granted. We’ll call it up on demand, a utility for processing any and all decision-support tasks, large or small, throughout the business world or in our daily lives.