There’s a tug of war going on in the world of BI. On one side we have IT, whose mission is to manage and protect enterprise information assets, and on the other side there are end users who just want the data when they want it, in the shape and form that they want it, without any limitations.
Traditional, mainstream BI vendors have catered primarily to an IT audience. These vendors will disagree, but take one look at their complex architectures, multiple layers and components, integration and support requirements, and you can’t help but agree that these are IT tools that can be used to create end user applications.
On the other hand, I am seeing an emergence of smaller BI vendors that cater directly to end users. They pitch simplicity, flexibility and little or no reliance on IT. True, these vendors do not have large enterprise functions like metadata, semantic layers, robust security and scalability, so I do not see them as enterprise-level, but rather as departmental, focused solutions. Yet the appeal to end users is undeniable.
Finding a compromise that satisfies all typical IT requirements while empowering end users remains an elusive goal, and hence an opportunity for all BI vendors.
I’d like to hear what my colleagues out there think about the convergence of structured and unstructured data in business intelligence. Here are the intersections as I see them. I see two types of BI paradigms emerging in the future:
Structured OLAP will continue to be just that – structured, as far as the process and UI are concerned. However, to become more effective, we will need to bring unstructured data into the analysis, in a way that is transparent to the end user. For example, as we are creating customer segmentation analysis for a marketing campaign, in addition to structured data such as customer demographics and prior buying behavior, we’d want to bring in comments hidden in customer email and voice mail requests. In an ideal environment, the OLAP engine will automatically match these emails to a customer dimension and quantify and qualify comments into star schema facts (number of requests) and dimensions (request types).
A combination of search and lightweight query, used for ad hoc research and analysis. Here, a familiar search text box should be the main UI. However, the engine should be smart enough to a) quantify and qualify unstructured results into facts and dimensions (a so-called guided search), or b) recognize that the request is actually about data stored in a structured repository and automatically return search results in OLAP, cross-tab or tabular report format.
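To make the first paradigm a bit more concrete, here is a minimal Python sketch of how unstructured emails might be matched to a customer dimension and turned into facts (number of requests) and a dimension (request type). Everything in it is an assumption made for illustration: the CUSTOMER_DIM and REQUEST_TYPES tables, the keyword matching, and the function names are hypothetical, and a real engine would use far more capable text analytics than keyword lookup.

```python
from collections import Counter

# Hypothetical customer dimension: sender email address -> customer key.
CUSTOMER_DIM = {
    "ann@example.com": 1,
    "bob@example.com": 2,
}

# Hypothetical request-type dimension, keyed by a trigger keyword.
REQUEST_TYPES = {
    "refund": "Refund",
    "cancel": "Cancellation",
    "upgrade": "Upgrade",
}


def classify(text):
    """Return the first request type whose keyword appears in the text."""
    lowered = text.lower()
    for keyword, request_type in REQUEST_TYPES.items():
        if keyword in lowered:
            return request_type
    return "Other"


def build_facts(emails):
    """Aggregate (sender, body) pairs into
    (customer_key, request_type) -> number of requests."""
    facts = Counter()
    for sender, body in emails:
        customer_key = CUSTOMER_DIM.get(sender.lower())
        if customer_key is None:
            continue  # senders not in the dimension are skipped in this sketch
        facts[(customer_key, classify(body))] += 1
    return dict(facts)
```

The resulting dictionary is shaped like a small fact table: its keys are foreign keys into the customer and request-type dimensions, and its values are the additive measure.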
For years I've been predicting that relational DBMS will run out of steam when it comes to effectively managing and manipulating very large, heterogeneous (structured + unstructured) data sets for business intelligence. First, RDBMS were never designed and optimized for unstructured data (not just XML, which is structured data in my definition, but truly unstructured text pages). Second, there's just too much overhead and cost in RDBMS for handling OLTP functions. The result: search index DBMS will be king in BI and DSS in the future.
Today’s announcements that Microsoft may be buying Yahoo came several weeks early. On May 17th I would’ve gone on the record at Forrester IT Forum in Nashville by saying the following, and I quote from my presentation paper: “DBMS/BI vendor may buy a search company, to address the trend of increasing importance of unstructured data in BI and to obtain an early leading position in the space. I know it should be Oracle or IBM, but it probably won’t, since these guys will never admit that their relational DBMS cannot do something. Microsoft is a more likely contender since they know they won’t leapfrog IBM or Oracle in relational DBMS and they could use this opportunity to stick one to Google too.”
I thought Microsoft would buy somebody like Fast Search, but I guess that was too small for them.
Remember George Costanza from a Seinfeld episode where he was pulling his hair out about “the two worlds colliding”? He was agonizing over the world of his girlfriends and the world of his friends that should never mix. In my world, process and data, separate disciplines until recently, are now “colliding”. While some vendors have already been toying with the convergence of both disciplines (IBM, Oracle, SAP), today’s announcement by Tibco that it will acquire Spotfire is the first transaction that will merge a pureplay middleware vendor with a pureplay BI vendor (a convergence that Forrester’s been predicting for almost a year, please see our Business Intelligence Meets BPM In The Information Workplace research document). But by acquiring Spotfire, Tibco has actually achieved more than one goal.
Being efficient is no longer enough. Enterprises can no longer stay competitive just by squeezing more efficiencies from operational applications, including workflow, business process management (BPM) and business rules engines (BRE); business intelligence applications are needed to become more effective. For example, while workflow and rules are used to efficiently process a customer credit application, business intelligence analytics are needed to effectively segment the customer population and extend the credit offer to a much more targeted customer segment, for better response and cross-sell/up-sell ratios.
The actual convergence of process and data. The other slant is the natural interdependency of process and data from two angles: a) one needs data to feed and enrich the business process and process rules, and b) an event (an alert, for example) triggered by a data condition has to go into a process so that it can be followed up and acted on.
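The second angle, a data condition raising an event that then enters a process, can be sketched in a few lines of Python. This is only an illustration under assumed names: the overdue_balance field, the threshold, and the in-memory queue standing in for a BPM work queue are all hypothetical and do not describe any vendor's product.

```python
from collections import deque


def route_alerts(rows, threshold, work_queue):
    """Scan data rows and, for each row that breaches the threshold,
    append a follow-up task to the process work queue."""
    for row in rows:
        if row["overdue_balance"] > threshold:
            work_queue.append({
                "customer_id": row["customer_id"],
                "action": "review account",
                "reason": "overdue balance above threshold",
            })
    return work_queue


# Usage: two customers, only one of which breaches the assumed threshold.
rows = [
    {"customer_id": 1, "overdue_balance": 50},
    {"customer_id": 2, "overdue_balance": 500},
]
queue = route_alerts(rows, threshold=100, work_queue=deque())
```

The point of the sketch is the hand-off: the analytical side only detects the condition, while the follow-up lives in the process side, where it can be assigned, tracked and acted on.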
I’ve recently conducted research on the issues of VLDBs (very large databases) and how they affect BI, since the challenges of reporting on and analyzing GB data sets are very different from those of reporting on and analyzing multi-TB data sets. Among many other interesting findings and conclusions, I uncovered the following approaches to handling VLDB challenges as they relate to BI:
Generic solutions by DBMS vendors:
Shared-everything vs. shared-nothing architecture
Caching, in-memory databases
Specialized file systems
Indexing (bitmap vs. B-tree)
BI-specific solutions by BI and other vendors:
ROLAP- and reporting-tool-specific SQL optimization
Alternate DBMS (such as search indexes and vector DBMS)
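As an aside on the indexing item in the list above, here is a minimal Python sketch of why bitmap indexes are attractive for BI-style filters on low-cardinality columns: each distinct value gets a bitmap of matching rows, and a multi-column predicate becomes a bitwise operation. The column data and function names are made up for illustration, and Python integers stand in for the compressed bitmaps a real DBMS would use.

```python
def build_bitmap_index(values):
    """Map each distinct value to an int whose bit i is set
    when row i holds that value."""
    index = {}
    for row_id, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index


def rows_of(bitmap):
    """Decode a bitmap back into an ascending list of row ids."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


# Two hypothetical low-cardinality columns over the same five rows.
region = ["East", "West", "East", "North", "West"]
segment = ["Gold", "Gold", "Silver", "Gold", "Silver"]

region_idx = build_bitmap_index(region)
segment_idx = build_bitmap_index(segment)

# WHERE region = 'East' AND segment = 'Gold' becomes one bitwise AND.
east_gold = rows_of(region_idx["East"] & segment_idx["Gold"])

# WHERE region = 'West' OR region = 'North' becomes one bitwise OR.
west_or_north = rows_of(region_idx["West"] | region_idx["North"])
```

A B-tree, by contrast, keeps keys in sorted order, which tends to suit high-cardinality columns and range lookups better than this kind of set combination.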
I predict that for the foreseeable future, spreadsheets will remain the most widely used enterprise application. The widespread adoption isn't hard to understand: spreadsheets are powerful and flexible, yet intuitive and easy to learn and use. Plus, ad hoc applications and spreadsheet models free users from constant reliance on IT and incur low costs. Since the early days of BI, spreadsheets have played a natural and major role in the BI process/architecture, including:
I've been in the BI business for over 25 years so I've seen many ups and downs in the BI cycle. We are definitely in the "up" cycle these days. I see two main reasons for it:
Enterprises can no longer stay competitive just by squeezing more efficiencies from operational applications – business intelligence applications are needed to become more effective.
Digital data volumes (structured and unstructured) are growing at 30% a year and will reach zettabyte sizes by 2010. That’s a number with 21 zeros! Solid BI implementations will be critical to successfully turning that data into useful information.
Does anyone have any comments on where we are in the BI cycle and what some of the more recent key drivers are?
Since Oracle never really competed toe to toe with IBM on applications and BI, the Hyperion acquisition is of less significance to IBM than to other BI vendors. Watch for Oracle to acquire BEA, TIBCO or Informatica to leapfrog IBM in the EAI or middleware space.
It would be logical for IBM or SAP to pick up Cognos (not Business Objects, since it is still going through multiple product integration challenges) as the next large BI acquisition. SAP will probably make the first move, and once that happens, IBM will look at Microstrategy or Information Builders as an alternative BI acquisition.
HP also clearly wants to be a BI player: it recently acquired a top boutique BI systems integrator, Knightsbridge, and developed an integrated data warehousing platform, Neoview, and its NonStop database is used in some of the largest DW implementations. We would not be surprised if the next large BI acquisition comes from HP.
An orthogonal move could come from EMC or Sun, who have been Information Management players for years, with BI being a natural addition/extension. Notably absent from the rumors is Teradata, which in our opinion has to diversify into more layers of the BI “stack” beyond data warehousing to keep its competitive position.
A clear implication of this acquisition is for Oracle’s pureplay BI competitors: Cognos, Business Objects, Microstrategy, SAS and Information Builders, since a combined Oracle/Hyperion BI offering with best-of-breed components in every layer of the BI “stack” will become increasingly difficult to beat. While many of these vendors were quick to issue statements that they view this transaction as an "opportunity they intend to take advantage of" and that they remain "clear leaders" in the space, it is very clear that they are, as they should be, very concerned about Oracle's new position.
The transaction also has potentially huge implications for Microsoft, which has been giving away its OLAP product, Analysis Services, as part of SQL Server. While Oracle is also packaging OLAP (Express) with its relational database, it was always considered a lower-end product than Microsoft’s. If Oracle decides to bundle Essbase as part of its overall database license, it could cut significantly into Microsoft’s OLAP market share.
It is unclear whether Oracle will integrate, keep separate, or drop one of its two clearly redundant multidimensional databases: Oracle Express (currently part of Oracle BI Server) and Essbase. However, if and when Oracle creates between Essbase and its relational database the same seamless integration it has always had between Express and its relational database, the result will truly be an awesome analytical database product that is hard to beat.
However, contrary to Oracle’s and Hyperion’s rosy statements of little if any product overlap, Oracle will face obvious and significant integration and product positioning challenges with a multitude of overlapping and redundant products: Essbase vs. Express, Hyperion data integration and reporting tools (formerly Brio) vs. Oracle’s (including the recently acquired Sunopsis), Hyperion Sales and Marketing Analytics vs. Siebel’s, plus some others.