Data Discovery And Exploration - IBM Acquires Vivisimo

Today IBM announced its plans to acquire Vivisimo - an enterprise search vendor with big data capabilities. Our research shows that only 1% to 5% of all enterprise data is in a structured, modeled format that fits neatly into enterprise data warehouses (EDWs) and data marts. The rest of enterprise data (and we are not even talking about external data such as social media data, for example) may not be organized into structures that easily fit into relational or multidimensional databases.

There’s also a chicken-and-egg syndrome going on here. Before you can put your data into a structure, such as a database, you need to understand what’s out there and what structures do or may exist. But in order for you to explore the data in the first place, traditional data integration technologies require some structures to even start the exploration (tables, columns, etc.). So how do you explore something without a structure, without a model, and without preconceived notions? That’s where big data exploration and discovery technologies such as Hadoop and Vivisimo come into play. (There are many other vendors in this space as well, including Oracle Endeca, Attivio, and Saffron Technology. While these vendors may not directly compete with Vivisimo and all use different approaches and architectures, the final objective - data discovery - is often the same.)

Data exploration and discovery was one of our top 2012 business intelligence predictions. However, it’s only a first step in the full cycle of business intelligence and analytics. Once you discover a pattern using a product like Vivisimo, you may need to productionalize or persist your findings in a traditional DW and then build reports and dashboards for further analysis using traditional BI technologies. This is where IBM may be looking to integrate Vivisimo with its InfoSphere and Cognos products.

Comments

Agreed +

Boris,

Agree, especially with the many different methods of data discovery, and no doubt IBM is interested in integrating Vivisimo into current products or creating a new converged product.

I have never used Vivisimo myself, but I did review their public technology a couple of times with updates and was impressed at the time. However, strategic technology aside, it's quite possible that IBM acquired the company at least as much for customers and talent as for current technology. As long as it's pure speculation, the EU economy may have had an impact on timing. In addition, the techniques and technologies are maturing, with significant convergence of acronyms beginning to occur, not just in vendor products but also in deep-pocketed internal custom efforts with impressive teams and budgets seeking competitive differentiation and trying to avoid the negative aspects of commoditization. I am becoming more surprised at the latter recently, especially given the focus on the evaporation of IT by some analysts, adoption of cloud, and continued steady growth of many of the commoditized IT products in the enterprise. I believe it says far more about still untapped developing economies and low-hanging fruit for incumbents than anything about innovation or the future. And nothing about competitive advantage.

With regard to data structure, I am confident estimates are low even if wiggle room is generous for what is considered semi-structured, and it's growing fairly rapidly, although still not necessarily across the entire organization. Most of the efforts underway we are aware of are protected by NDA-- in prospective partner discussions, private conversations with many smaller but top tier consultants and vendors, and direct with larger organizations.

Regarding modeling: I am working on an internal paper today under NDA for top tier partnership discussions, and one line stands out in my mind about how data structure in adaptable EA is increasingly determining the fate of organizations, and even national economies. Missed opportunities to prevent enormous crises with proper data structure and analytics have literally altered the course of history in the past decade. On an individual level, the demise of quite a number of companies that no longer exist can be traced directly to how they managed data, or didn't. It's a painful thing to observe and study.

To the best of my awareness, very few boards seem to be aware of what's now possible, and fewer still are achieving rational potential. Within that minority, the amounts I've heard invested in totally customized platforms in the very large enterprise are just astronomical, and well beyond the ability of all but a very few, but as we know that too is changing very fast. Adaptive data management IP like ours does have the potential to change quite a bit about business models in the enterprise.

In terms of pure HPC horsepower, what the largest supercomputers in the world could achieve just a decade ago is now possible below the 7 figure range, and importantly due to adoption of highly structured data many functions beyond just finance are becoming possible that were impossible to achieve otherwise -- not just in scientific data, which is certainly true, but also organizational, performance, innovation, marketing, and strategy (algorithm and software efficiency is arguably as important now with perhaps even larger gains). Throw in sensors, AI techniques, and mobility and these are interesting times eh? Regards, MM

As always, great insights, thanks for contributing

Can 'Search' link ECM with structured world of BI?

Boris, just a thought: won't it take more time to understand the unstructured data and convert it to a structured format for better analysis? I believe that is why we are seeing a move towards analysing data in its own existing format through Enterprise Search tools plus a BI tool. It would also be interesting to enhance Enterprise Content Management platforms like FileNet with Enterprise Search functions, and then have the ECM (which would then have the required info indexed for integration) integrate with the structured world for integrated analytics.

Thanks
Pandian

Attivio vs Vivisimo

Boris,

Quick question: when Oracle bought Endeca, you predicted IBM might buy Attivio. I see Attivio as a broader product than Vivisimo, in that Attivio is data model agnostic and can be integrated with and used for BI, whereas I didn't see much BI relation to Vivisimo, though I could have missed it. I definitely see the enterprise search relation, but not so much BI. I wonder what your thoughts on this are. Do you think Vivisimo and Attivio are similar products from a BI perspective or just from a search perspective, and do you think acquiring Vivisimo was instead of Attivio, or maybe in addition to it down the road? Curious to know your thoughts on this if you don't mind.

Thanks for the insight.

Jason

Agree that Attivio was a better choice from the BI point of view, but I just don't think they are for sale.