I recently attended the second annual “Canonical Model Management Forum” at the Washington Plaza Hotel in Washington, DC (see here for my post about last year’s first meeting, including Forrester’s definition of canonical modeling). Enterprise and information architects from a number of government agencies attended the meeting, as did several of the major banks, insurance companies, retailers, credit-card operators, and other private-sector firms. There was one vendor sponsor (DigitalML, the maker of IgniteXML). Attendees gave a number of presentations about their environments, what had motivated them to establish a canonical model, how that work had turned out, and the important lessons learned.
Last year I also had recent Forrester survey results to share. We have not yet rerun that survey, but we are on the verge of doing so, and I’ll post key results once the data is available.
Last year’s post is still the place to go to get the general overview about why to do canonical modeling, the main use cases, some areas of controversy (still raging), and a list of best practices I heard attendees agree upon.
What’s New In 2011?
Based both on what I heard at this meeting and on other recent interviews:
Forrester is in the middle of a major research effort on various Big Data-related topics. As part of this research, we’ll be kicking off a client survey shortly. I’d like to solicit everyone’s input on the survey questions and answer options. Here’s the first draft. What am I missing?
Scope. What is the scope of your Big Data initiative?
Status. What is the status of your Big Data initiative?
Industry. Are the questions you are trying to address with your Big Data initiative general or industry-specific?
Domains. What enterprise areas does your Big Data initiative address?
Why Big Data? What are the main business requirements, or inadequacies of earlier-generation BI/DW/ETL technologies, applications, and architecture, that are causing you to consider or implement Big Data?
Velocity of change and scope/requirements unpredictability
Analysis-driven requirements (Big Data) vs. requirements-driven analysis (traditional BI/DW)
Cost. Big Data solutions are less expensive than traditional ETL/DW/BI solutions
“Big Data” is coming up more often on the agendas of key vendors as well as some of the more advanced users of information management technology. Although some of this increased activity reflects PR calendars — companies promote new offerings in the spring — there is more than that going on. The range of design patterns that falls under this large umbrella is genuinely growing across a wider range of usage scenarios, driving continuing innovation from both technology providers and users. In part because of the frequent use of open source technology such as Apache Hadoop to implement “Big Data,” this is the type of innovation the industry most needs at this early stage of the market. A few key data points:
I just attended a Big Data symposium courtesy of IBM and thought I’d share a few insights, as many of you have probably heard the term but are not sure what it means for you.
No. 1: Big Data is about looking out of the front window when you drive, not in the rearview mirror. What do I mean? The typical decision-making process goes something like this: capture some data, integrate it, analyze the clean and integrated data, make some decisions, execute. By the time you decide and execute, the data may be too old and have cost you too much. It’s a bit like driving by looking in your rearview mirror.
Big Data changes this paradigm by allowing you to iteratively sift through data at extreme scale in the wild and draw insights closer to real time. This is a very good thing, and companies that do it well will beat those that don’t.
No. 2: Big is not just big volume. The term “Big Data” is a misnomer, and it is causing some confusion. Several of us here at Forrester have been saying for a while that it is about the four “V’s” of data at extreme scale: volume, velocity, variety, and variability. I was relieved when IBM came up with three of them — variability being the one they left out.
Some of the most interesting examples we discussed centered on the last three V’s. We heard from a researcher who is collecting vital-sign data from premature babies and correlating changes in heart rate with early signs of infection. According to her, they collect 90 million data points per patient per day! What do you do with that stream of information? How do you use it to save lives? It is a Big Data problem.
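To put that 90-million figure in perspective, here is a quick back-of-envelope sketch (the per-second rate is my own arithmetic, not a number from the talk):

```python
# Back-of-envelope: what does 90 million data points per patient
# per day look like as a continuous streaming rate?
points_per_day = 90_000_000
seconds_per_day = 24 * 60 * 60  # 86,400 seconds in a day

points_per_second = points_per_day / seconds_per_day
print(f"~{points_per_second:,.0f} points/sec per patient")  # ~1,042
```

Roughly a thousand readings per second per patient — and that is before you multiply by the number of patients being monitored, which is exactly the kind of velocity that traditional batch-oriented BI/DW pipelines were never designed to absorb.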