Coming back from the SAS Industry Analyst Event left me with one big question: are we following up on the recommendations or insights provided through analysis to see if they actually produced positive or negative results?
It's a big question for data governance that I'm not hearing discussed around the table. We often emphasize how data is supplied, but how it performs in its consumed state is forgotten.
When leading business intelligence and analytics teams, I always pushed to create reports and analysis that ultimately incented action. What you know should influence behavior and decisions, even if the influence is to say, "Don't change, keep up the good work!" This should be a fundamental function of data governance. We need to care not only that the data is in the right form factor but also review what the data tells us, how we interpret it, and whether it made us better.
I've talked about the closed-loop from a master data management perspective - what you learn about customers will alter and enrich the customer master. The connection to data governance is pretty clear in this case. However, we shouldn't stop at raw data and master definitions. Our attention needs to include the data business users receive and if it is trusted and accurate. This goes back to the fact that how the business defines data is more than what exists in a database or application. Data is a total, a percentage, an index. This derived data is what the business expects to govern - and if derived data isn't supporting business objectives, that has to be incorporated into the data governance discussion.
When it comes to data investment, data management is still asking the wrong questions and positioning the wrong value. The mantra of "It's About the Business" is still a hard lesson to learn. It translates into what I see as the 7 Deadly Sins of Data Management. Here they are, in no particular order, each with an example:
Hubris: "Business value? Yeah, I know. Tell me something I don't know."
Blindness: "We do align to business needs. See, we are building a customer master for a 360 degree view of the customer."
Vanity: "How can I optimize cost and efficiency to manage and develop data solutions?"
Gluttony: "If I build this cool solution, the business is gonna love it!"
Alien: "We need to develop an in-memory system to virtualize data and insight that materializes through business services with our application systems...[blah, blah, blah]"
Beggar: "If only we were able to implement a business glossary, all our consistency issues would be solved!"
Educator: "If only the business understood! I need to better educate them!"
Hadoop’s momentum is unstoppable as its open source roots grow wildly into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze, and share big data. Forrester believes that Hadoop will become must-have infrastructure for large enterprises. If you have lots of data, there is a sweet spot for Hadoop in your organization. Here are five reasons firms should adopt Hadoop today:
Build a data lake with the Hadoop file system (HDFS). Firms leave potentially valuable data on the cutting-room floor. A core component of Hadoop is its distributed file system, which can store huge files and many files to scale linearly across three, 10, or 1,000 commodity nodes. Firms can use Hadoop data lakes to break down data silos across the enterprise and commingle data from CRM, ERP, clickstreams, system logs, mobile GPS, and just about any other structured or unstructured data that might contain previously undiscovered insights. Why limit yourself to wading in multiple kiddie pools when you can dive for treasure chests at the bottom of the data lake?
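The commingling idea above rests on schema-on-read: land raw records from any silo side by side, and let each consumer impose structure when it reads. A minimal sketch in Python (the in-memory "lake", source names, and field names are all illustrative assumptions, not HDFS itself):

```python
import json

# Toy "data lake": heterogeneous records from different silos stored
# side by side as raw JSON lines, each tagged with its source system.
lake = []

def ingest(source, records):
    """Land raw records as-is; no upfront schema is imposed."""
    for r in records:
        lake.append(json.dumps({"source": source, "payload": r}))

ingest("crm", [{"customer": "Acme", "segment": "retail"}])
ingest("clickstream", [{"url": "/home", "ms": 152}])

# Schema on read: a downstream consumer filters and shapes the data
# only at the moment it needs it.
crm_rows = [json.loads(x)["payload"] for x in lake
            if json.loads(x)["source"] == "crm"]
print(crm_rows)  # [{'customer': 'Acme', 'segment': 'retail'}]
```

The point of the sketch is the design choice, not the storage: ingestion never rejects a record for having the "wrong" shape, which is what lets CRM, clickstream, and log data share one lake.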
Enjoy cheap, quick processing with MapReduce. You’ve poured all of your data into the lake — now you have to process it. Hadoop MapReduce is a distributed data processing framework that brings the processing to the data, analyzing it in a highly parallel fashion. Instead of serially reading data from files, MapReduce pushes the processing out to the individual Hadoop nodes where the data resides. The result: Large amounts of data can be processed in parallel in minutes or hours rather than in days. Now you know why Hadoop’s origins stem from monstrous data processing use cases at Google and Yahoo.
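Hadoop itself runs this pattern in Java across many nodes, but the map/shuffle/reduce idea can be sketched locally in a few lines of Python; this word count is purely illustrative, not Hadoop's API:

```python
from collections import defaultdict
from itertools import chain

# Map phase: each input line is turned into (word, 1) pairs,
# independently of every other line — this is what parallelizes.
def map_phase(line):
    return [(word.lower(), 1) for word in line.split()]

# Shuffle + reduce phase: pairs are grouped by key and the counts summed.
def reduce_phase(pairs):
    counts = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)

lines = ["big data big insight", "data lake"]
result = reduce_phase(chain.from_iterable(map_phase(l) for l in lines))
print(result)  # {'big': 2, 'data': 2, 'insight': 1, 'lake': 1}
```

Because each `map_phase` call touches only its own line, Hadoop can push that work to whichever node already holds the data block, which is the "bring the processing to the data" point made above.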
For decades, firms have deployed applications and BI on independent databases and warehouses, supporting custom data models, scalability, and performance while speeding delivery. It’s become a nightmare to try to integrate the proliferation of data across these sources in order to deliver the unified view of business data required to support new business applications, analytics, and real-time insights. The explosion of new sources, driven by the triple-threat trends of mobile, social, and the cloud, amplified by partner data, market feeds, and machine-generated data, further aggravates the problem. Poorly integrated business data often leads to poor business decisions, reduces customer satisfaction and competitive advantage, and slows product innovation — ultimately limiting revenue.
Forrester’s latest research reveals how leading firms are coping with this explosion using data virtualization, leading us to release a major new version of our reference architecture, Information Fabric 3.0. Since Forrester invented the category of data virtualization eight years ago with the first version of information fabric, these solutions have continued to evolve. In this update, we reflect new business requirements and new technology options including big data, cloud, mobile, distributed in-memory caching, and dynamic services. Use information fabric 3.0 to inform and guide your data virtualization and integration strategy, especially where you require real-time data sharing, complex business transactions, more self-service access to data, integration of all types of data, and increased support for analytics and predictive analytics.
Information fabric 3.0 reflects significant innovation in data virtualization solutions, including:
I had a conversation recently with Brian Lent, founder, chairman, and CTO of Medio. If you don’t know Brian, he has worked with companies such as Google and Amazon to build and hone their algorithms and is currently taking predictive analytics to mobile engagement. The perspective he brings as a data scientist not only has ramifications for big data analytics, but drastically shifts the paradigm for how we architect our master data and ensure quality.
We discussed big data analytics in the context of behavior and engagement. Think shopping carts and search. At the core, analytics is about the “closed loop.” It is, as Brian says, a rinse and repeat cycle. You gain insight for relevant engagement with a customer, you engage, then you take the results of that engagement and put them back into the analysis.
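The rinse-and-repeat cycle above can be sketched in a few lines of Python. The weight, the engagement rule, and the update step are all illustrative assumptions, not Medio's actual method:

```python
# Closed loop: score -> engage -> fold the outcome back into the model.
weight = 0.5  # illustrative propensity weight for a recommendation

def engage(w):
    # Stand-in for a real engagement outcome:
    # 1 = customer responded, 0 = customer ignored the offer.
    return 1 if w > 0.4 else 0

for cycle in range(3):  # rinse and repeat
    outcome = engage(weight)
    # Feed the result of the engagement back into the analysis.
    weight += 0.1 * (outcome - weight)
```

Each pass nudges the model toward observed behavior, which is why the loop only works if engagement results are captured and routed back into the data that drives the next round of analysis.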
Sounds simple, but think about what that means for data management. Brian provided two principles:
In advance of next week’s Forrester European Business Technology Forums in London on June 10 and 11, we had an opportunity to speak with Greg Swimer about information management and how Unilever delivers real-time data to its employees. Greg Swimer is a global IT leader at Unilever, responsible for delivering new information management, business intelligence, reporting, consolidation, analytics, and master data solutions to more than 20,000 users across all of Unilever’s businesses globally.
1) What are the two forces you and the Unilever team are balancing with your “Data At Your Fingertips” vision?
Putting the data at Unilever’s fingertips means working on two complementary aspects of information management. One aspect is to build an analytics powerhouse with the capacity to handle big data, providing users with the technological power to analyse that data in order to gain greater insight and drive better decision-making. The other aspect is the importance of simplifying and standardizing that data so that it’s accessible enough to understand and act upon. We want to create a simplified landscape, one that allows better decisions, in real time, where there is a common language and a great experience for users.
2) What keys to success have you uncovered in your efforts?
There are multiple maturity models and associated assessments for Data Governance on the market. Some are from software vendors, or from consulting companies, which use these as the basis for selling services. Others are from professional groups like the one from the Data Governance Council.
They are all good – but frankly not adequate for the data economy many companies are entering. I think it is useful to reshuffle some too-well-established ideas...
Maturity models in general are attractive because:
- Using a maturity model is nearly a ‘no-brainer’ exercise. You run an assessment and determine your current maturity level. Then you can make a list of the actions which will drive you to the next level. You do not need to ask your business for advice, nor involve too many people for interviews.
- Most data governance maturity models are modeled on the well-known CMMI. That means they are similar at least in terms of structure and levels, so the debate over the advantages of one versus another comes down to level of detail.
But as firms move into the data economy – with all that this means for how they source, analyze, and leverage data – I think that today’s maturity models for data governance are becoming less relevant, and even an impediment:
I met with a group of clients recently on the evolution of data management and big data. One retailer asked, “Are you seeing the business going to external sources to do Big Data?”
My first reaction was, “NO!” Yet, as I thought about it more and went back to my own roots as an analyst, the answer is most likely, “YES!”
Ignoring nomenclature, the reality is that the business is not only going to external sources for big data, but it has been doing so for years. Think about it: organizations that have considered data a strategic tool have invested heavily in big data going back to when mainframes came into vogue. More recently, banking, retail, consumer packaged goods, and logistics have produced marquee case studies on what sophisticated data use can do.
Before Hadoop, before massive parallel processing, where did the business turn? Many have had relationships with market research organizations, consultancies, and agencies to get them the sophisticated analysis that they need.
Think about the fact, too, that at the beginning of social media, it was PR agencies that developed the first big data analysis and visualization of Twitter, LinkedIn, and Facebook influence. In a past life, I worked at ComScore Networks, an aggregator and market research firm analyzing and trending online behavior. When I joined, they had the largest and fastest growing private cloud to collect web traffic globally. Now, that was big data.
Today, the data paints a split picture. Across various surveys of IT, social media and online analysis make up a small percentage of the business intelligence and analytics being supported. However, when we look to the marketing and strategy clients at Forrester, there is a completely opposite picture.
When I posted a blog on Don’t Establish Data Management Standards (it also ran on Information Management's website as Data Management Standards are a Barrier) I expected some resistance. I mean, why post a blog and not have the courage to be provocative, right? However, I have to say I was surprised at the level of resistance. Although, I also have to point out that this blog was one of the most syndicated and recommended I have had, so I will assume there is a bit of agreement with it as well, since I didn't see any tweets qualifying me as completely crazy. Anyway, here are just a few dissenter comments:
“This article would be funny if it wasn't so sad...you can't do *anything* in IT (especially innovate) without standing on the shoulders of some standard.” – John O
“Show me data management without standards and good process to review and update them and I'll show you the mortgage crisis which developed during 2007.” – Jim F
“This article is alarmingly naive, detrimental, and counterproductive. Let me count the ways…” – Cynthia H
"No control leads to caos... I would be amused to watch the reaction of the ISO engineer while reading this article :)." - Eduardo G (I would too!)
After wiping the rotten tomatoes from my face from that, here are some points made that get to the nuance I was hoping to create a discussion on:
A recent survey of Enterprise Architects showed a lack of standards for data management.* Best practice has always been about the creation of standards for IT, which would lead us to think that a lack of standards for data management is a gap.
Not so fast.
Standards can help control cost. Standards can help reduce complexity. But, in an age when a data management architecture needs to flex and meet the business need for agility, standards are a barrier. The emphasis on standards is what keeps IT in a mode of constant foundation building, playing the role of deli counter, and focused on cost management.
In contrast, when companies throw off the straitjacket of data management standards, they are no longer challenged by the foundation. These organizations are challenged by ceilings. Top-performing organizations, those that have had annual growth above 15%, are working to keep the dam open, letting more data in and managing more variety. They are pushing the envelope on the technology that is available.
Think about this. Overall, organizations have made similar data management technology purchases. What separates top performers from the rest is that they have not been constrained. Top performers maximize and master the technology they invest in. They are now better positioned to do more, expand their architecture, and ultimately grow data value. For big data, they have stepped, or are getting ready to step, out of the sandbox. Other organizations have not seen enough value to invest more. They are in the sand trap.
Standards can help structure decisions and strategy, but they should never be barriers to innovation.
*203 Enterprise Architecture Professionals, State of Enterprise Architecture Global Survey Month, 2012
**Top performer organization analysis based on data from Forrsights Strategy Spotlight BI And Big Data, Q4 2012