No self-respecting EA professional would enter into planning discussions with business or tech management execs without a solid grasp of the technologies available to the enterprise, right? But what about the data available to the enterprise? Given the shift towards data-driven decision-making and the clear advantages from advanced analytics capabilities, architecture professionals should be coming to the planning table with not only an understanding of enterprise data, but a working knowledge of the available third-party data that could have significant impact on your approach to customer engagement or your B2B partner strategy.
Data discussions can't be simply about internal information flow, master data, and business glossaries any more. Enterprise architects, business architects, and information architects working with business execs on tech-enabled strategies need to bring third-party data know-how to their brainstorming and planning discussions. As the data economy is still in its relatively early stages and, more to the point, as organizational responsibilities for sourcing, managing, and governing third-party data are still in their formative states, it behooves architects to take the lead in understanding the data economy in some detail. By doing so, architects can help their organizations find innovative approaches to data and analytics that have direct business impact by improving the customer experience, making your partner ecosystem more effective, or finding new revenue from data-driven products.
Coming back from the SAS Industry Analyst Event left me with one big question - Are we taking into account the recommendations or insights provided through analysis and see if they actually produced positive or negative results?
It's a big question for data governance that I'm not hearing discussed around the table. We often emphsize how data is supplied, but how it performs in it's consumed state is fogotten.
When leading business intelligence and analytics teams I always pushed to create reports and analysis that ultimately incented action. What you know should influence behavior and decisions, even if the influence was to say, "Don't change, keep up the good work!" This should be a fundamental function of data govenance. We need to care not only that the data is in the right form factor but also review what the data tells us/or how we interpret the data and did it make us better?
I've talked about the closed-loop from a master data management perspective - what you learn about customers will alter and enrich the customer master. The connection to data governance is pretty clear in this case. However, we shouldn't stop at raw data and master definitions. Our attention needs to include the data business users receive and if it is trusted and accurate. This goes back to the fact that how the business defines data is more than what exists in a database or application. Data is a total, a percentage, an index. This derived data is what the business expects to govern - and if derived data isn't supporting business objectives, that has to be incorporated into the data governance discussion.
When it comes to data investment, data management is still asking the wrong questions and positioning the wrong value. The mantra of - It's About the Business - is still a hard lesson to learn. It translates into what I see as the 7 Deadly Sins of Data Management. Here are the are - not in any particular order - and an example:
Hubris: "Business value? Yeah, I know. Tell me something I don't know."
Blindness: "We do align to business needs. See, we are building a customer master for a 360 degree view of the customer."
Vanity: "How can I optimize cost and efficiency to manage and develop data solutions?"
Gluttony: "If I build this cool solutions the business is gonna love it!"
Alien: "We need to develop an in-memory system to virtualize data and insight that materializes through business services with our application systems...[blah, blah, blah]"
Begger: "If only we were able to implement a business glossary, all our consistency issues are solved!"
Educator: "If only the business understood! I need to better educate them!."
Hadoop’s momentum is unstoppable as its open source roots grow wildly into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze, and share big data. Forrester believes that Hadoop will become must-have infrastructure for large enterprises. If you have lots of data, there is a sweet spot for Hadoop in your organization. Here are five reasons firms should adopt Hadoop today:
Build a data lake with the Hadoop file system (HDFS). Firms leave potentially valuable data on the cutting-room floor. A core component of Hadoop is its distributed file system, which can store huge files and many files to scale linearly across three, 10, or 1,000 commodity nodes. Firms can use Hadoop data lakes to break down data silos across the enterprise and commingle data from CRM, ERP, clickstreams, system logs, mobile GPS, and just about any other structured or unstructured data that might contain previously undiscovered insights. Why limit yourself to wading in multiple kiddie pools when you can dive for treasure chests at the bottom of the data lake?
Enjoy cheap, quick processing with MapReduce. You’ve poured all of your data into the lake — now you have to process it. Hadoop MapReduce is a distributed data processing framework that brings the processing to the data in a highly parallel fashion to process and analyze data. Instead of serially reading data from files, MapReduce pushes the processing out to the individual Hadoop nodes where the data resides. The result: Large amounts of data can be processed in parallel in minutes or hours rather than in days. Now you know why Hadoop’s origins stem from monstrous data processing use cases at Google and Yahoo.
For decades, firms have deployed applications and BI on independent databases and warehouses, supporting custom data models, scalability, and performance while speeding delivery. It’s become a nightmare to try to integrate the proliferation of data across these sources in order to deliver the unified view of business data required to support new business applications, analytics, and real-time insights. The explosion of new sources, driven by the triple-threat trends of mobile, social, and the cloud, amplified by partner data, market feeds, and machine-generated data, further aggravates the problem. Poorly integrated business data often leads to poor business decisions, reduces customer satisfaction and competitive advantage, and slows product innovation — ultimately limiting revenue.
Forrester’s latest research reveals how leading firms are coping with this explosion using data virtualization, leading us to release a major new version of our reference architecture, Information Fabric 3.0. Since Forrester invented the category of data virtualization eight years ago with the first version of information fabric, these solutions have continued to evolve. In this update, we reflect new business requirements and new technology options including big data, cloud, mobile, distributed in-memory caching, and dynamic services. Use information fabric 3.0 to inform and guide your data virtualization and integration strategy, especially where you require real-time data sharing, complex business transactions, more self-service access to data, integration of all types of data, and increased support for analytics and predictive analytics.
Information fabric 3.0 reflects significant innovation in data virtualization solutions, including:
I had a conversation recently with Brian Lent, founder, chairman, and CTO of Medio. If you don’t know Brian, he has worked with companies such as Google and Amazon to build and hone their algorithms and is currently taking predictive analytics to mobile engagement. The perspective he brings as a data scientist not only has ramifications for big data analytics, but drastically shifts the paradigm for how we architect our master data and ensure quality.
We discussed big data analytics in the context of behavior and engagement. Think shopping carts and search. At the core, analytics is about the “closed loop.” It is, as Brian says, a rinse and repeat cycle. You gain insight for relevant engagement with a customer, you engage, then you take the results of that engagement and put them back into the analysis.
Sounds simple, but think about what that means for data management. Brian provided two principles:
In advance of next week’s Forrester’s European Business Technology Forums in London on June 10 and 11, we had an opportunity to speak with Greg Swimer about information management and how Unilever delivers real-time data to its employees. Greg Swimer is a global IT leader at Unilever, responsible for delivering new information management, business intelligence, reporting, consolidation, analytics, and master data solutions to more than 20,000 users across all of Unilever’s businesses globally.
1) What are the two forces you and the Unilever team are balancing with your “Data At Your Fingertips” vision?
Putting the data at Unilever’s fingertips means working on two complementary aspects of information management. One aspect is to build an analytics powerhouse with the capacity to handle big data, providing users with the technological power to analyse that data in order to gain greater insight and drive better decision-making. The other aspect is the importance of simplifying and standardizing that data so that it’s accessible enough to understand and act upon. We want to create a simplified landscape, one that allows better decisions, in real time, where there is a common language and a great experience for users.
2) What keys to success have you uncovered in your efforts?
There are multiple maturity models and associated assessments for Data Governance on the market. Some are from software vendors, or from consulting companies, which use these as the basis for selling services. Others are from professional groups like the one from the Data Governance Council.
They are all good – but frankly not adequate for the data economy many companies are entering into. I think it is useful to reshuffle some too well established ideas...
Maturity models in general are attractive because:
- Using a maturity model is nearly a ‘no-brainer’ exercise. You run an assessment and determine your current maturity level. Then you can make a list of the actions which will drive you to the next level. You do not need to ask your business for advice, nor involve too many people for interviews.
- Most data governance maturity models are modeled on the very well known CMMI. That means that they are similar at least in terms of structure/levels. So the debate between the advantages of one vs another is limited to its level of detail.
But as firms move into the data economy – with what this means for their sourcing, analyzing and leveraging data, I think that today’s maturity models for data governance are becoming less relevant – and even an impediment:
I met with a group of clients recently on the evolution of data management and big data. One retailer asked, “Are you seeing the business going to external sources to do Big Data?”
My first reaction was, “NO!” Yet, as I thought about it more and went back to my own roots as an analyst, the answer is most likely, “YES!”
Ignoring nomenclature, the reality is that the business is not only going to external sources for big data, but they have been doing it for years. Think about it; organizations that have considered data a strategic tool have invested heavily in big data going back to when mainframes came into vogue. More recently, banking, retail, consumer packaged goods, and logistics have marquis case studies on what sophisticated data use can do.
Before Hadoop, before massive parallel processing, where did the business turn? Many have had relationships with market research organizations, consultancies, and agencies to get them the sophisticated analysis that they need.
Think about the fact, too, that at the beginning of social media, it was PR agencies that developed the first big data analysis and visualization of Twitter, LinkedIn, and Facebook influence. In a past life, I worked at ComScore Networks, an aggregator and market research firm analyzing and trending online behavior. When I joined, they had the largest and fastest growing private cloud to collect web traffic globally. Now, that was big data.
Today, the data paints a split picture. When surveying IT across various surveys, social media and online analysis is a small percentage of business intelligence and analytics that is supported. However, when we look to the marketing and strategy clients at Forrester, there is a completely opposite picture.