I met with a group of clients recently on the evolution of data management and big data. One retailer asked, “Are you seeing the business going to external sources to do Big Data?”
My first reaction was, “NO!” Yet, as I thought about it more and went back to my own roots as an analyst, the answer is most likely, “YES!”
Ignoring nomenclature, the reality is that the business is not only going to external sources for big data but has been doing so for years. Think about it: organizations that consider data a strategic tool have invested heavily in big data going back to when mainframes came into vogue. More recently, banking, retail, consumer packaged goods, and logistics have produced marquee case studies on what sophisticated data use can do.
Before Hadoop, before massively parallel processing, where did the business turn? Many have had relationships with market research organizations, consultancies, and agencies to get the sophisticated analysis they need.
Think about the fact, too, that at the beginning of social media, it was PR agencies that developed the first big data analysis and visualization of Twitter, LinkedIn, and Facebook influence. In a past life, I worked at ComScore Networks, an aggregator and market research firm analyzing and trending online behavior. When I joined, they had the largest and fastest growing private cloud to collect web traffic globally. Now, that was big data.
Today, the data paints a split picture. Across various surveys of IT, social media and online analysis make up a small percentage of the business intelligence and analytics that IT supports. However, when we look to the marketing and strategy clients at Forrester, the picture is completely the opposite.
I recently had a client ask about MDM measurement for their customer master. In many cases, the discussions I have about measurement are about how to show that MDM has "solved world hunger" for the organization. In fact, a lot of the research and content out there is focused on just that. That is great for creating a business case for investment. It is not so good at helping with the daily management of master data and data governance. This client's question is more practical, touching upon:
what about the data do you measure?
how do you calculate?
how frequently do you report and show trends?
how do you link the calculation to something the business understands?
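To make the first two questions a little more concrete, here is a minimal sketch in Python of how a couple of common measures could be calculated on a customer master. Everything in it is an assumption for illustration: the records, the field names, and the two helper functions (`completeness` and `duplicate_rate`) are hypothetical, and a real program would pick measures and matching rules that fit its own data.

```python
from collections import Counter

# Hypothetical customer master records; field names are illustrative only.
records = [
    {"id": 1, "name": "Ann Lee", "email": "ann@example.com", "postal_code": "02139"},
    {"id": 2, "name": "Bob Roy", "email": "", "postal_code": "10001"},
    {"id": 3, "name": "Ann Lee", "email": "ANN@example.com ", "postal_code": None},
    {"id": 4, "name": "Cy Diaz", "email": "cy@example.com", "postal_code": "60601"},
]


def is_populated(value):
    """Treat None and blank strings as missing."""
    return bool(str(value or "").strip())


def completeness(rows, field):
    """Share of rows where the field is populated."""
    if not rows:
        return 0.0
    return sum(1 for r in rows if is_populated(r.get(field))) / len(rows)


def duplicate_rate(rows, field):
    """Share of populated values that repeat another value after
    trimming whitespace and lowercasing (a deliberately naive match)."""
    keys = [str(r[field]).strip().lower() for r in rows if is_populated(r.get(field))]
    if not keys:
        return 0.0
    extras = sum(count - 1 for count in Counter(keys).values())
    return extras / len(keys)


report = {
    "email_completeness": completeness(records, "email"),
    "postal_code_completeness": completeness(records, "postal_code"),
    "email_duplicate_rate": duplicate_rate(records, "email"),
}
print(report)
```

Running something like this on a schedule and keeping each report would give you the trend. One way to link it to the business is to restate a number in business terms: email completeness, for example, is roughly the share of customers an email campaign can reach.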
Joining in on the spirit of all the 2013 predictions, it seems that we shouldn't leave data quality out of the mix. Data quality may not be as sexy as big data has been this past year. The technology is mature and reliable. The concept is easy to understand. It is also one of the few areas in data management that has a recognized and adopted framework to measure success. (Read Malcolm Chisholm's blog on data quality dimensions.) However, maturity shouldn't create complacency. Data quality still matters, a lot.
Yet judgment day is here and data quality is at a crossroads. Its maturity in both technology and practice is steeped in an old way of thinking about and managing data. Data quality technology is firmly seated in the world of data warehousing and ETL. While that world is still a significant portion of the enterprise data management landscape, the adoption of in-memory, Hadoop, data virtualization, streams, etc. in business-critical applications and processes means that more and more data is bypassing the traditional platform.
The options to manage data quality are expanding, but not necessarily in a way that ensures that data can be trusted or complies with data policies. Where data quality tools have provided value is in offering a workbench to centrally monitor, create, and manage data quality processes and rules. They created sanity where ETL spaghetti created chaos and uncertainty. Today, this value proposition has diminished as data virtualization, Hadoop processes, and data appliances create and persist new data quality silos. On top of this, these data quality silos often lack the monitoring and measurement needed to govern data. In the end, do we have data quality? Or are we back where we started?
Security and privacy have always been at the core of data governance. Typically, company policies, processes, and procedures have been designed to comply with regulations to avoid fines and, in some cases, jail time. That is a very internal focus. However, companies now operate in a more external and connected fashion than ever before.
Let's consider this. Two stories in the news have recently exposed an aspect of data governance that muddies the water on our definition of data ownership and responsibility. After the tragedy at Sandy Hook Elementary School, the Journal News combined gun owner data with a map and released it to the public, causing speculation and outcry that it gave criminals information on where to get guns and put owners at risk. In a more recent posting of a similar nature, an MIT graduate student created an interactive map that lets you find individuals across the US and Canada, intended to help people feel a part of something bigger. My first reaction was to think this was a better stalker tool than social media.
Why is this game-changing for data governance and why should you care? It forces us to ask, even if a regulation is not hanging over our heads, what is the ethical use of data and what is the responsibility of businesses in using this data?
Technology is moving faster than policies and laws can be created to keep up. The owners of data more often than not will sit outside your corporate walls. Data governance has to take into account not only the interests of the company, but also the interests of the data owners. Data stewards have to be the trusted custodians of the data. Companies have to consider policies that benefit not only the corporate welfare but also the interests of customers and partners, or face reputational risk and potential loss of business.
The number one question I get from clients regarding their data strategy and data governance is, “How do I create a business case?”
This question is the kiss of death and here is why.
You created an IT strategy that emphasizes optimizing IT data management efforts, lowering total cost of ownership, and meeting the technical requirements to develop the platform. There may be a nod toward helping the business by highlighting improvements in data quality, consistency, and management of access and security, but only in broad, vague terms. The data strategy ended up looking more like an IT plan to execute data management.
This leaves the business asking, “So what? What is in it for me?”
Rethink your approach and think like the business:
· Change your data strategy to a business strategy. Recognize the strategy, objectives, and capabilities the business is looking for related to key initiatives. Your strategy should create a vision for how data will make these business needs a reality.
· Stop searching for the business case. The business case should already exist based on project requests at a line of business and executive level. Use the input to identify a strategy and solution that supports these requests.
· Avoid “shiny object syndrome”. As you keep up with emerging technology and trends, keep these new solutions and tools in context. There are more data integration, database, data governance, and storage options than ever before and one size does not fit all. Leverage your research to identify the right technology for business capabilities.
There was lots of feedback on the last blog (“Risk Data, Risky Business?”) that clearly indicates the divide between definitions of trust and quality. It is a great jumping-off point for the next hot topic: data governance for big data.
The comment I hear most from clients, particularly when discussing big data, is, “Data governance inhibits agility.” Why be hindered by committees and bureaucracy when you want freedom to experiment and discover?
Current thinking: Data governance is freedom from risk. The stakes are high when it comes to data-intensive projects, and having the right alignment between IT and the business is crucial. Data governance has been the gold standard to establish the right roles, responsibilities, processes, and procedures to deliver trusted secure data. Success has been achieved through legislative means by enacting policies and procedures that reduce risk to the business from bad data and bad data management project implementation. Data governance was meant to keep bad things from happening.
Today’s data governance approach is important and certainly has a place in the new world of big data. When data enters the inner sanctum of an organization, management needs to be rigorous.
Yet, the challenge is that legislative data governance by nature is focused on risk avoidance. Often this model is still IT led. This holds progress back as the business may be at the table, but it isn’t bought in. This is evidenced by committee and project management style data governance programs focused on ownership, scope, and timelines. All this management and process takes time and stifles experimentation and growth.
So, this blog is dedicated to stepping outside the comfort zone once again and into the world of chaos. Not only may you not want to persist in your data quality transformations, but you may not want to cleanse the data.
Current thinking: Purge poor data from your environment. Put the word “risk” in the same sentence as data quality and watch the hackles go up on data quality professionals. It is like using salt in your coffee instead of sugar. However, the biggest challenge I see many data quality professionals face is getting lost in all the data because they need to remove the risk that bad data poses to the business. In the world of big data, clearly you are not going to be able to cleanse all that data. A best practice is to identify the critical data elements that have the most impact on the business and focus efforts there. Problem solved.
Not so fast. Even scoping the data quality effort may not be the right way to go. The time and effort it takes as well as the accessibility of the data may not meet business needs to get information quickly. The business has decided to take the risk, focusing on direction rather than precision.
As the new analyst on the block at Forrester, the first question everyone is asking is, “What research do you have planned?” Just to show that I’m up for the task, rather than keeping it simple with a thoughtful report on data quality best practices or a maturity assessment on data management, I thought I’d go for broke and dive into the master data management (MDM) landscape. Some might call me crazy, but this is more than just the adrenaline rush that comes from doing such a project. In over 20 inquiries with clients in the past month, questions show increased sophistication in how managing master data can strategically contribute to the business.
What do I mean by this?
Number 1: Clients want to know how to bring together transactional data (structured) and content (semi-structured and unstructured) to understand the customer experience, improve customer engagement, and maximize the value of the customer. Understanding customer touch points across social media, e-commerce, customer service, and content consumption provides a single customer view that lets you customize your interactions and be highly relevant to your customer. MDM is at the heart of bringing this view together.
Number 2: Clients have begun to analyze big data within side projects as a way to identify opportunities for the business. This intelligence has reached the point that clients are now exploring how to distribute and operationalize these insights throughout the organization. MDM is the point that will align discoveries within the governance of master data for context and use.
Broadens the definition of metadata beyond “data on data” to include business rules, process models, application parameters, application rights, and policies.
Provides guidance to help evangelize to the business the importance of metadata, not by talking about metadata but by pointing out the value it provides against risks.
Recommends demonstrating to IT the transversality of metadata to IT internal siloed systems.
Advocates extending data governance to include metadata. The main impact of data governance should be to build the life cycle for metadata, but data governance evangelists reserve little concern for metadata at this point.
I will co-author the next document on metadata with Gene Leganza; this document will develop the next practice metadata architecture based partially but not only on a metadata exchange infrastructure. For a lot of people, metadata architecture is a Holy Grail. The upcoming document will demonstrate that metadata architecture will become an important step to ease the trend called “industrialization of IT,” sometimes also called “ERP for IT” or “Lean IT.”
In preparation for this upcoming document, please share with us your own experiences in bringing more attention to metadata.
The following question comes from many of our clients: what are some of the advantages and risks of implementing a vendor-provided analytical logical data model at the start of any business intelligence, data warehousing, or other information management initiative? Some quick thoughts on pros and cons:
Pros:
· Leverages vendor knowledge from prior experience and other customers
· May fill in the gaps in enterprise domain knowledge
· Best if your IT department does not have experienced data modelers
· May sometimes serve as a project, initiative, or solution accelerator
· May sometimes break through a stalemate between stakeholders failing to agree on metrics and definitions
Cons:
· May sometimes require more customization effort than building a model from scratch
· May create differences of opinion and potential roadblocks from your own experienced data modelers
· May reduce the competitive advantage of business intelligence and analytics (since competitors may be using the same model)
· Goes against “agile” BI principles that call for small, quick, tangible deliverables
· Goes against top-down performance management design and modeling best practices, where one does not start with a logical data model but rather:
  · Defines departmental and line-of-business strategies
  · Links goals and objectives needed to fulfill these strategies
  · Defines metrics needed to measure progress against goals and objectives
  · Defines strategic, tactical, and operational decisions that need to be made based on metrics