MDM tools today don't look like your father's MDM. No longer an integration hub between applications and DBMSs, today's tools are in transition or have already reinvented MDM to handle the context missing from traditional system implementations. Visualizations, graph repositories, big data and cloud scale, along with application-like interfaces for nontechnical users, mean MDM and master data get personal with stakeholders.
Semantics and insight are not an outcome of MDM but an integrated part of the engine and hub. Three MDM evolutions stand out:
Business-defined views of data: For graph-based vendors such as Reltio and Pitney Bowes, master domains are shaped by business use cases. For example, customer master can be defined beyond the bounds of a household, identity, and account. Customer behavioral characteristics can be the starting points for taxonomies and hierarchies. Integration of master domains is based on physical, logical, linkage, and semantic schemas, which allows more seamless navigation and querying of master data and keeps pace with the explosion of data views created by analytics, applications, and microservices.
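To make the idea of a business-defined view concrete, here is a minimal sketch of a master data graph in plain Python. It is not how Reltio, Pitney Bowes, or any other vendor implements its hub; the `MasterGraph` class, the relation names (`member_of`, `owns`, `exhibits`), and the sample records are all hypothetical, chosen only to show a customer view that starts from a behavior rather than from a household or account.

```python
from collections import defaultdict


class MasterGraph:
    """Hypothetical in-memory graph: nodes are master entities, edges are typed links."""

    def __init__(self):
        self.nodes = {}                 # node id -> {"domain": ..., other attributes}
        self.edges = defaultdict(list)  # node id -> [(relation, target id), ...]

    def add_node(self, node_id, domain, **attrs):
        self.nodes[node_id] = {"domain": domain, **attrs}

    def link(self, source, relation, target):
        # Store both directions so a view can start from any master domain.
        self.edges[source].append((relation, target))
        self.edges[target].append((f"inverse:{relation}", source))

    def neighbors(self, node_id, relation):
        return [t for r, t in self.edges[node_id] if r == relation]


def customers_by_behavior(graph, behavior_id):
    """Business-defined view: begin at a behavior node, then fan out to customers."""
    view = []
    for customer_id in graph.neighbors(behavior_id, "inverse:exhibits"):
        view.append({
            "customer": graph.nodes[customer_id]["name"],
            "households": graph.neighbors(customer_id, "member_of"),
            "accounts": graph.neighbors(customer_id, "owns"),
        })
    return sorted(view, key=lambda row: row["customer"])


# Illustrative sample data (assumed, not from any real system).
graph = MasterGraph()
graph.add_node("c1", "customer", name="Ana")
graph.add_node("c2", "customer", name="Ben")
graph.add_node("h1", "household")
graph.add_node("a1", "account")
graph.add_node("a2", "account")
graph.add_node("b1", "behavior", label="frequent online returner")

graph.link("c1", "member_of", "h1")
graph.link("c2", "member_of", "h1")
graph.link("c1", "owns", "a1")
graph.link("c2", "owns", "a2")
graph.link("c1", "exhibits", "b1")

view = customers_by_behavior(graph, "b1")
print(view)
```

The same graph could answer a household-first or account-first question by starting the traversal from a different node, which is the point of shaping the master domain around the use case instead of a fixed hierarchy.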
Last week, I participated in a roundtable during a conference in Paris organized by the French branch of DAMA, the data management international organization. During the question/answer part of the conference, it became clear that most of the audience was confusing data management with data governance (DG). This is a challenge my Forrester colleague Michele Goetz identified early in the DG tooling space. Because data quality and master data management embed governance features, many view them as data governance tooling. But the reality is that they remain data management tooling — their goal is to improve data quality by executing rules. This tooling confusion is only a consequence of how much the word governance is misused and misunderstood, and that leads to struggling data governance efforts.
So what is “governance”? Governance is the collaboration, organization, and metrics facilitating a decision path between at least two conflicting objectives. Governance is finding the acceptable balance between the interests of two parties. For example, IT governance is needed when you would like to support all possible business projects but you have limited budget, skills, or resources available. Governance is needed when objectives are different for different stakeholders, and the outcome of governance is that they do not get the same priority. If everyone has the same objective, then this is data management.
You can't bring up semantics without someone inserting an apology for the geekiness of the discussion. If you're a data person like me, geek away! But for everyone else, it's a topic best left alone. Well, like every geek, the semantic geeks now have their day — and may just rule the data world.
It begins with a seemingly innocent set of questions:
"Is there a better way to master my data?"
"Is there a better way to understand the data I have?"
"Is there a better way to bring data and content together?"
"Is there a better way to personalize data and insight to be relevant?"
Semantics discussions today are born out of the data chaos that our traditional data management and governance capabilities are struggling under. They're born out of the fact that even with the best big data technology and analytics being adopted, business stakeholder satisfaction with analytics has decreased by 21% from 2014 to 2015, according to Forrester's Global Business Technographics® Data And Analytics Survey, 2015. Innovative data architects and vendors realize that semantics is the key to bringing context and meaning to our information so we can extract those much-needed business insights, at scale, and more importantly, personalized.
First there was Hadoop. Then there were data scientists. Then came Agile BI on big data. Drum roll, please . . . bum, bum, bum, bum . . .
Now we have data preparation!
If you are as passionate about data quality and governance as I am, then the five-plus-year wait for a scalable capability to take on data trust is amazingly validating. The era of "good enough" when it comes to big data is giving way to an understanding that the way analysts have gotten away with "good enough" was through a significant amount of manual data wrangling. As an analyst, it must have felt like your parents saying you can't see your friends and play outside until you've cleaned your room (and if it's anything like my kids' rooms, that's a tall order).
There is no denying that analysts are the first to benefit from data preparation tools such as Alteryx, Paxata, and Trifacta. It's a matter of time to value for insight. What is still unrecognized in the broader data management and governance strategy is that these early forays are laying the foundation for data citizenry and the cultural shift toward a truly data-driven organization.
Today's data reality is that consumers of data are like any other consumers; they want to shop for what they need. This data consumer journey begins by looking in their own spreadsheets, databases, and warehouses. When they can't find what they want there, data consumers turn to external sources such as partners, third parties, and the Web. Defining the value of that data, and ultimately deciding whether to procure it and possibly pay for it, is what data preparation tools help with. The other outcome of this data-shopping experience is that they are taking on the risk and accountability for the value of the data as it is introduced into analysis, decision-making, and automation.
Since when did data management and data governance become interchangeable?
This is a question that has both confounded and frustrated me. Business units play an increasing role in decision-making and hold the purse strings for technology purchases, so data management vendors have chased a connection with business stakeholders. Along the way, data governance as a term was hijacked to snuff out the bad taste of IT data projects gone sour.
The funny thing is, vendors actually began drinking their own marketing Kool-Aid and now think of their MDM, quality, security, and lifecycle management products as data governance tools/solutions. Storage and virtualization vendors are even starting to glom on to this, claiming they govern data. Big data vendors jumped over data management altogether and just call their catalogs, security, and lineage capabilities data governance.
Yes, this is a pet peeve of mine - just as data integration is now called blending, and data cleansing and transformation is now called wrangling or data preparation. But more on that in another blog...
First, you (vendor or data professional) cannot simply sweep the history of legacy data investments that were limited in results and painful to implement under the Mad Men carpet. Own it and address the challenges through technology innovation rather than words.
You’ve heard it before but we said it again – this time in our recent webinar. There's a new kid in town: the chief data officer. Why the new role? Because of an increasing awareness of the value of data and the painful recognition of an inability to take advantage of the opportunities that it provides — due to technology, business, or basic cultural barriers. That was the topic of our webinar presented to a full house a few days ago; we discussed our recent report, Top Performers Appoint Chief Data Officers. Fortunately for those who weren’t there, the presentation – Chief Data Officers Cross The Chasm – is available (to clients) for download.
As the title suggests, chief data officers are no longer just for the early adopters, those enthusiasts and visionaries on the forefront of new technology trends. With 45% of global companies having appointed a chief data officer (not to be confused with a chief digital officer, as we specifically asked about “data”) and another 16% planning to make an appointment in the next 12 months, according to Forrester's Business Technographics surveys, the role of the chief data officer really has moved into the mainstream.
However, there remain many companies that are not sure whether they need a CDO. Many of those in our audience fell into that category. We asked two questions of the audience to gauge their interest and their actions to improve their data maturity:
Are you making organizational changes specifically to improve your data capabilities?
Gene Leganza and I just published a report on the role of the Chief Data Officer that we’re hearing so much about these days – Top Performers Appoint Chief Data Officers. To introduce the report, we sat down with our press team at Forrester to talk about the findings, and the implications for our clients.
Forrester PR: There's a ton of fantastic data in the report around the CDO. If you had to call out the most surprising finding, what would top your list?
Gene: No question it's the high correlation between high-performing companies and those with CDOs. Jennifer and I both feel that strong data capabilities are critical for organizations today and that the data agenda is quite complex and in need of strong leadership. That all means that it's quite logical to expect a correlation between strong data leadership and company performance - but given the relative newness of the CDO role it was surprising to see firm performance so closely linked to the role.
Of course, you can't infer cause and effect from correlation – the data could mean that execs in high-performing companies think having a CDO role is a good idea as much as it could mean CDOs are materially contributing to high performance. Either way that single statistic should make one take a serious look at the role in organizations without clear data leadership.
When I think about data, I can't help but think about hockey. As a passionate hockey mom, it's hard to separate my conversations about data all week with clients from the practices and games I sit through, screaming encouragement to my son and his team (sometimes to the embarrassment of my husband!). So when I recently saw a documentary on the building of the Russian hockey team that our miracle US hockey team beat at the 1980 Olympics, the story of Anatoli Tarasov stuck with me.
Before the 1960s, Russia didn't have a hockey team. Then the Communist party determined that it was critical that Russia build one and compete on the world stage. They selected Anatoli Tarasov to build the team and coach. He couldn't see films on hockey. He couldn't watch teams play. There was no reference on how to play the game. And yet, he built a world-class hockey club that not only beat the great Nordic teams but went on to crush the Canadian teams that were the standard for hockey excellence.
This is a lesson for us all when it comes to data. Do we stick with our standards and recipes from Inmon and Kimball? Do we follow check-box assessments from CMMI, DMBOK, or TOGAF's information architecture framework? Do we rely on governance compliance to police our data?
Or do we break the rules and create our own that are based on outcomes and results? This might be the scarier path. This might be the riskier path. But do you want data to be where your business needs it, or do you want to predefine, constrain, and bias the insight?
It is easy to get ahead of ourselves with all the innovation happening with data and analytics. I wouldn't call it hype, as that would imply no value or competency has been achieved. But I would say that what is bright, shiny, and new is always more interesting than the ordinary.
And, to be frank, there is still a lot of ordinary in our data management world.
In fact, over the past couple of weeks, discussions with companies have been unusually focused on the ordinary. The questions covered the basic foundational aspects of data management and governance, and they came from companies that I have seen talk publicly about their data management successes.
"Where do I clean the data?"
"How do I get the business to invest in data?"
"How do I get a single customer view of my customer for marketing?"
What this tells me is that companies are under siege by zombie data.
Data is living in our business under outdated data policies and rules. Data processes and systems are persisting single-purpose data. As data pros turn over application rocks and navigate through the database bogs to centralize data for analytics and virtualize views for new data capabilities, zombie data is lurching out to consume more of the environment, blocking other potential insight to keep the status quo.
The questions you and your data professional cohorts are asking, as illustrated above, are anything but basic. The fact that these foundational building blocks have to be assessed once again demonstrates that organizations are on a path to crush the zombie data siege, democratize data and insight, and advance the business.
Keep asking basic questions — if you aren't, zombie data will eventually take over, and you and your organization will become part of the walking dead.
Big data and Hadoop (Yellow Elephants) are so synonymous that you can easily overlook the vast landscape of architecture that goes into delivering on big data value. Data scientists (Pink Unicorns) are also raised to god status as the only real role that can harness the power of big data -- making insights from big data seem as far away as a manned journey to Mars. However, this week, as I participated at the DGIQ conference in San Diego and colleagues and friends attended the Hadoop Summit in Belgium, it has become apparent that organizations are waking up to the fact that there is more to big data than a "cool" playground for the privileged few.
The perspective that the insight supply chain is the driver and catalyst of actions from big data is starting to take hold. Capital One, for example, illustrated that if insights from analytics and data from Hadoop are going to influence operational decisions and actions, you need the same degree of governance as you established in traditional systems. In a conversation, Amit Satoor of SAP Global Marketing described a performance apparel company linking big data to operational and transactional systems at the edge of customer engagement, and noted that it had to be easy for application developers to implement.
Hadoop distribution, NoSQL, and analytic vendors need to step up the value proposition to be more than where the data sits and how sophisticated you can get with the analytics. In the end, if you can't govern quality, security, and privacy at the scale of edge end-user and customer engagement scenarios, those efforts to migrate data to Hadoop and the investment in analytic tools cost more than dollars; they cost you your business.