A frequent question I get from data management and governance teams is how to stay ahead of or on top of the Agile development process that app dev pros swear by. New capabilities are spinning out faster and faster, with little attention paid to compliance with data standards and policies.
Well, if you can't beat them, join them . . . and that's what your data management pros are doing, jumping into Agile development for data.
Forrester's survey of 118 organizations shows that just a little over half of organizations have implemented Agile development in some manner, shape, or form to deliver on data capabilities. While they lag about one to two years behind app dev's adoption, the results are already beginning to show in terms of getting a better handle on their design and architectural decisions, improved data management collaboration, and better alignment of developer skills to tasks at hand.
But we have a long way to go. The first reason teams adopt Agile development is to speed up the release of data capabilities. And that is exactly the problem: in the interest of speed, the key value of Agile development, quality, gets overlooked. So, while data management is getting it done, they may be sacrificing the value new capabilities are bringing to the business.
Let's take an example. Where Agile makes sense to start is where teams can quickly spin up data models and integration points in support of analytics. Unfortunately, this capability delivery may be restricted to a small group of analysts that need access to data. Score "1" for moving a request off the list, score "0" for scaling insights widely to where action will be taken quickly.
Since when did data management and data governance become interchangeable?
This is a question that has both confounded and frustrated me. The pursuit of data management vendors to connect with business stakeholders, because of the increasing role business units have had in decision making and holding the purse strings to technology purchases, means data governance as a term was hijacked to snuff out the bad taste of IT data projects gone sour.
The funny thing is, vendors actually began drinking their own marketing Kool-Aid and now think of their MDM, quality, security, and lifecycle management products as data governance tools/solutions. Storage and virtualization vendors are even starting to glom on to this, claiming they govern data. Big data vendors jumped over data management altogether and just call their catalogs, security, and lineage capabilities data governance.
Yes, this is a pet peeve of mine - just as data integration is now called blending, and data cleansing and transformation is now called wrangling or data preparation. But more on that in another blog...
First, you (vendor or data professional) cannot simply sweep the history of legacy data investments that were limited in results and painful to implement under the Mad Men carpet. Own it and address the challenges through technology innovation rather than words.
You’ve heard it before but we said it again – this time in our recent webinar. There's a new kid in town: the chief data officer. Why the new role? Because of an increasing awareness of the value of data and the painful recognition of an inability to take advantage of the opportunities that it provides — due to technology, business, or basic cultural barriers. That was the topic of our webinar presented to a full house a few days ago; we discussed our recent report, Top Performers Appoint Chief Data Officers. Fortunately for those who weren’t there, the presentation – Chief Data Officers Cross The Chasm – is available (to clients) for download.
As the title suggests, chief data officers are no longer just for the early adopters – those enthusiasts and visionaries on the forefront of new technology trends. With 45% of global companies having appointed a chief data officer (not to be confused with a chief digital officer, as we specifically asked about “data”) and another 16% planning to make an appointment in the next 12 months – according to Forrester's Business Technographics surveys, the role of the chief data officer really has moved into the mainstream.
However, there remain many companies that are not sure whether they need a CDO. Many of those in our audience fell into that category. We asked two questions of the audience to gauge their interest and their actions to improve their data maturity:
Are you making organizational changes specifically to improve your data capabilities?
Gene Leganza and I just published a report on the role of the Chief Data Officer that we’re hearing so much about these days – Top Performers Appoint Chief Data Officers. To introduce the report, we sat down with our press team at Forrester to talk about the findings, and the implications for our clients.
Forrester PR: There's a ton of fantastic data in the report around the CDO. If you had to call out the most surprising finding, what would top your list?
Gene: No question it's the high correlation between high-performing companies and those with CDOs. Jennifer and I both feel that strong data capabilities are critical for organizations today and that the data agenda is quite complex and in need of strong leadership. That all means that it's quite logical to expect a correlation between strong data leadership and company performance - but given the relative newness of the CDO role it was surprising to see firm performance so closely linked to the role.
Of course, you can't infer cause and effect from correlation – the data could mean that execs in high-performing companies think having a CDO role is a good idea as much as it could mean CDOs are materially contributing to high performance. Either way that single statistic should make one take a serious look at the role in organizations without clear data leadership.
When I think about data, I can't help but think about hockey. As a passionate hockey mom, it's hard to separate my conversations about data all week with clients from the practices and games I sit through, screaming encouragement to my son and his team (sometimes to the embarrassment of my husband!). So when I recently saw a documentary on the building of the Russian hockey team that our miracle US hockey team beat at the 1980 Olympics, the story of Anatoli Tarasov stuck with me.
Before the 1960s, Russia didn't have a hockey team. Then the Communist party determined that it was critical that Russia build one and compete on the world stage. They selected Anatoli Tarasov to build the team and coach it. He couldn't see films on hockey. He couldn't watch teams play. There was no reference on how to play the game. And yet, he built a world-class hockey club that not only beat the great Nordic teams but went on to crush the Canadian teams that were the standard for hockey excellence.
This is a lesson for us all when it comes to data. Do we stick with our standards and recipes from Inmon and Kimball? Do we follow check-box assessments from CMMI, DM-BOK, or TOGAF's information architecture framework? Do we rely on governance compliance to police our data?
Or do we break the rules and create our own that are based on outcomes and results? This might be the scarier path. This might be the riskier path. But do you want data to be where your business needs it, or do you want to predefine, constrain, and bias the insight?
It is easy to get ahead of ourselves with all the innovation happening with data and analytics. I wouldn't call it hype, as that would imply no value or competency has been achieved. But I would say that what is bright, shiny, and new is always more interesting than the ordinary.
And, to be frank, there is still a lot of ordinary in our data management world.
In fact, over the past couple of weeks, my discussions with companies have focused to an unusual degree on the ordinary. This surprised me because the questions covered the basic, foundational aspects of data management and governance, and they came from companies that I have seen talk publicly about their data management successes.
"Where do I clean the data?"
"How do I get the business to invest in data?"
"How do I get a single customer view of my customer for marketing?"
What this tells me is that companies are under siege by zombie data.
Data is living in our business under outdated data policies and rules. Data processes and systems are persisting single-purpose data. As data pros turn over application rocks and navigate through the database bogs to centralize data for analytics and virtualize views for new data capabilities, zombie data is lurching out to consume more of the environment, blocking other potential insight to keep the status quo.
The questions you and your data professional cohorts are asking, as illustrated above, are anything but basic. The fact that these foundational building blocks have to be assessed once again demonstrates that organizations are on a path to crush the zombie data siege, democratize data and insight, and advance the business.
Keep asking basic questions — if you aren't, zombie data will eventually take over, and you and your organization will become part of the walking dead.
It’s not news that business user self-service for access to information and analytics is hot. What might not be as obvious is the overhaul of information-related roles that is happening now as a result. What’s driving this? The hunger for data (big, fast, and otherwise) to feed insights, very popular data visualization tools, and new but rapidly spreading technology that puts sophisticated data exploration and manipulation tools in the hands of business users.
One impact is that classic tech management functions such as data modeling and data integration are moving into business-side roles. I can’t help but be reminded of Bill Murray’s apocalyptic vision from “Ghostbusters”: “Dogs and cats, living together… mass hysteria!” Is this the end of rational, orderly data management as we know it? Haven’t central tech management organizations always seen business-side tech decision-making (and purchasing, and implementation) as “rogue” behavior that needed to be governed out of existence? If organizations have trouble now keeping data for analytics at the right level of quality in data warehouses, won’t all this introduction of new data sources and data lakes and whatnot just make things worse?
Well, my answers are “no,” “yes,” and “no” in that order. The big changes that are afoot are not the end of order, and even though “business empowerment” translates to “rogue IT” in some circles, data lakes/hubs and the infusion of third-party data have actually been delivering on their promise of faster, better business insights for the organizations doing it right.
When you hear the term fast data, your first thought is probably the velocity of the data. That is not unusual in the realm of big data, where velocity is one of the V's everyone talked about. However, fast data encompasses more than a data characteristic; it is about how quickly you can get and use insight.
While working with Noel Yuhanna on an upcoming report on how to develop your data management roadmap, we found that speed was a recurring theme. Clients consistently call out a lack of speed as what holds them back. How they interpret what speed means is the crux of the issue.
Technology management thinks about how quickly data is provisioned. The solution is a faster engine - in-memory grids like SAP HANA become the tool of choice. This is the wrong way to think about it. Simply serving up data with faster integration and a high-performance platform is what we have always done - better box, better integration software, better data warehouse. Why use the same solution that in a year or two runs against the same wall?
The other side of the equation is that sending data out faster ignores what business stakeholders and analytics teams want. Speed to the business encompasses self-service data acquisition, faster deployment of data services, and faster changes. The reason: they need to act on the data and insights.
The right strategy is to create a vision that orients toward business outcomes. Today's reality is that we live in a world where it is no longer about first to market; we have to be about first to value. First to value with our customers, and first to value with our business capabilities. The speed at which insights are gained and ultimately how they are put to use is your data management strategy.
By now you have at least seen the cute little elephant logo or you may have spent serious time with the basic components of Hadoop like HDFS, MapReduce, Hive, Pig and most recently YARN. But do you have a handle on Kafka, Rhino, Sentry, Impala, Oozie, Spark, Storm, Tez… Giraph? Do you need a Zookeeper? Apache has one of those too! For example, the latest version of Hortonworks Data Platform has over 20 Apache packages and reflects the chaos of the open source ecosystem. Cloudera, MapR, Pivotal, Microsoft and IBM all have their own products and open source additions while supporting various combinations of the Apache projects.
After hearing the confusion between Spark and Hadoop one too many times, I was inspired to write a report, The Hadoop Ecosystem Overview, Q4 2014. For those that have day jobs that don’t include constantly tracking Hadoop evolution, I dove in and worked with Hadoop vendors and trusted consultants to create a framework. We divided the complex Hadoop ecosystem into a core set of tools that all work closely with data stored in the Hadoop file system and an extended group of components that leverage but do not require it.
In the past, enterprise architects could afford to think big picture and that meant treating Hadoop as a single package of tools. Not anymore – you need to understand the details to keep up in the age of the customer. Use our framework to help, but please read the report if you can as I include a lot more there.
When it comes to data technology, are you lost in translation? What's the difference between data federation, virtualization, and data or information-as-a-service? Are columnar databases also relational? Does one use the same or different tools for BAM (Business Activity Monitoring) and for CEP (Complex Event Processing)? These questions are just the tip of the iceberg of a plethora of terms and definitions in the rich and complex world of enterprise data and information. Enterprise application developers and data and information architects already manage multiple challenges on a daily basis, and the last thing they need is to deal with misunderstandings of the various data technology component definitions.