Data management is becoming critical as organizations seek to better understand and target their customers, drive out inefficiency, and satisfy government regulations. Despite this, the maturity of data management practices at companies in China is generally poor.
I had an enlightening conversation with my colleague, senior analyst Michele Goetz, who covers all aspects of data management. She told me that in North America and Europe, data management maturity varies widely from company to company; only about 5% have mature practices and a robust data management infrastructure. Most organizations are still struggling to be agile and lack measurement, even if they already have data management platforms in place. Very few of them align adequately with their specific business or information strategy and organizational structure.
If we look at data management maturity in China, I suspect the results are even worse: that fewer than 1% of the companies are mature in terms of integrated strategy, agile execution and continuous performance measurement. Specifically:
The practice of data management is still in the early stages. Data management is not just about deploying technology such as data warehousing or related middleware; it also means putting in place the strategy and architectural practice, including contextual services and metadata pattern modeling, to align with the business focus. Chinese enterprises currently focus their data management efforts mostly on data warehousing, master data management, and basic support for both end-to-end business processes and composite applications for top management decision-making. They are still far from leveraging the valuable data in business processes and business analytics.
William Shakespeare wrote that “What’s past is prologue.” Big data surely builds on our rich past of using data to understand our world, our customers, and ourselves. Now the world is flush and getting flusher in big data from cloud, mobile, and the Internet of things. What does it mean for enterprises? In a word: opportunity. Firms have taken to big data. Here are my four predictions for key enterprise big data themes in 2013:
Firms will realize that “big data” means all of their data. Big data is the frontier of a firm’s ability to store, process, and access (SPA) all of the data it needs to operate effectively, make decisions, reduce risks, and create better customer experiences. The key word in the definition of big data is frontier. Many think that big data is only about data stored in Hadoop. Not true. Big data is not defined by how it is stored. It can and will continue to reside in all kinds of data architectures, including enterprise data warehouses, application databases, file systems, cloud storage, Hadoop, and others. By the way, some predict the end of the data warehouse — but that’s nonsense. If anything, all forms of data technology will evolve and be necessary to handle the frontier of big data. In 2013, all data is big data.
Rowan Curran, Research Associate and TechnoPolitics producer, hosts this episode to ask me (your regular host) about The Pragmatic Definition Of Big Data. Listen (5 mins) to hear the genesis of this new definition of big data and why it is pragmatic and actionable for both business and IT professionals.
Podcast: The Pragmatic Definition Of Big Data Explained (5 mins)
It looks like EMC has finally admitted it needs a better approach for courting developers and is doing something significant to fix this. No longer will key assets like Greenplum, Pivotal, or Spring flounder in a corporate culture dominated by infrastructure thinking and selling.
After months of rumors about a possible spin-out going unaddressed, EMC pulled the trigger today, asking Terry Anderson, its VP of Corporate Communications, to put out an official acknowledgement on one of its blogs (a stealthy, investor-relations-centric move) of its plans to aggregate its cloud and big data assets and give them concentrated focus. It didn't officially announce a spin-out or even the creation of a new division. Nor did it clearly identify the role former VMware CEO Paul Maritz will play in this new gathering. But it did clarify what assets would be pushed into this new group:
Big data is driving disruptive change across the economy in business such as healthcare, retail, communications, and entertainment. The potential for firms to use big data to create permanent relationships with customers is huge, and the time to get onboard is now. I was thrilled to be featured in the first episode on a new series, Big Thinkers In Big Data, hosted by TWit network's Sarah
In the face of rising data volume and complexity and increased need for self-service, enterprises need an effective business intelligence (BI) reference architecture to utilize BI as a key corporate asset for competitive differentiation. BI stakeholders — such as project managers, developers, data architects, enterprise architects, database administrators, and data quality specialists — may find the myriad choices and constant influx of new business requirements overwhelming. Forrester's BI reference architecture provides a framework with architectural patterns and building blocks to guide these BI stakeholders in managing BI strategy and architecture.
Enterprise information management (EIM) is complex — from a technical, organizational, and operational standpoint. But to business users, all that complexity is behind the scenes. What they need is BI, an interface to enterprise data — whether it's structured, semistructured, or unstructured. Our June 2011 Global Technology Trends Online Survey showed that BI topped even mobility — the frontrunner in recent years — as the technology most likely to provide business value over the next three years.
A key role of IT operations is to keep a complex portfolio of applications running and performing. "Traditional monitoring dashboards generate lots of pretty charts and graphs but don't really tell IT operations professionals a whole lot," says Forrester Principal Analyst Glenn O'Donnell. Big data analytics will change that because sophisticated algorithms can "look for the little tremors that tell us something big is about to happen."
High Availability And Performance Are Top Goals For IT Ops
Asked what 5 nines (99.999%) of availability means, Glenn replies immediately, "5 nines of availability is 26 seconds of downtime per month." He adds "If you want to capture just one 26 second event, you have to be polling every 13 seconds." Glenn knows his stuff. Listen to find out from Glenn how big data has a big place in the future of IT operations.
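Glenn's figures can be checked with a little arithmetic. The following is a minimal sketch, assuming a 30-day month and taking his rule of thumb (poll at half the duration of the shortest event you want to catch) at face value; the function names are hypothetical and chosen only for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_seconds(nines: int, days: int = 30) -> float:
    """Allowed downtime in seconds over `days` for an availability of N nines.

    Assumption: a 30-day month by default; 5 nines means 99.999% availability.
    """
    unavailability = 10 ** -nines  # e.g. 5 nines -> 0.00001
    return unavailability * days * SECONDS_PER_DAY


def max_polling_interval(event_seconds: float) -> float:
    """Longest polling interval that still samples an event of this length,
    following the rule of thumb quoted above (half the event duration)."""
    return event_seconds / 2


if __name__ == "__main__":
    budget = downtime_seconds(5)
    print(f"5 nines, 30-day month: {budget:.2f} s of downtime")
    print(f"Polling interval for a 26 s event: {max_polling_interval(26):.0f} s")
```

Under these assumptions the monthly budget comes to just under 26 seconds, which matches the figure quoted above, and a 26-second event calls for polling at least every 13 seconds.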
There was lots of feedback on the last blog (“Risk Data, Risky Business?”) that clearly indicates the divide between definitions of trust and quality. It is a great jumping-off point for the next hot topic, data governance for big data.
The comment I hear most from clients, particularly when discussing big data, is, “Data governance inhibits agility.” Why be hindered by committees and bureaucracy when you want freedom to experiment and discover?
Current thinking: Data governance is freedom from risk. The stakes are high when it comes to data-intensive projects, and having the right alignment between IT and the business is crucial. Data governance has been the gold standard to establish the right roles, responsibilities, processes, and procedures to deliver trusted secure data. Success has been achieved through legislative means by enacting policies and procedures that reduce risk to the business from bad data and bad data management project implementation. Data governance was meant to keep bad things from happening.
Today’s data governance approach is important and certainly has a place in the new world of big data. When data enters the inner sanctum of an organization, management needs to be rigorous.
Yet, the challenge is that legislative data governance by nature is focused on risk avoidance. Often this model is still IT led. This holds progress back as the business may be at the table, but it isn’t bought in. This is evidenced by committee and project management style data governance programs focused on ownership, scope, and timelines. All this management and process takes time and stifles experimentation and growth.
Every year the Center For Digital Strategies at Tuck chooses a technology topic to "provide MBA candidates and the Tuck and Dartmouth communities with insights into how changes in technology affect individuals, impact enterprises and reshape industries." This academic year the topic is "Big Data: The Information Explosion That Will Reshape Our World". I had the honor and privilege to kick off the series about big data at the Tuck School of Business at Dartmouth. I am thrilled that our future business leaders are considering how big data can help companies, communities, and government make smarter decisions and provide better customer experiences. The combination of big data and predictive analytics is already changing the world. Below is the edited video of my talk on big data predictive analytics at Tuck in Hanover, NH.
There's certainly a lot of hype out there about big data. As I previously wrote, some of it is indeed hype, but there are still many legitimate big data cases - I saw a great example during my last business trip. Hadoop certainly plays a key role in the big data revolution, so all business intelligence (BI) vendors are jumping on the bandwagon and saying that they integrate with Hadoop. But what does that really mean? First of all, Hadoop is not a single entity; it's a conglomeration of multiple projects, each addressing a certain niche within the Hadoop ecosystem, such as data access, data integration, DBMS, system management, reporting, analytics, data exploration, and much, much more. To lift the veil of hype, I recommend that you ask your BI vendors the following questions:
Which specific Hadoop projects do you integrate with (HDFS, Hive, HBase, Pig, Sqoop, and many others)?
Do you work with the community edition software or with commercial distributions from MapR, EMC/Greenplum, Hortonworks, or Cloudera? Have these vendors certified your Hadoop implementations?
Do you have tools or utilities to help the client get data into Hadoop in the first place (see comment from Birst)?
Are you querying Hadoop data directly from your BI tools (reports, dashboards) or are you ingesting Hadoop data into your own DBMS? If the latter:
Are you selecting Hadoop result sets using Hive?
Are you ingesting Hadoop data using Sqoop?
Is your ETL generating and pushing down MapReduce jobs to Hadoop? Are you generating Pig scripts?