When you hear the term "fast data," your first thought is probably the velocity of the data. That's not unusual in the realm of big data, where velocity is one of the V's everyone talks about. However, fast data encompasses more than a data characteristic: it is about how quickly you can get and use insight.
Working with Noel Yuhanna on an upcoming report on how to develop your data management roadmap, we found that speed was a recurring goal. Clients consistently call out speed as what holds them back. The crux of the issue is how they interpret what speed means.
Technology management thinks about how quickly data is provisioned. The solution is a faster engine, and in-memory grids like SAP HANA become the tool of choice. This is the wrong way to think about it. Simply serving up data with faster integration and a high-performance platform is what we have always done: a better box, better integration software, a better data warehouse. Why use the same solution that, in a year or two, runs up against the same wall?
The other side of the equation is that simply sending data out faster ignores what business stakeholders and analytics teams want. Speed to the business encompasses self-service data acquisition, faster deployment of data services, and faster changes. The reason: they need to act on the data and insights.
The right strategy is to create a vision oriented toward business outcomes. Today's reality is that it is no longer about being first to market; we have to be first to value: first to value with our customers, and first to value with our business capabilities. The speed at which insights are gained, and ultimately how they are put to use, is your data management strategy.
At the China Hadoop Summit 2015 in Beijing this past weekend, I talked with various big data players, including large consumers of big data such as China Unicom, Baidu.com, JD.com, and Ctrip.com; Hadoop platform solution providers Hortonworks, RedHadoop, BeagleData, and Transwarp; infrastructure software vendors like Sequotia.com; and Agile BI software vendors like Yonghong Tech.
The summit was well-attended — organizers planned for 1,000 attendees and double that number attended — and from the presentations and conversations it’s clear that big data ecosystems are making substantial progress. Here are some of my key takeaways:
Telcos are focusing on optimizing internal operations with big data. Take China Unicom, one of China’s three major telcos, for example. China Unicom has completed a comprehensive business scenario analysis of related data across each segment of internal business operations, including business and operations support systems, Internet data centers, and networks (fixed, mobile, and broadband). It has built a Hadoop-based big data platform to process trillions of mobile access records every day within the mobile network to provide practical guidelines and progress monitoring on the construction of base stations.
The Information Governance report is out! Over the last few months I've test-driven my thoughts and data with many vendors and enterprise customers via conference presentations, webinars, and one-on-one discussions. For those who have shared their thoughts with me - thank you! Forrester subscribers can access it here.
Information Governance is red hot right now, and it was time for Forrester to contribute a view on how IG helps companies meet their core corporate missions. Of course, better, more consistent fulfillment of compliance obligations is essential, but so are objectives such as customer service, revenue growth, and improved agility in oh-so-competitive markets. IG is not just about getting rid of junk content; it is, more importantly, about instilling trust in the data and communications we use to run our businesses.
Software is not a silver bullet for information governance. Look beyond vendor hype: IG is not something to go buy so you can say your company has it. Look at IG as an evergreen corporate objective, enabled by programs, policies, people, and, yes, a range of technologies.
I'm going to add to the IG definition war this week by describing information governance as:
Since the dawn of big data, data quality and data governance professionals have been shouting from the rooftops about the impact of dirty data. Data scientists are yelling back just as loudly that "good enough" data is the new reality. Data trust has turned relative.
Consider these data points from Forrester's recent Business Technographics Survey on Data and Analytics and our Online Global Survey on Data Quality and Trust:
Nearly 9 out of 10 data professionals rate data quality as a very important or important aspect of information governance
43% of business and technology management professionals are somewhat confident in their data, and 25% are concerned
The cloud market in China is changing fast. The official launch of commercial operations for Microsoft Azure earlier this year opened a new chapter (as detailed in my March blog post), while last weekend’s Amazon Web Services (AWS) summit, held in China for the first time, marked the third episode of this war. AWS is speeding up the construction of its ecosystem and starting to challenge both Microsoft’s early-mover advantage and the market share of other global and local players.
To help CIOs and enterprise architects set up their hybrid cloud strategy in the region, we’ve put together a brief comparison of the Azure and AWS offerings and ecosystems in China:
Operations. Microsoft made Azure available for preview in China on June 6, 2013, and announced its commercial launch on March 25, 2014, stating that it would be operated by 21Vianet and have a service-level agreement (SLA) of 99.95%. It has two dedicated data centers, in Beijing and Shanghai. AWS announced the availability of its “Beijing region” in China on December 18, 2013, but it still hasn’t announced its official commercial launch, other than a partnership with Cloud Valley. Currently, AWS has only one data center, in Ningxia province.
Offerings. Azure offerings cover services for compute (VM, websites, cloud services, etc.); data (storage, SQL database, HDInsight, backup, etc.); applications (service bus, Active Directory, CDN, media services, notification services, etc.); and networking (virtual network, Traffic Manager, etc.). Azure also provides other solutions, such as infrastructure services, data management, and application development and deployment.
The rise of the DevOps role in the enterprise and the increasing requirements of agility beyond infrastructure and applications make the platform-as-a-service (PaaS) market one to watch for both CIOs and enterprise architecture professionals. On December 9, the membership of Cloud Foundry, a major PaaS open source project, announced the formation of the Cloud Foundry Foundation.
In my view, this is as important as the establishment of the OpenStack Foundation in 2012, which was a game-changing move for the cloud industry. Here’s why:
PaaS is becoming an important alternative to middleware stacks. Forrester defines PaaS as a complete application platform for multitenant cloud environments that includes development tools, runtime, and administration and management tools and services. (See our Forrester Wave evaluation for more detail on the space and its vendors.) In the cloud era, it’s a transformational alternative to established middleware stacks for the development, deployment, and administration of custom applications in a modern application platform, serving as a strategic layer between infrastructure-as-a-service (IaaS) and software-as-a-service (SaaS) with innovative tools.
Cloud Foundry is a major open source PaaS offering. The technology was designed and architected by Derek Collison and built in the Ruby and Go programming languages by Derek and Vadim Spivak (the wiki is wrong!). VMware released it as open source in 2011 after Derek joined the company. Early adopters of Cloud Foundry include large multinationals like Verizon, SAP, NTT, and SAS, as well as Chinese Internet giants like Baidu.
By now you have at least seen the cute little elephant logo or you may have spent serious time with the basic components of Hadoop like HDFS, MapReduce, Hive, Pig and most recently YARN. But do you have a handle on Kafka, Rhino, Sentry, Impala, Oozie, Spark, Storm, Tez… Giraph? Do you need a Zookeeper? Apache has one of those too! For example, the latest version of Hortonworks Data Platform has over 20 Apache packages and reflects the chaos of the open source ecosystem. Cloudera, MapR, Pivotal, Microsoft and IBM all have their own products and open source additions while supporting various combinations of the Apache projects.
After hearing the confusion between Spark and Hadoop one too many times, I was inspired to write a report, The Hadoop Ecosystem Overview, Q4 2014. For those whose day jobs don’t include constantly tracking Hadoop’s evolution, I dove in and worked with Hadoop vendors and trusted consultants to create a framework. We divided the complex Hadoop ecosystem into a core set of tools that all work closely with data stored in the Hadoop Distributed File System (HDFS) and an extended group of components that leverage, but do not require, it.
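To make the core concrete, here is a minimal, framework-free sketch in plain Python of the map/reduce pattern that Hadoop's MapReduce engine popularized; the function names are illustrative only and are not Hadoop or Spark APIs.

```python
from collections import defaultdict

def map_phase(records):
    """Emit (word, 1) pairs, as a MapReduce mapper would."""
    for record in records:
        for word in record.split():
            yield word.lower(), 1

def reduce_phase(pairs):
    """Sum the counts per key, as a MapReduce reducer would."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

lines = ["Hadoop stores data", "Spark processes data fast"]
print(reduce_phase(map_phase(lines)))
# → {'hadoop': 1, 'stores': 1, 'data': 2, 'spark': 1, 'processes': 1, 'fast': 1}
```

The distinction the framework draws is about where this runs: the core tools assume the records live in HDFS, while extended components like Spark can read from HDFS but can just as easily run against other storage.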
In the past, enterprise architects could afford to think big picture, which meant treating Hadoop as a single package of tools. Not anymore: you need to understand the details to keep up in the age of the customer. Use our framework to help, but please read the report if you can, as I include much more detail there.
Early this year, a host of inquiries were coming in about data quality challenges in CRM systems. This led to a number of joint inquiries between myself and CRM expert Kate Leggett, VP and Principal Analyst in our application development and delivery team. It seems that the expectation that CRM systems could provide a single trusted view of the customer was starting to hit a reality check. There is more to it than collecting customer data and activities: you need validation, cleansing, standardization, consolidation, enrichment, and hierarchies. CRM applications only get you so far, even with more and more functionality being added to reduce duplicate records and enforce classifications and groups. So, what should companies do?
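To illustrate two of those steps, standardization and consolidation, here is a deliberately simplified sketch in plain Python; the field names and matching rule (exact match on a normalized phone number) are hypothetical, and real data quality and MDM tools use far more sophisticated survivorship and fuzzy-matching logic.

```python
import re

def standardize(record):
    """Normalize name casing/whitespace and strip phone formatting (illustrative rules)."""
    return {
        "name": " ".join(record["name"].split()).title(),
        "phone": re.sub(r"\D", "", record["phone"]),
    }

def consolidate(records):
    """Merge records sharing a standardized phone number into one record (first wins)."""
    golden = {}
    for rec in map(standardize, records):
        golden.setdefault(rec["phone"], rec)
    return list(golden.values())

raw = [
    {"name": "jane  DOE", "phone": "(555) 010-2000"},
    {"name": "Jane Doe",  "phone": "555-010-2000"},   # duplicate of the first
    {"name": "Li Wei",    "phone": "555-010-3000"},
]
print(consolidate(raw))  # two records survive: Jane Doe and Li Wei
```

Even this toy version shows why CRM applications alone fall short: the matching and survivorship rules are data management decisions, not application features.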
In 2014, the top priorities for business process management (BPM) initiatives focused on extending mission-critical business processes to support the mobile workforce and redesigning business processes to deliver exceptional customer experiences. During 2014, Forrester also noticed a growing appetite to move business-critical processes into the cloud using BPM platform-as-a-service solutions. And although customer sentiment for BPM was mixed to negative in 2014, software vendors reported respectable double-digit revenue growth for BPM solutions. Sounds like it’s time to pop the bubbly and celebrate, right?
Not quite yet. In 2015, BPM will fight to expand its relevance in the front office and will need to shed serious weight to better align with age-of-the-customer imperatives that prioritize speed-to-market over analysis and complexity – traditional hallmarks of the BPM discipline and its software solutions. Together with my colleague Craig Le Clair, I expect 2015 to be a tipping point for the BPM market. In 2015, customer obsession – the relentless focus on winning, retaining, and serving customers – will disrupt and reshape the entire BPM ecosystem:
The data economy — or the system that provides for the exchange of digitized information for the purpose of creating insights and value — grew in 2014, but in 2015 we’ll see it leap forward significantly. It will grow from a phenomenon that mainstream enterprises view at arm’s length as interesting to one that they embrace as part of business as usual. The number of business and technology leaders telling us that external data is important to their business strategy has been growing rapidly — from one-third in 2012 to almost half in 2014.
Why? It’s a supply-driven phenomenon made possible by widespread digitization, mobile technology, the Internet of Things (IoT), and Hadooponomics. With countless new data sources and powerful new tools to wrest insights from their depths, organizations will scramble to use them to know their customers better and to optimize their operations beyond anything they could have done before. And while the exploding data supply will spur demand, it will also spur additional supply. Firms will be taking a hard look at their “data exhaust” and wondering if there is a market for new products and services based on their unique set of data. But in many cases, the value in the data is not that people will be willing to pay money for bulk downloads or access to raw data, but in data products that complement a firm’s existing offerings.