Big data and Hadoop (Yellow Elephants) are so synonymous that you can easily overlook the vast landscape of architecture that goes into delivering on big data value. Data scientists (Pink Unicorns) are also raised to god status as the only real role that can harness the power of big data, which makes obtaining insights from big data seem as far away as a manned journey to Mars. However, this week, as I participated in the DGIQ conference in San Diego and colleagues and friends attended the Hadoop Summit in Belgium, it became apparent that organizations are waking up to the fact that there is more to big data than a "cool" playground for the privileged few.
The perspective that the insight supply chain is the driver and catalyst of actions from big data is starting to take hold. Capital One, for example, illustrated that if insights from analytics and data from Hadoop are going to influence operational decisions and actions, you need the same degree of governance that you established in traditional systems. In a conversation, Amit Satoor of SAP Global Marketing described a performance apparel company that was linking big data to operational and transactional systems at the edge of customer engagement, and stressed that this had to be easy for application developers to implement.
Hadoop distribution, NoSQL, and analytic vendors need to step up the value proposition to be more than where the data sits and how sophisticated you can get with the analytics. In the end, if you can't govern quality, security, and privacy for the scale of edge end user and customer engagement scenarios, those efforts to migrate data to Hadoop and the investment in analytic tools cost more than dollars; they cost you your business.
Often considered the poster child of digital transformation, APIs are proliferating at enterprises making industry-leading investments in mobile, IoT, and big data. As these initiatives mature, CIOs, CTOs, and heads of development are coming together with business leaders to manage and secure companywide use of APIs using API management solutions.
Forrester recently released a report that sizes and projects annual spending on API management solutions. We predict US companies alone will spend nearly $3 billion on API management over the next five years. Annual spend will more than quadruple by the end of the decade, from $140 million in 2014 to $660 million in 2020. International sales will take the global market over the billion dollar mark.
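The growth implied by these two figures can be checked with a little arithmetic. The sketch below uses only the annual spend numbers quoted above; the function names are illustrative, and the compound annual growth rate is a derived value that assumes smooth year-over-year compounding, not a figure taken from the report.

```python
def growth_multiple(start: float, end: float) -> float:
    """How many times larger the end value is than the start value."""
    return end / start


def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, assuming smooth compounding."""
    return (end / start) ** (1 / years) - 1


# Annual US spend on API management, in millions of dollars (from the text).
SPEND_2014_M = 140
SPEND_2020_M = 660

multiple = growth_multiple(SPEND_2014_M, SPEND_2020_M)
rate = cagr(SPEND_2014_M, SPEND_2020_M, years=2020 - 2014)

print(f"Growth multiple 2014-2020: {multiple:.1f}x")
print(f"Implied compound annual growth rate: {rate:.1%}")
```

Under these assumptions the multiple comes out at roughly 4.7x, which corresponds to annual growth of a little under 30 percent.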
In interviewing vendors for this piece of research, we discovered a vast and fertile landscape of participants:
Startups have taken $430 million in venture funding, and so far have realized $335 million in acquisition value. In April 2015, pure-play vendor Apigee went public and currently trades at a valuation north of $400 million.
Beware of insights! Real danger lurks behind the promise of big data to bring more data to more people faster, better, and cheaper: Insights are only as good as how people interpret the information presented to them. When looking at a stock chart, you can't even answer the simplest question — "Is the latest stock price move good or bad for my portfolio?" — without understanding the context: where you are in your investment journey and whether you're looking to buy or sell. While structured data can provide some context — like checkboxes indicating your income range, investment experience, investment objectives, and risk tolerance levels — unstructured data sources contain several orders of magnitude more context. An email exchange with a financial advisor indicating your experience with a particular investment vehicle, news articles about the market segment heavily represented in your portfolio, and social media posts about companies in which you've invested or plan to invest can all generate much broader and deeper context to better inform your decision to buy or sell.
But defining the context by finding structures, patterns, and meaning in unstructured data is not a simple process. As a result, firms face a gap between data and insights; while they are awash in an abundance of customer and marketing data, they struggle to convert this data into the insights needed to win, serve, and retain customers. In general, Forrester has found that:
The problem is not a lack of data. Most companies have access to plenty of customer feedback surveys, contact center records, mobile tracking data, loyalty program activities, and social media feeds — but, alas, it's not easily available to business leaders to help them make decisions.
What’s the top imperative at your company? If it’s not a transformation to make the company more customer-focused, you’re making a mistake. Technology and economic forces have changed the world so much that an obsession with winning, serving, and retaining customers is the only possible response.
We’re in an era of persistent economic imbalances defined by erratic economic growth, deflationary fears, an oversupply of labor, and surplus capital hunting returns in a sea of record-low interest rates. This abundance of capital and labor means that the path from good idea to customer-ready product has never been easier, and seamless access to all of the off-the-shelf components needed for a startup fuels the rise of weightless companies, which further intensify competition.
Chastened by a weak economy, presented with copious options, and empowered with technology, consumers have more market muscle than ever before. The information advantage tips to consumers with ratings and review sites. They claim pricing power by showrooming. And the only location that matters is the mobile phone in their hand from which they can buy anything from anyone and have it delivered anywhere.
This customer-driven change is remaking every industry. Cable and satellite operators lost almost 400,000 video subscribers in 2013 and 2014 as customers dropped them for the likes of Netflix. Lending Club, an alternative to commercial banks, has facilitated more than $6 billion in peer-to-peer loans. Now that most B2B buyers would rather buy from a website than a salesperson, we estimate that 1 million B2B sales jobs will disappear in the coming years.
Recently, I talked with the CEO and founder of reBuy about the shifting dynamics in the retail sector as a result of digitalization. The use of data has evolved to the point where data has become the enterprise’s most critical business asset in the age of the customer. The business model of reBuy reCommerce — the leading German marketplace for secondhand goods — can help CIOs understand how the intelligent use of data can significantly disrupt a market such as retail.
The case of reBuy offers interesting insights into how the wider trends of the sharing and collaborative economy affect retail. If you can buy a good-quality used product with a guarantee for half the price, many people will not buy the product new. Many consumers increasingly accept product reuse and see it as an opportunity to obtain cheaper products and reduce their environmental footprint by avoiding the production of items that wouldn’t be used efficiently. The reBuy case study highlights that:
Business technology is taking the sharing economy into new realms. The reBuy business model demonstrates that consumers are starting to push the ideas of the sharing economy deep into the retail space. CIOs in all industries must prepare for the implications that this will have for their businesses.
Standalone products are at particular risk of sharing dynamics. The example of reBuy shows that businesses that sell plain products will come under even more pressure from shifting shopping behavior, where people are increasingly satisfied with buying used goods. These businesses need to add value to those products that are not available for secondhand purchase.
Intel has made no secret of its development of the Xeon D, an SOC product designed to take Xeon processing close to power levels and product niches currently occupied by its lower-power, lower-performance Atom line, and where emerging competition from ARM is more viable.
The new Xeon D-1500 is clear evidence that Intel “gets it” as far as platforms for hyperscale computing and other throughput-per-Watt and density-sensitive workloads, both in the enterprise and in the cloud, are concerned. The D-1500 breaks new ground in several areas:
It is the first Xeon SOC, combining 4 or 8 Xeon cores with embedded I/O including SATA, PCIe and multiple 10 and 1 Gb Ethernet ports.
It is the first of Intel’s 14 nm server chips expected to be introduced this year. This expected process shrink will also deliver further improvements in performance and performance per Watt across the entire line of entry through mid-range server parts this year.
Why is this significant?
With the D-1500, Intel effectively draws a very deep line in the sand for emerging ARM technology as well as for AMD. The D-1500, with 20W to 45W power, delivers the lower end of Xeon performance at power and density levels previously associated with Atom, and close enough to what is expected from the newer generation of higher performance ARM chips to once again call into question the viability of ARM on a pure performance and efficiency basis. While ARM implementations with embedded accelerators such as DSPs may still be attractive in selected workloads, the availability of a mainstream x86 option at these power levels may blunt the pace of ARM design wins both for general-purpose servers and for embedded designs, notably for storage systems.
Open data is critical for delivering contextual value to customers in digital ecosystems. For instance, The Weather Channel and OpenWeatherMap collect weather-related data points from millions of data sources, including the wingtips of aircraft. They could share these data points with car insurance companies. This would allow the insurers to expand their customer journey activities, such as alerting their customers in real time to warn them of an approaching hailstorm so that the car owners have a chance to move their cars to safety. Success requires making logical connections between isolated data fields to generate meaningful business intelligence.
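To make the idea of connecting isolated data fields concrete, here is a minimal sketch of the hailstorm scenario. Everything in it is an assumption for illustration: the `WeatherAlert` and `Policyholder` types, the shared `region` field used as the join key, and the sample records are all made up, and no real weather or insurance API is involved.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class WeatherAlert:
    region: str  # hypothetical shared key, e.g. a postal-code prefix
    event: str   # e.g. "hail", "frost"


@dataclass(frozen=True)
class Policyholder:
    name: str
    region: str


def customers_to_warn(
    alerts: Iterable[WeatherAlert],
    policyholders: Iterable[Policyholder],
    event: str = "hail",
) -> Dict[str, List[str]]:
    """Join weather alerts to policyholders on their shared region field."""
    affected = {a.region for a in alerts if a.event == event}
    warnings: Dict[str, List[str]] = {}
    for p in policyholders:
        if p.region in affected:
            warnings.setdefault(p.region, []).append(p.name)
    return warnings


# Made-up sample data, for illustration only.
alerts = [WeatherAlert("921", "hail"), WeatherAlert("100", "frost")]
policyholders = [
    Policyholder("Alice", "921"),
    Policyholder("Bob", "100"),
    Policyholder("Carol", "921"),
]

result = customers_to_warn(alerts, policyholders)
print(result)
```

The point of the sketch is the join itself: neither data set is useful for alerting on its own, and the value appears only once a common field links them.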
Trust is also critical to delivering value in digital ecosystems. One of the key questions for big data is who owns the data. Is it the division that collects the data, the business as a whole, or the customer whose data is collected? Forrester believes that for data analytics to unfold its true potential and gain end user acceptance, the users themselves must remain the ultimate owners of their own data.
The development of control mechanisms that allow end users to control their data is a major task for CIOs. One possible approach could be dashboard portals that allow end users to specify which businesses can use which data sets and for what purpose. Private.me is trying to develop such a mechanism. Its design distributes individuals' information across servers that are to be run by non-profit organizations. Data anonymization is another approach that many businesses are working on, despite the fact that there are limits to data anonymization as a means to ensure true privacy.
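One way to picture the dashboard approach is as a per-user registry of grants, each naming a business, a data set, and a purpose. The sketch below is a hypothetical illustration of that data structure only; the `ConsentRegistry` class, its method names, and the sample identifiers are assumptions, and it does not describe how Private.me or any other product actually works.

```python
from dataclasses import dataclass
from typing import Dict, Set


@dataclass(frozen=True)
class Grant:
    business: str
    dataset: str
    purpose: str


class ConsentRegistry:
    """Per-user record of which business may use which data set, and why."""

    def __init__(self) -> None:
        self._grants: Dict[str, Set[Grant]] = {}

    def allow(self, user: str, business: str, dataset: str, purpose: str) -> None:
        self._grants.setdefault(user, set()).add(Grant(business, dataset, purpose))

    def revoke(self, user: str, business: str, dataset: str, purpose: str) -> None:
        # discard() is a no-op if the grant was never given.
        self._grants.get(user, set()).discard(Grant(business, dataset, purpose))

    def is_allowed(self, user: str, business: str, dataset: str, purpose: str) -> bool:
        # Deny by default: only an exact matching grant permits use.
        return Grant(business, dataset, purpose) in self._grants.get(user, set())


# Made-up identifiers, for illustration only.
registry = ConsentRegistry()
registry.allow("alice", "insurer-a", "vehicle_location", "hail_alerts")

print(registry.is_allowed("alice", "insurer-a", "vehicle_location", "hail_alerts"))
print(registry.is_allowed("alice", "insurer-a", "vehicle_location", "marketing"))
```

The deny-by-default check reflects the principle that the user remains the owner: a business gets access only to the exact data set and purpose the user has approved, and the user can withdraw that approval at any time.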
The business has an insatiable appetite for data and insights. Even in the age of big data, the number one issue of business stakeholders and analysts is getting access to the data. If access is achieved, the next step is "wrangling" the data into a usable data set for analysis. The term "wrangling" itself creates a nervous twitch, unless you enjoy the rodeo. But the goal of the business isn't to be an adrenaline junkie. The goal is to get insight that helps it smartly navigate through increasingly complex business landscapes and customer interactions. Those that get this have introduced a softer term, "blending," which is another term dreamed up by data vendor marketers to avoid the dreaded conversation about data integration and data governance.
The reality is that you can't market-message your way out of the fundamental problem: big data is creating data swamps even in the best-intentioned efforts. (This is the reality of big data's first principle of schema-less data.) Data governance for big data is primarily relegated to cataloging data and its lineage, which serves the data management team but creates a new kind of nightmare for analysts and data scientists: working with a card catalog that will rival the Library of Congress. Dropping in a self-service business intelligence tool or advanced analytic solution doesn't solve the problem of familiarizing the analyst with the data. Analysts will still spend up to 80% of their time just trying to create the data set from which to draw insights.
I’m ramping up to attend Strata in San Jose, February 18, 19, and 20. Here is some info to help everyone who wants to connect and share thoughts. Looking forward to great sessions and a lot of thought leadership.
I’ll be setting aside some time for 1:1 meetings (Booked Full)
[Updated on 2/17] - I have set up some blocks of time to meet with people at Strata. Please follow the link below to schedule with me on a first come basis.
[Update] - I booked out inside 2 hours...didn't expect that! I may open up my calendar for more meetings but need to get a better bead on the sessions I want to attend first. Shoot to catch me at breakfast, will tweet out when I'm there.
I’ll be posting my thoughts and locations on Twitter
The best way to connect with me at Strata is to follow me on Twitter @practicingea.
You can post @ me or DM me. I’ll be posting my location and you can drop by for ad hoc conversations as well.
I’m very interested in your point of view - data driven to insights driven
I am concluding very quickly that “big data” as we have viewed it for the last five years is not enough. I see firms using words like “real-time” or “right-time” or “fast data” to suggest the need is much bigger than big data: it's about connecting data to action in a continuous learning loop.
The battle of trying to apply traditional waterfall software development life-cycle (SDLC) methodology and project management to BI has already been fought — and largely lost. These approaches and best practices, which apply to most other enterprise applications, work well in some cases, as with very well-defined and stable BI capabilities like tax or regulatory reporting. Mission-critical, enterprise-grade BI apps can also have a reasonably long shelf life of a year or more. But these best practices do not work for the majority (anecdotally, about three-quarters) of BI initiatives, where requirements change much faster than these traditional approaches can support; by the time a traditional BI application development team rolls out what it thought was a well-designed BI application, it's too late. As a result, BI pros need to move beyond earlier-generation BI support organizations to:
Focus on business outcomes, not just technologies. Earlier-generation BI programs lacked an "outcomes first" mentality. Those programs employed bottom-up approaches that focused on the project management and technology first, leaving clients without the proper outcomes that they needed to manage the business; in other words, they created an insights-to-action gap. BI pros should use a top-down approach that defines key performance indicators, metrics, and measures that support the business' goals and objectives. They must resist the temptation to address technology and data needs before the business requirements.