One of the reasons that only a portion of enterprise and external data (about a third of structured and a quarter of unstructured data) is available for insights is the restrictive architecture of SQL databases. In SQL databases, data and metadata (data models, aka schemas) are tightly bound and inseparable (aka early binding, or schema-on-write). Changing the model at best requires rebuilding an index or an aggregate; at worst, it means reloading entire columns and tables. As a result, many analysts start their work from data sets based on these tightly bound models, into which DBAs and data architects have already built business requirements (which may be outdated or incomplete). Thus the data delivered to end users already contains inherent biases, which are opaque to the user and can strongly influence their analysis. As part of the natural evolution of Business Intelligence (BI) platforms, data exploration now addresses this challenge. How? BI pros can now take advantage of ALL raw data available in their enterprises by:
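To make the early-binding-vs-late-binding distinction concrete, here is a minimal sketch in plain Python, using sqlite3 and JSON purely as stand-ins for a SQL warehouse and a raw data lake (the table and field names are invented for illustration):

```python
import json
import sqlite3

# Schema-on-write: the model is fixed before any data loads.
# Adding a field later means altering the table and reloading.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
db.execute("INSERT INTO orders VALUES (1, 19.99)")
# A record with an extra "channel" field cannot be stored
# without first changing the schema (ALTER TABLE plus a reload).

# Schema-on-read: raw records are kept as-is; each analyst applies
# whatever model the question at hand requires, at query time.
raw_records = [
    '{"id": 1, "amount": 19.99}',
    '{"id": 2, "amount": 5.00, "channel": "mobile"}',  # new field, no reload
]
parsed = [json.loads(r) for r in raw_records]
mobile_revenue = sum(r["amount"] for r in parsed if r.get("channel") == "mobile")
print(mobile_revenue)  # 5.0
```

The point of the sketch: in the late-binding case, the new "channel" field flows through untouched, and any bias in how it is modeled is the analyst's explicit choice rather than one baked in upstream.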
It’s been a while since I’ve blogged; not because I’ve had nothing to say, but rather because I’ve been busy with my colleagues Ted Schadler, James McCormick, and Holger Kisker working on a new line of research. We wanted to examine the fact that business satisfaction with analytics went down 21% between 2014 and 2015, despite big investments in big data. We found that while 74% of firms say they want to be “data-driven,” only 29% say they are good at connecting analytics to action. That is the problem.
Ted Schadler and I published some initial ideas around this in Digital Insights Are The New Currency Of Business in 2015. In that report, we started using the phrase digital insight to talk about what firms were really after ― action inspired by new knowledge. We saw that data and analytics were only means to that end. We also found that leading firms were turning data into insight and action by building systems of insight ― the business discipline and technology to harness insights and consistently turn data into action.
When I read articles like today's WSJ article on mutual funds exiting high-tech startups and triangulate the content with Forrester client interactions over the last 12 to 18 months (and some rumors), I am now becoming convinced that there will be some Business Intelligence (BI) and analytics vendor shake-ups in 2016. Even though, according to our research, enterprises are still only leveraging 20%-40% of their entire universe of data for insights and decisions, and 50%-80% of all BI/analytics apps are still done in spreadsheets, the market is oversaturated with vendors. Just take a look at the 50+ vendors we track in our BI Vendor Landscape. IMHO we are nearing a saturation point where the buy side of the market cannot sustain so many sellers. Indeed, we are already seeing a trend where large enterprises, which a couple of years ago had 10+ different BI platforms, today usually deploy only somewhere between 3 and 5. And, in case you missed it, we already saw what is sure to be a much bigger trend of BI/analytics M&A - SAP acquiring mobile BI vendor Roambi. Start hedging your BI vendor bets!
Rule #1. Don't just jump into creating a hefty enterprise-wide Business Intelligence (BI)
Business intelligence and its next iteration, systems of insight (SOI), have moved to the top of BI pros' agendas for enterprise software adoption. Investment in BI tools and applications can have a number of drivers, both external (such as regulatory requirements or technology obsolescence) and internal (such as the desire to improve processes or speed up decision-making). However, putting together a BI business case is not always a straightforward process. Before embarking on a BI business case endeavor, consider that:
You may not actually need a business case. Determining whether a BI business case is necessary comes down to three main considerations: is it an investment that the organization must make to stay in business, should consider because other investments are changing the organization's IT landscape, or wants to make because of expected business benefits?
A business sponsor does not obviate the need for a business case. It may be tempting to conclude that you can skip making a business case for BI whenever there is a strong push for investment from the business side, in particular when budget holders are prepared to commit money. Resist this impulse whenever possible: The resulting project will likely suffer from a lack of focus, and recriminations are likely to follow sooner or later.
Do you ever feel like you’re facing a moving target? Whether it’s the latest customer requirements, how to improve operations, how to retain your best employees, or how to price your products, the context in which you are doing business is increasingly dynamic. And so are the tools you need to better understand that context. Everyone is talking about the promise of big data and advanced analytics, but we all know that companies struggle to reach the Holy Grail.
Data and analytics tools and the skills required to use them are changing faster than ever. Technologies that were university research projects just last year are now part of a wide range of products and services. How can firms keep up with the accelerated pace of innovation? Alas, many cannot. According to Forrester's Q3 2015 Global State Of Strategic Planning, Enterprise Architecture, And PMO Online Survey, 73% of companies understand the business value of data and aspire to be data-driven but just 29% confirm that they are actually turning data into action. Many firms report having mature data management, governance, and analytics practices, but yesterday's skills are not necessarily what they will need tomorrow — or even today.
The same goes for data sources. We all know that using external data sources enhances the insights from our business intelligence. But which data should you use, and where can you get it?
With the incredible popularity of big data and Hadoop, every Business Intelligence (BI) vendor also wants to be known as a "BI on Hadoop" vendor. But what they can really do is limited to a) querying HDFS data organized in Hive tables using HiveQL or b) ingesting any flat file into memory and analyzing the data there. Basically, to most BI vendors Hadoop is just another data source. Let's now see what qualifies a BI vendor as a "Native Hadoop BI Platform". If we assume that all BI platforms have to have data extraction/integration, persistence, analytics, and visualization layers, then "Native Hadoop/Spark BI Platforms" should be able to (OK, yes, I just had to add Spark):
Use Hadoop/Spark as the primary processing platform for MOST of the aforementioned functionality. The only exception is the visualization layer, which is not something Hadoop/Spark does.
Use distributed processing frameworks natively, such as
Generation of MapReduce and/or Spark jobs
Management of distributed processing framework jobs by YARN, etc.
Note: generating Hive or SparkSQL queries does not qualify
Do declarative work in the product’s main user interface that is interpreted and executed on Hadoop/Spark directly, not via a "pass-through" mode.
Natively support Apache Sentry and Apache Ranger security
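To illustrate the key distinction in the criteria above with a toy sketch (this is not any vendor's actual implementation): a native platform compiles the user's declarative spec into map/reduce stages that the cluster framework would execute, rather than passing a SQL string through to Hive. Local Python stands in for the cluster here, and the spec format is invented:

```python
from functools import reduce
from itertools import groupby as igroupby

def compile_to_mapreduce(spec):
    """Turn a declarative aggregation spec into map and reduce functions,
    the way a native platform generates MapReduce/Spark jobs."""
    key, field = spec["group_by"], spec["sum"]
    mapper = lambda row: (row[key], row[field])  # emit (key, value) pairs
    reducer = lambda a, b: a + b                 # combine values per key
    return mapper, reducer

def run_job(rows, mapper, reducer):
    """Stand-in for framework-managed execution: map, shuffle by key, reduce."""
    pairs = sorted(map(mapper, rows))
    return {k: reduce(reducer, (v for _, v in grp))
            for k, grp in igroupby(pairs, key=lambda kv: kv[0])}

rows = [
    {"region": "east", "sales": 100},
    {"region": "west", "sales": 40},
    {"region": "east", "sales": 60},
]
mapper, reducer = compile_to_mapreduce({"group_by": "region", "sum": "sales"})
print(run_job(rows, mapper, reducer))  # {'east': 160, 'west': 40}
```

A pass-through tool, by contrast, would simply hand the string `SELECT region, SUM(sales) FROM t GROUP BY region` to Hive and wait, which is exactly the mode the criteria above disqualify.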
I am kicking off a research stream that will result in a "Text Analytics Roles & Responsibilities" doc. Before I finalize an RFI to our clients to see whom they employ for these projects and applications (and how, when, and where), I'd like to explore what the actual roles and responsibilities are. So far we've come up with the following roles and their respective responsibilities:
Business owner. The ultimate recipient of text analytics process results. So far I have:
Customer intelligence analyst
Customer service/call center analyst
Competitive intelligence analyst
Product R&D analyst
Linguist/Data Scientist. Builds language and statistical rules for text mining (or modifies those from an off-the-shelf product). Works with business owners to:
Create "golden copies" of documents/content which will be used as base for text analytics
Work with data stewards and business owners to define corporate taxonomies and the lexicon
Data Steward. Owns corporate lexicon and taxonomies
Architect. Owns the big data strategy and architecture (including data hubs, data warehouses, BI, etc.) where unstructured data is one of the components
Developer/integrator. Develops custom-built text analytics apps or embeds text analytics functionality into other applications (ERP, CRM, BI, etc.)
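As a rough illustration of the linguist's rule-building work described above (the taxonomy categories and terms here are invented; real deployments use far richer linguistic and statistical rules):

```python
import re

# A tiny corporate taxonomy of the kind a data steward would own,
# with terms the linguist maps to each category. All invented.
taxonomy = {
    "billing": ["invoice", "charge", "refund"],
    "support": ["agent", "call center", "wait time"],
}

def tag_document(text, taxonomy):
    """Return the taxonomy categories whose terms appear in the text.
    This is the simplest possible 'language rule': whole-word matching."""
    found = set()
    for category, terms in taxonomy.items():
        for term in terms:
            if re.search(r"\b" + re.escape(term) + r"\b", text, re.IGNORECASE):
                found.add(category)
    return sorted(found)

doc = "The agent promised a refund but the invoice still shows the charge."
print(tag_document(doc, taxonomy))  # ['billing', 'support']
```

Even this trivial rule shows why the roles interlock: the taxonomy itself comes from the data steward and business owners, the matching rules from the linguist, and the embedding into a CRM or BI app from the developer/integrator.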
You've done all the right things by following your enterprise vendor selection methodology. You created an RFI and sent it out to all of the vendors on your "approved" list. You then filtered out the responses based on your requirements, and sent out a detailed RFP. You created a detailed scoring methodology, reviewed the proposals, listened to the in-person presentations, and filtered out everyone but the top respondents. But you still ended up with more than one. What do you do?
If you shortlisted two or more market leaders (see Forrester's latest evaluation), I would not agonize over who has better methodologies, reference architectures, training, project execution, risk management, etc. They all have top-of-the-line capabilities in all of the above. Rather, I'd concentrate on the following specifics:
A vendor that proposed specific named individuals for the project, whose resumes you reviewed and liked, gets an edge over a vendor that only proposed general roles to be staffed at project kickoff.
Not very long ago, it would have been almost inconceivable to consider a new large-scale data analysis project in which the open source Apache Hadoop did not play a pivotal role.
Every Hadoop blog post needs a picture of an elephant. (Source: Paul Miller)
Then, as so often happens, the gushing enthusiasm became more nuanced. Hadoop, some began (wrongly) to mutter, was "just about MapReduce." Hadoop, others (not always correctly) suggested, was "slow."
Then newer tools came along. Hadoop, a growing cacophony (inaccurately) trumpeted, was "not as good as Spark."
But, in the real world, Hadoop continues to be great at what it's good at. It's just not good at everything people tried throwing in its direction. We really shouldn't be surprised by this. And yet, it seems, so many of us are.
For CIOs asked to drive new programmes of work in which big data plays a part (and few are not), the competing claims in this space are both unhelpful and confusing. Hadoop and Spark are not, despite some suggestions, directly equivalent. In many cases, asking "Hadoop or Spark" is simply the wrong question.
Predictive analytics has become the key to helping businesses — especially those in the highly dynamic Chinese market — create differentiated, individualized customer experiences and make better decisions. Enterprise architecture professionals must take a customer-oriented approach to developing their predictive analytics strategy and architecture.
I’ve recently published two reports focusing on how to architect a predictive analytics capability. These reports analyze the trends around predictive analytics adoption in China and discuss four key areas that EA pros must focus on to accelerate digital transformation. They also show EA pros how to unleash the power of digital business by analyzing the predictive analytics practices of visionary Chinese firms. Some of the key takeaways:
Predictive analytics must cover the full customer life cycle and leverage business insights. Organizations require predictive insights into customer behaviors and business operations. You must implement predictive analytics solutions and deliver value to customers throughout their life cycle to differentiate your customer experience and sustain business growth. You should also realize the importance of business stakeholders and define effective mechanisms for translating their business knowledge into predictive algorithm inputs, to optimize predictive models faster and generate deeper customer insights.
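As a minimal sketch of one such translation mechanism (the field names, weights, and the business rule itself are all invented for illustration; a real predictive model would be trained on historical data rather than hand-weighted):

```python
# Stakeholder rule, stated in plain language: "customers inactive for 60+
# days with an open complaint are at risk." The analytics team encodes it
# as model inputs rather than hard-coding the decision.

def to_features(customer):
    """Translate the stakeholder rule into numeric predictive inputs."""
    return {
        "inactive_60d": 1.0 if customer["days_since_last_order"] >= 60 else 0.0,
        "open_complaint": 1.0 if customer["open_complaints"] > 0 else 0.0,
        "tenure_years": float(customer["tenure_years"]),
    }

def churn_score(features, weights):
    """A simple linear score standing in for a trained predictive model."""
    return sum(weights[name] * value for name, value in features.items())

# Invented weights; in practice these would come from model training.
weights = {"inactive_60d": 0.5, "open_complaint": 0.3, "tenure_years": -0.05}

at_risk = {"days_since_last_order": 75, "open_complaints": 2, "tenure_years": 1}
loyal = {"days_since_last_order": 5, "open_complaints": 0, "tenure_years": 6}

risk_high = churn_score(to_features(at_risk), weights)
risk_low = churn_score(to_features(loyal), weights)
print(risk_high > risk_low)  # the rule-flagged customer scores as higher risk
```

The design point: because the stakeholder's knowledge lives in the feature definitions rather than in an opaque model, the same inputs can feed richer algorithms later while the business rationale stays visible and reviewable.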