With the incredible popularity of big data and Hadoop, every Business Intelligence (BI) vendor wants to also be known as a "BI on Hadoop" vendor. But what they really can do is limited to a) querying HDFS data organized in Hive tables using HiveQL or b) ingesting any flat file into memory and analyzing the data there. Basically, to most BI vendors Hadoop is just another data source. Let's now see what qualifies a BI vendor as a "Native Hadoop BI Platform". If we assume that all BI platforms have to have data extraction/integration, persistence, analytics and visualization layers, then "Native Hadoop/Spark BI Platforms" should be able to do the following (ok, yes, I just had to add Spark):
Use Hadoop/Spark as the primary processing platform for MOST of the aforementioned functionality. The only exception is the visualization layer, which is not what Hadoop/Spark do.
Use distributed processing frameworks natively, such as
Generation of MapReduce and/or Spark jobs
Management of distributed processing framework jobs by YARN, etc.
Note, generating Hive or SparkSQL queries does not qualify
Do declarative work in the product’s main user interface that is interpreted and executed on Hadoop/Spark directly, not via a "pass through" mode.
Natively support Apache Sentry and Apache Ranger security
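To make the distinction between generating jobs and passing a query through a bit more concrete, here is a minimal sketch in plain Python. It assumes a hypothetical declarative spec (`spec`), a hypothetical `compile_spec` helper that turns it into a mapper and a reducer, and a toy single-process `run_local` function standing in for the map, shuffle, and reduce phases. None of these names come from Hadoop, Spark, or any BI product; a real platform would emit MapReduce or Spark jobs and hand them to YARN.

```python
from collections import defaultdict

# Hypothetical declarative spec, roughly what a BI user interface might capture.
spec = {"group_by": "region", "measure": "sales", "agg": "sum"}


def compile_spec(spec):
    """Turn the declarative spec into a (mapper, reducer) pair."""
    key_field = spec["group_by"]
    value_field = spec["measure"]
    reduce_fn = {"sum": sum, "max": max, "min": min}[spec["agg"]]

    def mapper(record):
        # Emit one (key, value) pair per input record.
        return record[key_field], record[value_field]

    def reducer(key, values):
        # Collapse all values that share a key into a single result.
        return key, reduce_fn(values)

    return mapper, reducer


def run_local(records, mapper, reducer):
    """Toy single-process stand-in for the map, shuffle, and reduce phases."""
    shuffled = defaultdict(list)
    for record in records:
        key, value = mapper(record)
        shuffled[key].append(value)
    return dict(reducer(key, values) for key, values in shuffled.items())


rows = [
    {"region": "EMEA", "sales": 120},
    {"region": "APAC", "sales": 80},
    {"region": "EMEA", "sales": 30},
]
mapper, reducer = compile_spec(spec)
result = run_local(rows, mapper, reducer)
print(result)
```

The point of the sketch is only that the user's declarative intent is compiled into processing steps, rather than being rewritten as a HiveQL or SparkSQL string and handed off.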
I am kicking off a research stream which will result in the "Text Analytics Roles & Responsibilities" doc. Before I finalize an RFI to our clients to see whom they employ for these projects and applications, and how, when, and where, I'd like to explore what the actual roles and responsibilities are. So far we've come up with the following roles and their respective responsibilities:
Business owner. The ultimate recipient of text analytics process results. So far I have
Customer intelligence analyst
Customer service/call center analyst
Competitive intelligence analyst
Product R&D analyst
Linguist/Data Scientist. Builds language and statistical rules for text mining (or modifies these from an off-the-shelf-product). Works with business owners to
Create "golden copies" of documents/content which will be used as base for text analytics
Works with data stewards and business owners to define corporate taxonomies and lexicon
Data Steward. Owns corporate lexicon and taxonomies
Architect. Owns big data strategy and architecture (including data hubs, data warehouses, BI, etc.) where unstructured data is one of the components
Developer/integrator. Develops custom built text analytics apps or embeds text analytics functionality into other applications (ERP, CRM, BI, etc)
You've done all the right things by following your enterprise vendor selection methodology. You created an RFI and sent it out to all of the vendors on your "approved" list. You then filtered out the responses based on your requirements, and sent out a detailed RFP. You created a detailed scoring methodology, reviewed the proposals, listened to the in-person presentations, and filtered out everyone but the top respondents. But you still ended up with more than one. What do you do?
If you shortlisted two or more market leaders (see Forrester's latest evaluation) I would not agonize over who has better methodologies, reference architectures, training, project execution and risk management, etc. They all have top of the line capabilities in all of the above. Rather, I'd concentrate on the following specifics
The vendor who proposed more specific named individuals to the project, and you reviewed and liked their resumes, gets an edge over a vendor who only proposed general roles to be staffed at the time of the project kick off.
Not very long ago, it would have been almost inconceivable to consider a new large-scale data analysis project in which the open source Apache Hadoop did not play a pivotal role.
Every Hadoop blog post needs a picture of an elephant. (Source: Paul Miller)
Then, as so often happens, the gushing enthusiasm became more nuanced. Hadoop, some began (wrongly) to mutter, was "just about MapReduce." Hadoop, others (not always correctly) suggested, was "slow."
Then newer tools came along. Hadoop, a growing cacophony (inaccurately) trumpeted, was "not as good as Spark."
But, in the real world, Hadoop continues to be great at what it's good at. It's just not good at everything people tried throwing in its direction. We really shouldn't be surprised by this. And yet, it seems, so many of us are.
For CIOs asked to drive new programmes of work in which big data plays a part (and few are not), the competing claims in this space are both unhelpful and confusing. Hadoop and Spark are not, despite some suggestions, directly equivalent. In many cases, asking "Hadoop or Spark" is simply the wrong question.
Predictive analytics has become the key to helping businesses — especially those in the highly dynamic Chinese market — create differentiated, individualized customer experiences and make better decisions. Enterprise architecture professionals must take a customer-oriented approach to developing their predictive analytics strategy and architecture.
I’ve recently published two reports focusing on how to architect predictive analytics capability. These reports analyze the trends around predictive analytics adoption in China and discuss four key areas that EA pros must focus on to accelerate digital transformation. They also show EA pros how to unleash the power of digital business by analyzing the predictive analytics practices of visionary Chinese firms. Some of the key takeaways:
Predictive analytics must cover the full customer life cycle and leverage business insights. Organizations require predictable insights into customer behaviors and business operations. You must implement predictive analytics solutions and deliver value to customers throughout their life cycle to differentiate your customer experience and sustain business growth. You should also realize the importance of business stakeholders and define effective mechanisms for translating their business knowledge into predictive algorithm inputs to optimize predictive models faster and generate deeper customer insights.
Instinctively we know that it is not just about collecting the data. Big and bigger doesn’t necessarily make you smart and smarter. It just makes you one of those pack rats that has piles of stuff in all corners of your house. Yes, it might be very well organized and could have a potential use that makes it worth keeping. But will you ever take it out and use it? Will you ever really benefit from what you’ve so painstakingly collected? Likely not.
Don’t be a data pack rat. This is the year to turn your data into actions and positive business outcomes.
In 2016, the energy around data-driven investments will continue to elevate the importance of data and create incremental improvement in business performance for many but some serious digital disruption for others. Here are a few of our data predictions for 2016.
Three of four architects strive to make their firms data driven. But well-meaning technology managers only deal with part of the problem: how to use technology to glean deeper, faster insight from more data, and more cheaply. But consider that only 29% of architects say their firms are good at connecting analytics results to business outcomes. This is a huge gap! And the problem is the ‘data driven’ mentality that never fights its way out of technology and to what firms care about: outcomes.
In 2016, customer-obsessed leaders will leapfrog their competition, and we will see a shift as firms seek to grow revenue and transform customer experiences. Insight will become a key competitive weapon, as firms move beyond big data and solve problems with data-driven thinking.
Shift #1 - Data and analytics energy will continue to drive incremental improvement
In 2016, the energy around data-driven investments will continue to elevate the importance of data and create incremental improvement in business performance. In 2016, Forrester predicts:
Chief data officers will gain power, prestige and presence...for now. But the long-term viability of the role is unclear. Certain types of businesses, like digital natives, won’t benefit from appointing a CDO.
Machine learning will reduce the insight killer - time. Machine learning will replace manual data wrangling and data governance dirty work. The freeing up of time will accelerate data strategies.
I recently attended IBM BusinessConnect 2015 in Germany. I had great discussions regarding industrial Internet of Things (IoT) and Industrie 4.0 solutions as well as digital transformation in the B2B segment. One issue that particularly caught my attention: edge computing in the context of the mobile IoT.
Mobility in the IoT context raises the question of when to use a central computing approach versus when to use edge computing. The CIO must decide whether solution intelligence should primarily reside in a central location or at the edge of the network and therefore closer to (or even inside) mobile IoT devices like cars, smart watches, or smart meters. At least three factors should guide this decision:
Data transmission costs. The costs of data transmission can quickly undermine any mobile IoT business case. For instance, aircraft engine sensors collect massive amounts of data during a flight but send only a small fraction of that data in real time via satellite connectivity to a central data monitoring center while the plane is in the air. All other data is sent via Wi-Fi or traditional mobile broadband connectivity like UMTS or LTE once the plane is on the ground.
Mobile bandwidth, latency, and speed. The available bandwidth limits the amount of data that can be transmitted at any given time, limiting the use cases for mobile IoT. For instance, sharing large volumes of data about the turbines of a large container ship and detailed inventory measurements of each container on board is completely impractical unless the ship is close to a coastal area with high mobile broadband connectivity.
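The trade-off described in these factors can be sketched as a simple edge-side filter. The example below is a minimal illustration under stated assumptions: the `Reading` class, the `split_for_transmission` function, the urgency thresholds, and the byte budget are all hypothetical and chosen for illustration only, not taken from any vendor's product. The idea is that the edge device sends only urgent readings over the expensive or narrow link, up to a budget, and defers everything else until cheaper connectivity is available.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Reading:
    sensor: str
    value: float
    size_bytes: int


def split_for_transmission(
    readings: List[Reading],
    is_urgent: Callable[[Reading], bool],
    realtime_budget_bytes: int,
) -> Tuple[List[Reading], List[Reading]]:
    """Return (send_now, defer).

    Urgent readings are sent in real time while the byte budget lasts;
    everything else waits for a cheaper link (for example, Wi-Fi on the ground).
    """
    send_now: List[Reading] = []
    defer: List[Reading] = []
    remaining = realtime_budget_bytes
    for reading in readings:
        if is_urgent(reading) and reading.size_bytes <= remaining:
            send_now.append(reading)
            remaining -= reading.size_bytes
        else:
            defer.append(reading)
    return send_now, defer


def engine_alert(reading: Reading) -> bool:
    # Hypothetical thresholds, for illustration only.
    if reading.sensor == "egt":
        return reading.value > 900.0
    if reading.sensor == "oil_pressure":
        return reading.value < 20.0
    return False


readings = [
    Reading("egt", 910.0, 64),
    Reading("vibration", 0.2, 64),
    Reading("egt", 650.0, 64),
    Reading("oil_pressure", 12.0, 64),
]
send_now, defer = split_for_transmission(readings, engine_alert, realtime_budget_bytes=128)
print([r.sensor for r in send_now], [r.sensor for r in defer])
```

In practice the urgency rule and the budget would come from the business case, since transmission cost and available bandwidth differ widely between satellite, mobile broadband, and Wi-Fi links.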
Get ready for AWS business intelligence (BI): it's real and it packs a punch!
Today’s BI market is like a perpetual motion machine — an unstoppable engine that never seems to run out of steam. Forrester currently tracks more than 50 BI vendors, and not a month goes by without a software vendor or startup with tangential BI capabilities trying to take advantage of the craze for BI, analytics, and big data. This month is no exception: On October 7, Amazon crashed the party by announcing QuickSight, a new BI and analytics data management platform. BI pros will need to pay close attention, because this new platform is inexpensive, highly scalable, and has the potential to disrupt the BI vendor landscape. QuickSight is based on AWS’s cloud infrastructure, so it shares AWS characteristics like elasticity, abstracted complexity, and a pay-per-use consumption model. Specifically, the new QuickSight platform provides
New ways to get terabytes of data into AWS
Automatic enrichment of AWS metadata for more effective BI
An in-memory accelerator (SPICE) to speed up big data analytics
An industrial grade data analysis and visualization platform (QuickSight), including mobile clients
Tomorrow Forrester will host our Geneva-based clients for a breakfast meeting and discussion on “Powering Innovation Strategies with Insights.” My colleague, Luca Paderni, will kick off the morning with a presentation on digital disruption in the age of the customer, specifically looking at how to take a pragmatic approach to innovation with the “adjacent possible.” Then I will lead a discussion on how to build an action-oriented approach to data and analytics, exploring examples of companies that have successfully turned their data into new business opportunities – into data-derived innovation.
Thanks to Forrester’s Business Technographics, we know that business and technology leaders prioritize initiatives that secure their position in the age of the customer – to improve customer experience, address rising customer expectations, and improve their products and services (kind of all the same thing, or very closely related). It’s all about the customer. But when we ask about these priorities, the one that comes next – right after the customer-focused initiatives – is innovation: “improving our ability to innovate.” They know that the disruptions they face in the age of the customer won’t be addressed with business as usual (BAU as one of my clients referred to it yesterday; I learned a new TLA). Innovation has been elevated to an initiative, which means that executives are focused on it and likely someone is in charge of it – we’ll come back to that one.