With the incredible popularity of big data and Hadoop, every Business Intelligence (BI) vendor wants to also be known as a "BI on Hadoop" vendor. But what they really can do is limited to a) querying HDFS data organized in Hive tables using HiveQL or b) ingesting any flat file into memory and analyzing the data there. Basically, to most of the BI vendors Hadoop is just another data source. Let's now see what qualifies a BI vendor as a "Native Hadoop BI Platform". If we assume that all BI platforms have to have data extraction/integration, persistence, analytics and visualization layers, then "Native Hadoop/Spark BI Platforms" should be able to (ok, yes, I just had to add Spark):
Use Hadoop/Spark as the primary processing platform for MOST of the aforementioned functionality. The only exception is the visualization layer, which is not what Hadoop/Spark do.
Use distributed processing frameworks natively, such as
Generation of MapReduce and/or Spark jobs
Management of distributed processing framework jobs by YARN, etc
Note, generating Hive or SparkSQL queries does not qualify
Let users do declarative work in the product’s main user interface that is interpreted and executed on Hadoop/Spark directly, not via a "pass-through" mode.
Natively support Apache Sentry and Apache Ranger security
You've done all the right things by following your enterprise vendor selection methodology. You created an RFI and sent it out to all of the vendors on your "approved" list. You then filtered out the responses based on your requirements, and sent out a detailed RFP. You created a detailed scoring methodology, reviewed the proposals, listened to the in-person presentations, and filtered out everyone but the top respondents. But you still ended up with more than one. What do you do?
If you shortlisted two or more market leaders (see Forrester's latest evaluation) I would not agonize over who has better methodologies, reference architectures, training, project execution and risk management, etc. They all have top of the line capabilities in all of the above. Rather, I'd concentrate on the following specifics
The vendor who proposed more specific named individuals to the project, and you reviewed and liked their resumes, gets an edge over a vendor who only proposed general roles to be staffed at the time of the project kick off.
I joined Forrester recently as a senior forecast analyst on the ForecastView team focusing on business technology (BT) topics. What is ForecastView, you ask? It’s a Forrester product that puts the numbers around our research reports by publishing a five-year quantitative outlook. To learn how our forecasts can help you with your investment decisions, read our ForecastView overview.
Our BT forecast team takes a look at cloud, security, IoT, business intelligence, marketing ad technology, Big Data, and other hot topics in the BT space. We launched our ForecastView BT bundle in 2015. In case you missed it, our three 2015 forecasts examined eCommerce platforms, cloud security, and API management. Some highlights:
Sizing The Cloud Security Market: Companies will spend $2 billion over the next five years to protect data in the cloud. We expect the market to grow at a staggering 40%+ CAGR over the next five years.
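To get a feel for what a 40% compound annual growth rate (CAGR) means, here is a minimal Python sketch that compounds a value over five years. The starting value of 100 is an arbitrary index chosen for illustration, not a Forrester figure; only the 40% rate and the five-year horizon come from the forecast above. The function names are hypothetical.

```python
def project(start_value, cagr, years):
    """Return the value at year 0, 1, ..., years under constant compound growth."""
    return [start_value * (1 + cagr) ** year for year in range(years + 1)]


def implied_cagr(start_value, end_value, years):
    """Recover the constant annual growth rate that links two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Hypothetical index of 100 growing at 40% a year for five years.
path = project(100.0, 0.40, 5)
for year, value in enumerate(path):
    print(f"year {year}: {value:.1f}")

print(f"implied CAGR: {implied_cagr(path[0], path[-1], 5):.0%}")
```

At 40% a year, the index ends the fifth year at roughly 5.4 times its starting size, which is why a rate like this counts as staggering.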
Modern application delivery leaders realize that their primary goal is to deliver value to the business and its customers faster. Most of the modern successful change frameworks, like Agile (in its various instantiations), Lean, and Lean Startup, which inspire developers and development shops, put metrics and measurement at the center of improvement and feedback loops. The objective of controlling and governing projects to meet vaguely estimated efforts but precisely defined budgets as well as unrealistic deadlines is no longer on the agenda of leading BT organizations.
The new objective of BT organizations is to connect more linearly the work that app dev teams do and the results they produce to deliver business outcomes. In this context, application development and delivery (AD&D) leaders need a new set of metrics that help them monitor and improve the value they deliver, based on feedback from business partners and customers.
Preproduction metrics. Leading organizations capture preproduction data on activities and milestones through productivity metrics, but they place a growing emphasis on the predictability of the continuous delivery pipeline, quality, and value.
Three of four architects strive to make their firms data driven. But well-meaning technology managers only deal with part of the problem: how to use technology to glean deeper, faster insight from more data, and more cheaply. But consider that only 29% of architects say their firms are good at connecting analytics results to business outcomes. This is a huge gap! And the problem is the ‘data-driven’ mentality that never fights its way out of technology and to what firms care about: outcomes.
In 2016, customer-obsessed leaders will leapfrog their competition, and we will see a shift as firms seek to grow revenue and transform customer experiences. Insight will become a key competitive weapon, as firms move beyond big data and solve problems with data-driven thinking.
Shift #1 - Data and analytics energy will continue to drive incremental improvement
In 2016, the energy around data-driven investments will continue to elevate the importance of data and create incremental improvement in business performance. Forrester predicts:
Chief data officers will gain power, prestige and presence...for now. But the long-term viability of the role is unclear. Certain types of businesses, like digital natives, won’t benefit from appointing a CDO.
Machine learning will reduce the insight killer: time. Machine learning will replace manual data wrangling and data governance dirty work. The time this frees up will accelerate data strategies.
You can't bring up semantics without someone inserting an apology for the geekiness of the discussion. If you're a data person like me, geek away! But for everyone else, it's a topic best left alone. Well, like every geek, the semantic geeks now have their day — and may just rule the data world.
It begins with a seemingly innocent set of questions:
"Is there a better way to master my data?"
"Is there a better way to understand the data I have?"
"Is there a better way to bring data and content together?"
"Is there a better way to personalize data and insight to be relevant?"
Semantics discussions today are born out of the data chaos that our traditional data management and governance capabilities are struggling under. They're born out of the fact that even with the best big data technology and analytics being adopted, business stakeholder satisfaction with analytics has decreased by 21% from 2014 to 2015, according to Forrester's Global Business Technographics® Data And Analytics Survey, 2015. Innovative data architects and vendors realize that semantics is the key to bringing context and meaning to our information so we can extract those much-needed business insights, at scale, and more importantly, personalized.
Industry-renowned data visualization expert Edward Tufte once said: "The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to represent the rich visual world of experience and measurement on mere flatland?" He's right: There's too much information out there for knowledge workers to effectively analyze — be they hands-on analysts, data scientists, or senior execs. More often than not, traditional tabular reports fail to paint the whole picture or, even worse, lead you to the wrong conclusion. AD&D pros should be aware that data visualization can help for a variety of reasons:
Visual information is more powerful than any other type of sensory input. Dr. John Medina asserts that vision trumps all other senses when it comes to processing information; we are incredible at remembering pictures. Pictures are also more efficient than text alone because our brain considers each word to be a very small picture and thus takes more time to process text. When we hear a piece of information, we remember 10% of it three days later; if we add a picture, we remember 65% of it. There are multiple explanations for these phenomena, including the fact that 80% to 90% of information received by the brain comes through the eyes, and about half of your brain function is dedicated directly or indirectly to processing vision.
We can't see patterns in numbers alone . . . Simply seeing numbers on a grid doesn't always give us the whole story — and it can even lead us to draw the wrong conclusion. Anscombe's quartet demonstrates this effectively; four groups of seemingly similar x/y coordinates reveal very different patterns when represented in a graph.
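To make the Anscombe's quartet point concrete, here is a minimal Python sketch that computes the usual summary statistics for the four data sets. The data values are the ones published in Anscombe's 1973 paper; the helper names (`pearson_r`, `summary`) are illustrative and not part of the original article.

```python
from math import sqrt
from statistics import mean, variance

# Data sets I-III share the same x values; data set IV has its own.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from first principles."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summary(xs, ys):
    """Summary statistics that a tabular report would typically show."""
    return {
        "mean_x": mean(xs),
        "var_x": variance(xs),  # sample variance
        "mean_y": mean(ys),
        "r": pearson_r(xs, ys),
    }


SUMMARIES = {name: summary(xs, ys) for name, (xs, ys) in QUARTET.items()}
for name, s in SUMMARIES.items():
    print(f"{name:>3}: mean_x={s['mean_x']:.2f} var_x={s['var_x']:.2f} "
          f"mean_y={s['mean_y']:.2f} r={s['r']:.3f}")
```

All four rows come out nearly identical (mean of x is 9, mean of y is about 7.50, correlation is about 0.816), yet a scatter plot of each data set shows a line, a curve, a line with one outlier, and a vertical cluster with one outlier. The numbers alone give no hint of that.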
Get ready for AWS business intelligence (BI): it's real and it packs a punch!
Today’s BI market is like a perpetual motion machine — an unstoppable engine that never seems to run out of steam. Forrester currently tracks more than 50 BI vendors, and not a month goes by without a software vendor or startup with tangential BI capabilities trying to take advantage of the craze for BI, analytics, and big data. This month is no exception: On October 7, Amazon crashed the party by announcing QuickSight, a new BI and analytics data management platform. BI pros will need to pay close attention, because this new platform is inexpensive, highly scalable, and has the potential to disrupt the BI vendor landscape. QuickSight is based on AWS’s cloud infrastructure, so it shares AWS characteristics like elasticity, abstracted complexity, and a pay-per-use consumption model. Specifically, the new QuickSight platform provides
New ways to get terabytes of data into AWS
Automatic enrichment of AWS metadata for more effective BI
An in-memory accelerator (SPICE) to speed up big data analytics
An industrial grade data analysis and visualization platform (QuickSight), including mobile clients
Consumers (and B2B customers) are more and more empowered with mobile devices and cloud-based, all but unlimited access to information about products, services, and prices. Customer stickiness is increasingly difficult to achieve as they demand instant gratification for their ever changing tastes and requirements. Switching product and service providers is now just a matter of clicking a few keys on a mobile phone. Forrester calls this the age of the customer, which elevates business and technology priorities to achieve:
Business agility. Business agility often equals the ability to adopt, react, and succeed in the midst of an unending fountain of customer-driven requirements. Agile organizations make decisions differently by embracing a new, more grass-roots-based management approach. Employees down in the trenches, in individual business units, are the ones who are in close touch with customer problems, market shifts, and process inefficiencies. These workers are often in the best position to understand challenges and opportunities and to make decisions to improve the business. It is only when responses to change come from these highly aware and empowered employees that enterprises become agile, competitive, and successful.
Ah, the good old days. The world used to be simple. ETL vendors provided data integration functionality, DBMS vendors provided data warehouse platforms, and BI vendors concentrated on reporting, analysis and data visualization. And they all lived happily ever after, benefiting from lucrative partnerships without stepping on each other's toes. Alas, the modern world of BI and data integration is infinitely more complex, with multiple, often overlapping offerings from data integration and BI vendors. I see the following three major segments in the market of preparing data for BI:
Fully functional and highly scalable ETL platforms that are used for integrating analytical data as well as moving, synchronizing and replicating operational, transactional data. This is still the realm of tech professionals who use ETL products from Informatica, Ab Initio, IBM, Oracle, Microsoft and others.
An emerging market of data preparation technologies that specialize mostly in integrating data for BI use cases and are mostly run by business users. Notable vendors in the space include Alteryx, Paxata, Trifacta, Datawatch, Birst, and a few others.
Data preparation features built right into BI platforms. Most leading BI vendors today provide such capabilities to a varying degree.