With the incredible popularity of big data and Hadoop, every Business Intelligence (BI) vendor wants to also be known as a "BI on Hadoop" vendor. But what they really can do is limited to a) querying HDFS data organized in Hive tables using HiveQL or b) ingesting any flat file into memory and analyzing the data there. Basically, to most BI vendors Hadoop is just another data source. Let's now see what qualifies a BI vendor as a "Native Hadoop BI Platform". If we assume that all BI platforms have to have data extraction/integration, persistence, analytics and visualization layers, then "Native Hadoop/Spark BI Platforms" should be able to (ok, yes, I just had to add Spark):
Use Hadoop/Spark as the primary processing platform for MOST of the aforementioned functionality. The only exception is the visualization layer, which is not what Hadoop/Spark do.
Use distributed processing frameworks natively, such as
Generation of MapReduce and/or Spark jobs
Management of distributed processing framework jobs by YARN, etc.
Note, generating Hive or SparkSQL queries does not qualify
Do declarative work in the product’s main user interface that is interpreted and executed directly on Hadoop/Spark, not via a "pass through" mode.
Natively support Apache Sentry and Apache Ranger security
I am kicking off a research stream which will result in the "Text Analytics Roles & Responsibilities" doc. Before I finalize an RFI to our clients to see whom they employ for these projects and applications (and how, when, and where), I'd like to explore what the actual roles and responsibilities are. So far we've come up with the following roles and their respective responsibilities:
Business owner. The ultimate recipient of text analytics process results. So far I have
Customer intelligence analyst
Customer service/call center analyst
Competitive intelligence analyst
Product R&D analyst
Linguist/Data Scientist. Builds language and statistical rules for text mining (or modifies these from an off-the-shelf product). Works with business owners to
Create "golden copies" of documents/content which will be used as base for text analytics
Works with data stewards and business owners to define corporate taxonomies and lexicon
Data Steward. Owns corporate lexicon and taxonomies
Architect. Owns big data strategy and architecture (including data hubs, data warehouses, BI, etc.) where unstructured data is one of the components
Developer/integrator. Develops custom-built text analytics apps or embeds text analytics functionality into other applications (ERP, CRM, BI, etc.)
You've done all the right things by following your enterprise vendor selection methodology. You created an RFI and sent it out to all of the vendors on your "approved" list. You then filtered out the responses based on your requirements, and sent out a detailed RFP. You created a detailed scoring methodology, reviewed the proposals, listened to the in-person presentations, and filtered out everyone but the top respondents. But you still ended up with more than one. What do you do?
If you shortlisted two or more market leaders (see Forrester's latest evaluation) I would not agonize over who has better methodologies, reference architectures, training, project execution and risk management, etc. They all have top of the line capabilities in all of the above. Rather, I'd concentrate on the following specifics
A vendor who proposed specific named individuals for the project, whose resumes you reviewed and liked, gets an edge over a vendor who only proposed general roles to be staffed at the time of the project kickoff.
I joined Forrester recently as a senior forecast analyst on the ForecastView team focusing on business technology (BT) topics. What is ForecastView you ask? It’s a Forrester product that puts the numbers around our research reports by publishing a five-year quantitative outlook. To learn how our forecasts can help you with your investment decisions, read our ForecastView overview.
Our BT forecast team takes a look at cloud, security, IoT, business intelligence, marketing ad technology, Big Data, and other hot topics in the BT space. We launched our ForecastView BT bundle in 2015. In case you missed it, our three 2015 forecasts examined eCommerce platforms, cloud security, and API management. Some highlights:
Sizing The Cloud Security Market: Companies will spend $2 billion over the next five years to protect data in the cloud. We expect the market to grow at a staggering 40%+ CAGR over the next five years.
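For readers who want to see what a growth rate like that compounds to, here is a small Python sketch of the CAGR arithmetic. The function names and the starting value of 100 are hypothetical and only for illustration; nothing here comes from the forecast itself apart from the 40% rate and the five-year horizon.

```python
def growth_multiple(cagr: float, years: int) -> float:
    """Return how many times larger a value becomes after compounding
    at the given annual growth rate for the given number of years."""
    return (1.0 + cagr) ** years


def implied_cagr(start: float, end: float, years: int) -> float:
    """Return the compound annual growth rate that takes start to end
    over the given number of years."""
    return (end / start) ** (1.0 / years) - 1.0


if __name__ == "__main__":
    start = 100.0  # hypothetical starting market size, in arbitrary units
    multiple = growth_multiple(0.40, 5)
    end = start * multiple
    print(f"Growth multiple after 5 years at 40%: {multiple:.2f}x")
    print(f"Implied CAGR from {start:.0f} to {end:.0f}: {implied_cagr(start, end, 5):.0%}")
```

At exactly 40% a year, a market ends the fifth year at roughly 5.4 times its starting size, which is why a "40%+" CAGR is worth calling staggering.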
Industry-renowned data visualization expert Edward Tufte once said: "The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to represent the rich visual world of experience and measurement on mere flatland?" He's right: There's too much information out there for knowledge workers to effectively analyze — be they hands-on analysts, data scientists, or senior execs. More often than not, traditional tabular reports fail to paint the whole picture or, even worse, lead you to the wrong conclusion. AD&D pros should be aware that data visualization can help for a variety of reasons:
Visual information is more powerful than any other type of sensory input. Dr. John Medina asserts that vision trumps all other senses when it comes to processing information; we are incredible at remembering pictures. Pictures are also more efficient than text alone because our brain considers each word to be a very small picture and thus takes more time to process text. When we hear a piece of information, we remember 10% of it three days later; if we add a picture, we remember 65% of it. There are multiple explanations for these phenomena, including the fact that 80% to 90% of information received by the brain comes through the eyes, and about half of your brain function is dedicated directly or indirectly to processing vision.
We can't see patterns in numbers alone . . . Simply seeing numbers on a grid doesn't always give us the whole story — and it can even lead us to draw the wrong conclusion. Anscombe's quartet demonstrates this effectively; four groups of seemingly similar x/y coordinates reveal very different patterns when represented in a graph.
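To make the Anscombe's quartet point concrete, here is a minimal Python sketch. The data values are the commonly published quartet, typed in by hand here, so treat them as an assumption and check them against Anscombe's original paper before relying on them. The helper names (pearson, summarize) are ours, for illustration only. The sketch computes the mean of x, the mean of y, and the Pearson correlation for each of the four groups.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet. Values transcribed by hand (assumption: verify
# against the original paper). Groups I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Summary statistics a tabular report might show for one group."""
    return {"mean_x": mean(xs), "mean_y": mean(ys), "corr": pearson(xs, ys)}


if __name__ == "__main__":
    for name, (xs, ys) in QUARTET.items():
        s = summarize(xs, ys)
        print(f"{name}: mean_x={s['mean_x']:.2f} "
              f"mean_y={s['mean_y']:.2f} corr={s['corr']:.3f}")
```

The four summaries come out nearly the same, which is exactly what a grid of numbers would show you. Only plotting the points reveals that one group is roughly linear, one is curved, and two are dominated by a single outlier.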
The hordes gathered in Las Vegas this week, for Amazon's latest re:Invent show. Over 18,000 individuals queued to get into sessions, jostled to reach the Oreo Cookie Popcorn (yes, really), and dodged casino-goers to hear from AWS, its partners and its customers. Las Vegas may figure nowhere on my list of favourite places, but the programme of Analyst sessions AWS laid on for earlier in the week definitely justified this trip.
The headline items (the Internet of Things, Business Intelligence, and a Snowball chucked straight at the 'hell' that is the enterprise data centre (think about it)) are much-discussed, but in many ways the more interesting stuff was AWS' continued - quiet, methodical, inexorable - improvement of its current offerings. One by one, enterprise 'reasons' to avoid AWS or its public cloud competitors are being systematically demolished.
Get ready for AWS business intelligence (BI): it's real and it packs a punch!
Today’s BI market is like a perpetual motion machine — an unstoppable engine that never seems to run out of steam. Forrester currently tracks more than 50 BI vendors, and not a month goes by without a software vendor or startup with tangential BI capabilities trying to take advantage of the craze for BI, analytics, and big data. This month is no exception: On October 7, Amazon crashed the party by announcing QuickSight, a new BI and analytics data management platform. BI pros will need to pay close attention, because this new platform is inexpensive, highly scalable, and has the potential to disrupt the BI vendor landscape. QuickSight is based on AWS’s cloud infrastructure, so it shares AWS characteristics like elasticity, abstracted complexity, and a pay-per-use consumption model. Specifically, the new QuickSight platform provides
New ways to get terabytes of data into AWS
Automatic enrichment of AWS metadata for more effective BI
An in-memory accelerator (SPICE) to speed up big data analytics
An industrial grade data analysis and visualization platform (QuickSight), including mobile clients
Consumers (and B2B customers) are more and more empowered with mobile devices and cloud-based, all but unlimited access to information about products, services, and prices. Customer stickiness is increasingly difficult to achieve as customers demand instant gratification for their ever-changing tastes and requirements. Switching product and service providers is now just a matter of clicking a few keys on a mobile phone. Forrester calls this the age of the customer, which elevates business and technology priorities to achieve:
Business agility. Business agility often equals the ability to adapt, react, and succeed in the midst of an unending fountain of customer-driven requirements. Agile organizations make decisions differently by embracing a new, more grass-roots-based management approach. Employees down in the trenches, in individual business units, are the ones who are in close touch with customer problems, market shifts, and process inefficiencies. These workers are often in the best position to understand challenges and opportunities and to make decisions to improve the business. It is only when responses to change come from these highly aware and empowered employees that enterprises become agile, competitive, and successful.
Ah, the good old days. The world used to be simple. ETL vendors provided data integration functionality, DBMS vendors provided data warehouse platforms, and BI vendors concentrated on reporting, analysis and data visualization. And they all lived happily ever after, benefiting from lucrative partnerships without stepping on each other’s toes. Alas, the modern world of BI and data integration is infinitely more complex, with multiple, often overlapping offerings from data integration and BI vendors. I see the following three major segments in the market of preparing data for BI:
Fully functional and highly scalable ETL platforms that are used for integrating analytical data as well as moving, synchronizing and replicating operational, transactional data. This is still the realm of tech professionals who use ETL products from Informatica, Ab Initio, IBM, Oracle, Microsoft and others.
An emerging market of data preparation technologies that specialize mostly in integrating data for BI use cases and are mostly run by business users. Notable vendors in the space include Alteryx, Paxata, Trifacta, Datawatch, Birst, and a few others.
Data preparation features built right into BI platforms. Most leading BI vendors today provide such capabilities to a varying degree.
In the past three decades, management information systems, data integration, data warehouses (DWs), BI, and other relevant technologies and processes only scratched the surface of turning data into useful information and actionable insights:
Organizations leverage less than half of their structured data for insights. The latest Forrester data and analytics survey finds that organizations use on average only 40% of their structured data for strategic decision-making.
Unstructured data remains largely untapped. Organizations are even less mature in their use of unstructured data. They tap only about a third of their unstructured data sources (28% of semistructured and 31% of unstructured) for strategic decision-making. And these percentages don’t include more recent components of a 360-degree view of the customer, such as voice of the customer (VoC), social media, and the Internet of Things.
BI architectures continue to become more complex. The intricacies of earlier-generation and many current business intelligence (BI) architectural stacks, which usually require the integration of dozens of components from different vendors, are just one reason it takes so long and costs so much to deliver a single version of the truth with a seamlessly integrated, centralized enterprise BI environment.
Existing BI architectures are not flexible enough. Most organizations take too long to get to the ultimate goal of a centralized BI environment, and by the time they think they are done, there are new data sources, new regulations, and new customer needs, which all require more changes to the BI environment.