What Qualifies A BI Vendor As A Native Hadoop BI Platform?

With the incredible popularity of big data and Hadoop, every Business Intelligence (BI) vendor also wants to be known as a "BI on Hadoop" vendor. But what they can really do is limited to a) querying HDFS data organized in Hive tables using HiveQL or b) ingesting any flat file into memory and analyzing the data there. Basically, to most BI vendors Hadoop is just another data source. Let's now see what qualifies a BI vendor as a "Native Hadoop BI Platform". If we assume that all BI platforms have to have data extraction/integration, persistence, analytics, and visualization layers, then "Native Hadoop/Spark BI Platforms" should be able to (ok, yes, I just had to add Spark):

 

  • Use Hadoop/Spark as the primary processing platform for MOST of the aforementioned functionality. The only exception is the visualization layer, which is not what Hadoop/Spark does.
  • Use distributed processing frameworks natively, such as
    • Generation of MapReduce and/or Spark jobs
    • Management of distributed processing framework jobs by YARN, etc.
    • Note: generating Hive or SparkSQL queries does not qualify
  • Let users do declarative work in the product’s main user interface that is interpreted and executed directly on Hadoop/Spark, not via a "pass-through" mode.
  • Natively support Apache Sentry and Apache Ranger security
 
Did I miss anything?

What are the typical Text Analytics jobs and responsibilities?

Hi

I am kicking off a research stream which will result in the "Text Analytics Roles & Responsibilities" doc. Before I finalize an RFI to our clients to find out whom they employ for these projects and applications (and how, when, and where), I'd like to explore what the actual roles and responsibilities are. So far we've come up with the following roles and their respective responsibilities:
  • Business owner. The ultimate recipient of text analytics process results. So far I have:
    • Brand manager
    • Customer intelligence analyst
    • Customer service/call center analyst
    • Risk manager
    • Competitive intelligence analyst
    • Product R&D analyst
    • Anyone else?
  • Linguist/Data Scientist. Builds language and statistical rules for text mining (or modifies these from an off-the-shelf product). Works with business owners to
    • Create "golden copies" of documents/content which will be used as the base for text analytics
    • Define corporate taxonomies and lexicons (together with data stewards)
  • Data Steward. Owns corporate lexicon and taxonomies
  • Architect. Owns big data strategy and architecture (including data hubs, data warehouses, BI, etc.) where unstructured data is one of the components
  • Developer/integrator. Develops custom-built text analytics apps or embeds text analytics functionality into other applications (ERP, CRM, BI, etc.)
  • Others?

Selecting Professional Service Provider For Your Business Intelligence/Information Management/Analytics/Big Data Projects

You've done all the right things by following your enterprise vendor selection methodology. You created an RFI and sent it out to all of the vendors on your "approved" list. You then filtered out the responses based on your requirements, and sent out a detailed RFP. You created a detailed scoring methodology, reviewed the proposals, listened to the in-person presentations, and filtered out everyone but the top respondents. But you still ended up with more than one. What do you do?

If you shortlisted two or more market leaders (see Forrester's latest evaluation), I would not agonize over who has better methodologies, reference architectures, training, project execution and risk management, etc. They all have top-of-the-line capabilities in all of the above. Rather, I'd concentrate on the following specifics:
 
People
  • The vendor who proposed specific named individuals for the project, whose resumes you reviewed and liked, gets an edge over a vendor who only proposed general roles to be staffed at project kickoff.

Build More Effective Data Visualizations

Industry-renowned data visualization expert Edward Tufte once said: "The world is complex, dynamic, multidimensional; the paper is static, flat. How are we to represent the rich visual world of experience and measurement on mere flatland?" He's right: There's too much information out there for knowledge workers to effectively analyze — be they hands-on analysts, data scientists, or senior execs. More often than not, traditional tabular reports fail to paint the whole picture or, even worse, lead you to the wrong conclusion. AD&D pros should be aware that data visualization can help for a variety of reasons:
  • Visual information is more powerful than any other type of sensory input. Dr. John Medina asserts that vision trumps all other senses when it comes to processing information; we are incredible at remembering pictures. Pictures are also more efficient than text alone because our brain considers each word to be a very small picture and thus takes more time to process text. When we hear a piece of information, we remember 10% of it three days later; if we add a picture, we remember 65% of it. There are multiple explanations for these phenomena, including the fact that 80% to 90% of information received by the brain comes through the eyes, and about half of your brain function is dedicated directly or indirectly to processing vision.
  • We can't see patterns in numbers alone . . . Simply seeing numbers on a grid doesn't always give us the whole story — and it can even lead us to draw the wrong conclusion. Anscombe's quartet demonstrates this effectively; four groups of seemingly similar x/y coordinates reveal very different patterns when represented in a graph.
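
To make the Anscombe's quartet point concrete, here is a minimal Python sketch that computes a few summary statistics for the four groups. The data values are the standard published quartet; the names `QUARTET`, `pearson`, `summarize`, and `SUMMARIES` are illustrative only and not part of any product or library.

```python
from math import sqrt
from statistics import mean

# Anscombe's quartet: groups I-III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand to avoid version-specific helpers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarize(xs, ys):
    """Return the kind of numbers a tabular report would typically show."""
    return {
        "mean_x": round(mean(xs), 2),
        "mean_y": round(mean(ys), 2),
        "corr": round(pearson(xs, ys), 2),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in QUARTET.items()}

if __name__ == "__main__":
    for name, summary in SUMMARIES.items():
        print(name, summary)
```

Each group comes out with an x mean of 9, a y mean of about 7.5, and a correlation of about 0.82, so a grid of summary numbers cannot tell them apart. Only plotting the points shows that one group is roughly linear, one is curved, and two are dominated by a single outlier.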

Forrester Quick Take: AWS QuickSight Will Disrupt Business Intelligence And Analytics Markets

Get ready for AWS business intelligence (BI): it's real and it packs a punch!

Today’s BI market is like a perpetual motion machine: an unstoppable engine that never seems to run out of steam. Forrester currently tracks more than 50 BI vendors, and not a month goes by without a software vendor or startup with tangential BI capabilities trying to take advantage of the craze for BI, analytics, and big data. This month is no exception: On October 7, Amazon crashed the party by announcing QuickSight, a new BI and analytics data management platform. BI pros will need to pay close attention, because this new platform is inexpensive, highly scalable, and has the potential to disrupt the BI vendor landscape. QuickSight is based on AWS’s cloud infrastructure, so it shares AWS characteristics like elasticity, abstracted complexity, and a pay-per-use consumption model. Specifically, the new QuickSight platform provides:

  • New ways to get terabytes of data into AWS
  • Automatic enrichment of AWS metadata for more effective BI
  • An in-memory accelerator (SPICE) to speed up big data analytics
  • An industrial-grade data analysis and visualization platform (QuickSight), including mobile clients
  • Open APIs

The Forrester Wave™: Agile Business Intelligence Platforms, Q3 2015

Consumers (and B2B customers) are more and more empowered with mobile devices and cloud-based, all but unlimited access to information about products, services, and prices. Customer stickiness is increasingly difficult to achieve as customers demand instant gratification for their ever-changing tastes and requirements. Switching product and service providers is now just a matter of a few taps on a mobile phone. Forrester calls this the age of the customer, which elevates business and technology priorities to achieve:

  • Business agility. Business agility often equals the ability to adapt, react, and succeed in the midst of an unending fountain of customer-driven requirements. Agile organizations make decisions differently by embracing a new, more grassroots management approach. Employees down in the trenches, in individual business units, are the ones who are in close touch with customer problems, market shifts, and process inefficiencies. These workers are often in the best position to understand challenges and opportunities and to make decisions to improve the business. It is only when responses to change come from these highly aware and empowered employees that enterprises become agile, competitive, and successful.


BI and data integration professionals face a multitude of overlapping data preparation options

Ah, the good old days. The world used to be simple. ETL vendors provided data integration functionality, DBMS vendors provided data warehouse platforms, and BI vendors concentrated on reporting, analysis, and data visualization. And they all lived happily ever after, without stepping on each other's toes and while benefiting from lucrative partnerships. Alas, the modern world of BI and data integration is infinitely more complex, with multiple, often overlapping offerings from data integration and BI vendors. I see the following three major segments in the market of preparing data for BI:

  1. Fully functional and highly scalable ETL platforms that are used for integrating analytical data as well as moving, synchronizing, and replicating operational, transactional data. This is still the realm of tech professionals who use ETL products from Informatica, Ab Initio, IBM, Oracle, Microsoft, and others.
  2. An emerging market of data preparation technologies that specialize mostly in integrating data for BI use cases and are mostly run by business users. Notable vendors in the space include Alteryx, Paxata, Trifacta, Datawatch, Birst, and a few others.
  3. Data preparation features built right into BI platforms. Most leading BI vendors today provide such capabilities to a varying degree.

Make Your BI Environment More Agile With BI on Hadoop

In the past three decades, management information systems, data integration, data warehouses (DWs), BI, and other relevant technologies and processes only scratched the surface of turning data into useful information and actionable insights:
  • Organizations leverage less than half of their structured data for insights. The latest Forrester data and analytics survey finds that organizations use on average only 40% of their structured data for strategic decision-making. 
  • Unstructured data remains largely untapped. Organizations are even less mature in their use of unstructured data. They tap only about a third of their unstructured data sources (28% of semistructured and 31% of unstructured) for strategic decision-making. And these percentages don’t include more recent components of a 360-degree view of the customer, such as voice of the customer (VoC), social media, and the Internet of Things. 
  • BI architectures continue to become more complex. The intricacies of earlier-generation and many current business intelligence (BI) architectural stacks, which usually require the integration of dozens of components from different vendors, are just one reason it takes so long and costs so much to deliver a single version of the truth with a seamlessly integrated, centralized enterprise BI environment.
  • Existing BI architectures are not flexible enough. Most organizations take too long to get to the ultimate goal of a centralized BI environment, and by the time they think they are done, there are new data sources, new regulations, and new customer needs, which all require more changes to the BI environment.

Don't Throw Hadoop At Every BI Challenge

The explosion of data and fast-changing customer needs have led many companies to a realization: They must constantly improve their capabilities, competencies, and culture in order to turn data into business value. But how do Business Intelligence (BI) professionals know whether they must modernize their platforms or whether their main challenges are mostly about culture, people, and processes?

"Our BI environment is only used for reporting — we need big data for analytics."

"Our data warehouse takes very long to build and update — we were told we can replace it with Hadoop."

These are just some of the conversations that Forrester clients initiate, believing they require a big data solution. But after a few probing questions, companies realize that they may instead need to upgrade their outdated BI platform, switch to a different database architecture, add extra nodes to their data warehouse (DW) servers, improve their data quality and data governance processes, or apply other commonsense solutions to their challenges. New big data technologies may be one of the options, but not the only one, and sometimes not the best. Rather than incorrectly assuming that big data is the panacea for all issues associated with poorly architected and deployed BI environments, BI pros should follow the guidelines in the recent Forrester report to decide whether their BI environment needs a healthy dose of upgrades and process improvements or whether it requires different big data technologies. Here are some of the findings and recommendations from the full research report:

1) Hadoop won't solve your cultural challenges


Hit the road running with a new BI initiative

Even though Business Intelligence applications have been out there for decades, lots of people still struggle with “how do I get started with BI?” I constantly deal with clients who mistakenly start their BI journey by selecting a BI platform or by not thinking about the data architecture. I know it’s a HUGE oversimplification, but in a nutshell here’s a simple roadmap (for a more complete roadmap please see the Roadmap document in the Forrester BI Playbook) that will ensure that your BI strategy is aligned with your business strategy and that you will hit the road running. The best way to start, IMHO, is from the performance management point of view:

  1. Catalog your organization's business units and departments
  2. For each business unit/department, ask questions about their business strategy and objectives
  3. Then ask what goals they set for themselves in order to achieve the objectives
  4. Next ask what metrics and indicators they use to track where they are against their goals and objectives. Good rule of thumb: no business area or department needs to track more than 20 to 30 metrics. More than that is unmanageable.
  5. Then ask how they would like to slice and dice these metrics (by time period, by region, by business unit, by customer segment, etc.)