Instrumenting Your Enterprise for Maximum Predictive Power

By James G. Kobielus

Business is all about placing bets and knowing if the odds are in your favor.

As I noted in my most recent Forrester report, business success depends on your company being able to visualize likely futures and take appropriate actions as soon as possible. You must be able to predict future scenarios well enough to prepare plans and deploy resources so that you can seize opportunities, neutralize threats, and mitigate risks.

Clearly, predictive analytics can play a pivotal role in the day-to-day operation of your business. It can help you focus strategy and continually tweak plans based on actual performance and likely future scenarios. And, as I noted in a recent Forrester blog post, the technology can sit at the core of your service-oriented architecture (SOA) strategy as you embed predictive logic deeply into data warehouses, business process management platforms, complex event processing streams, and operational applications.

The grand promise of predictive analytics—still largely unrealized in most companies—is that it will become ubiquitous, guiding all decisions, transactions, and applications. For the technology to rise to that challenge, organizations must move toward a comprehensive advanced analytics strategy that integrates data mining, content analytics, and in-database analytics. Already, we’ve sketched out a vision of “Service-Oriented Analytics,” under which you break down silos among data mining and content analytics initiatives and leverage these pooled resources across all business processes.

You may agree that this is the right vision but have doubts about whether there is a practical, incremental roadmap for taking your company in that direction. In fact, there is, and it starts with re-assessing the core of most companies’ predictive analytics capability: your data mining tools. As you plan your predictive analytics initiatives, you should avoid the traditional approach of focusing on tactical, bottom-up, project-specific requirements. You should also try not to shoehorn your requirements into the limited feature set of whatever modeling tool you currently happen to use.

To become a fully predictive enterprise, you will need to take both a top-down and a bottom-up approach to your data mining initiatives. From the top down, it’s all about building and integrating alternate models of how your business environment is likely to evolve internally and externally. In our recent report on advanced analytics, Boris Evelson, Leslie Owens, and I sketched out the many business processes that can be enriched by predictive analytics.

So how do you instrument your company to become more predictive? For starters, assess whether your analytics tools support the following capabilities for developing, validating, and deploying predictive models:

  • Model multiple business scenarios: You should be able to build complex models of multiple, linked business scenarios across different business, process, and subject-area domains, using such key features as strategy maps, ensemble modeling, and champion-challenger modeling.
  • Incorporate multiple information types into models: You should be able to develop models against multiple information types, including unstructured content and real-time event streams, while leveraging state-of-the-art algorithms for sentiment analysis and social network analysis.
  • Leverage multiple statistical algorithms and approaches in models: You should be able to develop models using the widest, most sophisticated range of statistical and mathematical algorithms and approaches, including regression, constraint-based optimization, neural networks, genetic algorithms, and support vector machines.
  • Apply multiple metrics of model quality and fitness: You should be able to score and validate model quality using multiple metrics and approaches, including quality scores, lift charts, goodness-of-fit charts, comparative model evaluation, and auto best-model selection.
  • Employ multiple variable discovery and assessment approaches: You should be able to build and validate models using various approaches for variable discovery, profiling, and selection, including decision trees, feature selection, clustering, association rules, affinity analysis, and outlier analysis.
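Two of the capabilities above, lift-based model quality scoring and champion-challenger modeling, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the function names `lift_at_top` and `pick_winner`, the 20% cutoff, the 5% promotion threshold, and the tiny hand-made data set are hypothetical and do not come from any particular data mining tool.

```python
from typing import Sequence


def lift_at_top(scores: Sequence[float], outcomes: Sequence[int],
                fraction: float = 0.1) -> float:
    """Lift = response rate in the top-scored fraction / overall response rate."""
    if len(scores) != len(outcomes) or len(scores) == 0:
        raise ValueError("scores and outcomes must be non-empty and equal in length")
    overall_rate = sum(outcomes) / len(outcomes)
    if overall_rate == 0:
        raise ValueError("no positive outcomes, so lift is undefined")
    # Rank records from the highest score to the lowest.
    ranked = sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(outcome for _, outcome in ranked[:top_n]) / top_n
    return top_rate / overall_rate


def pick_winner(champion_lift: float, challenger_lift: float,
                min_gain: float = 0.05) -> str:
    """Keep the champion unless the challenger's lift is better by min_gain (relative)."""
    if challenger_lift >= champion_lift * (1 + min_gain):
        return "challenger"
    return "champion"


# Hypothetical holdout data: 1 = customer responded, 0 = did not respond.
outcomes = [1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
champion_scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.1, 0.7, 0.6, 0.5, 0.05]
challenger_scores = [0.95, 0.2, 0.9, 0.1, 0.3, 0.15, 0.85, 0.4, 0.25, 0.05]

champion_lift = lift_at_top(champion_scores, outcomes, fraction=0.2)
challenger_lift = lift_at_top(challenger_scores, outcomes, fraction=0.2)
winner = pick_winner(champion_lift, challenger_lift)
print(winner)
```

In this toy data the challenger ranks two actual responders at the top, so it would be promoted. A real evaluation would use a much larger holdout set and more than one quality metric.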

How is this different from predictive analytics as usual? Traditionally, most predictive modeling specialists focus on the latter three capabilities: statistical algorithms and approaches, model quality and fitness, and variable discovery and assessment. Most models are built in narrowly scoped business or subject domains (such as customer analytics for marketing campaign management) and only against structured data sources (such as relational tables). Few predictive analytics projects have entailed modeling of multiple business scenarios across diverse domains, such as sales, marketing, customer service, manufacturing, and supply chain, though in the real world these business processes are often quite interconnected. Also, many data mining initiatives fail to incorporate information from unstructured sources, such as text in call-center logs, though this content may be as important as what comes from relational databases and other structured sources.

It’s very important to build multi-scenario predictive models against complex information sets, but becoming a fully predictive enterprise demands much more. To instrument your organization for maximum predictive power, you should also tool your advanced analytics to support the following capabilities:

· DW-integrated data preparation: To speed up and standardize the most time-consuming predictive modeling project tasks, you should be able to leverage your existing data warehouse, extract-transform-load (ETL), data quality, and metadata tools to support a full range of data preparation features. These features include the ability to discover, acquire, capture, profile, sample, collect, collate, aggregate, deduplicate, transform, correct, augment, and load analytical data sets.

· Deep application and middleware integration: To deliver models deeply into whatever heterogeneous SOA-enabled platform you happen to use, your predictive analytics tool should deploy on and/or integrate with a wide range of enterprise applications, middleware, operating platforms, and hardware substrates. You should be able to deploy models seamlessly into your data warehouse, business intelligence, online analytical processing, data integration, complex event processing, data quality, master data management, and business process management environments. And to play well in your SOA, your predictive modeling tools should support application programming interfaces, languages, tools, and approaches such as Web services, Java, C++, and Visual Studio, as well as emerging languages such as SQL-MapReduce and R.

· Consistent cross-domain model governance: To avoid fostering an unmanageable glut of models, your predictive analytics solution should provide a wide range of tools, features, and interfaces for life-cycle governance of models created in diverse tools. At the very least, your tools should enable model check-in/check-out, change tracking, version control, and collaborative development and validation of models. To govern models across domains, they should also support a full range of tools, standards, and interfaces for import and embedding of models from other tools, as well as export and sharing of models to other environments.

· Flexible model deployment: To execute modeling functions (such as data preparation, regression, and scoring) on the widest range of data warehouses and other platforms, your tools should support in-database or embedded analytics. And to scale to the max, your predictive analytics tools should deploy models to massively parallel data warehouses, software-as-a-service environments, and cloud computing fabrics. Your advanced analytics tools should also support development of application logic in open frameworks, such as MapReduce and Hadoop, to enable convergence of data mining and content analytics in the cloud.

· Rich interactive visualization: To deliver their precious payload, actionable intelligence, your advanced analytics tools should support interactive visualization of models, data, and results. Ideally, you should be able to visualize all of this in your preferred business intelligence tool, or in the predictive modeling vendor’s integrated visualization layer. Of course, you have every right to expect the full range of visualization techniques (histograms, box plots, heat maps, etc.) regardless of who provides the visualization layer.
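The in-database scoring mentioned under flexible model deployment can be sketched as follows: rather than extracting rows to score them elsewhere, the model is translated into SQL and evaluated where the data lives. This is only a minimal sketch under stated assumptions. It uses Python's built-in `sqlite3` module as a stand-in for a data warehouse, and the `customers` table, its columns, the linear model coefficients, and the helper `scoring_sql` are all hypothetical.

```python
import sqlite3


def scoring_sql(table, key_column, intercept, coefficients):
    """Build a SELECT that computes a linear model score inside the database."""
    if not coefficients:
        raise ValueError("at least one coefficient is required")
    # Reject anything that is not a plain identifier before interpolating it.
    for name in (table, key_column, *coefficients):
        if not name.isidentifier():
            raise ValueError(f"unsafe SQL identifier: {name!r}")
    terms = " + ".join(
        f"({float(weight)!r} * {column})" for column, weight in coefficients.items()
    )
    return (
        f"SELECT {key_column}, {float(intercept)!r} + {terms} AS score "
        f"FROM {table} ORDER BY {key_column}"
    )


# Hypothetical coefficients from a previously fitted linear model.
sql = scoring_sql(
    "customers", "customer_id", 1.0,
    {"monthly_spend": 0.5, "support_calls": -2.0},
)

conn = sqlite3.connect(":memory:")  # stand-in for a real data warehouse
conn.execute(
    "CREATE TABLE customers ("
    "customer_id INTEGER PRIMARY KEY, monthly_spend REAL, support_calls INTEGER)"
)
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, 10.0, 1), (2, 4.0, 3)],
)
scores = conn.execute(sql).fetchall()  # scoring happens inside the database engine
conn.close()
print(scores)
```

The design choice illustrated here is that only the model moves, not the data. A production tool would typically generate such scoring logic from a standard model format rather than from a hand-written dictionary.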

As you can see, this goes well beyond data mining as usual. Forrester has a slightly different perspective on the development of the predictive analytics market than you’re likely to get from other sources. We see robust, flexible, SOA-enabled data mining tools as the centerpiece of advanced analytics for fully predictive enterprises. The competitive stakes are too great for businesses to take the traditional silo-mired approach when implementing this mission-critical technology.

What do you think?


re: Instrumenting Your Enterprise for Maximum Predictive Power

Most commentaries I read on this subject seem to consistently miss an important business element in the grand picture of predictive business intelligence. My perspective is from asset-intensive companies such as refiners, chemical companies, pharma manufacturers, and power & energy producers. In those businesses, there is nothing to analyze if the pumps aren't spinning or the gears aren't turning. Availability, utilization, and performance metrics of production assets are rarely if ever considered in the tapestry of performance and risk analysis.

So, if you're a big oil company, and one of your refineries has experienced a mechanical failure that has cost you three million barrels of production, what kinds of business analytics could have been put in place to prevent that mechanical failure from occurring?

Or, if you're running a coal-fired power plant, and you lose seven days of power generation, not only do you lose the revenue from the lost generation, you also need to BUY power to meet your commitment. You've lost a few million in revenue and incurred a few million more in cost. Why? Because a forced air fan had dirty oil and the journal bearing failed. What part of the business intelligence tapestry does production equipment (plant) availability occupy?

re: Instrumenting Your Enterprise for Maximum Predictive Power

Marc: I agree with what you say. I addressed that in my blog, where I argued that inline predictive models should sit at the core of your operational apps, especially those that manage performance of the core fixed assets upon which your business depends. Are you saying I overlooked something important?

re: Instrumenting Your Enterprise for Maximum Predictive Power

Hi James,

This is a very interesting post. Certainly, having the right infrastructure in place is critical to realizing the promise of predictive analytics. While companies attempt to craft the right roadmap to analytic competence, I would suggest that before diving into tools evaluation they spend some time crafting a comprehensive analytics strategy: What questions do they want to answer? Who needs to be involved? How can analytics best support the corporate strategies? What is the financial impact?

Being armed with these answers will facilitate proper tool selection, but more importantly it can help recruit sponsors/champions across the organization and provide the financial justification to implement their roadmap.

Best regards,
-Manuel

re: Instrumenting Your Enterprise for Maximum Predictive Power

Manuel: I agree completely. In fact, your comment gives me an opportunity to say that I'm preparing the first Forrester Wave on Predictive Analytics and Data Mining Solutions. It will be published in early Q1 2010. That Wave will help companies identify their requirements and select the best tools for their predictive modeling projects.

re: Instrumenting Your Enterprise for Maximum Predictive Power

Business process management is a very useful instrument for helping businesses run smoothly and improve where needed. A business process is "a collection of related, structured activities that produce a service or product that meet the needs of a client."