By James Kobielus
Business is all about placing bets and knowing if the odds are in your favor.
As I noted in my most recent Forrester report, business success depends on your company being able to visualize likely futures and take appropriate actions as soon as possible. You must be able to predict future scenarios well enough to prepare plans and deploy resources so that you can seize opportunities, neutralize threats, and mitigate risks.
Clearly, predictive analytics can play a pivotal role in the
day-to-day operation of your business. It can help you focus strategy and
continually tweak plans based on actual performance and likely future scenarios.
And, as I noted in a recent
Forrester blog post, the technology
can sit at the core of your service-oriented architecture (SOA) strategy as you
embed predictive logic deeply into data warehouses, business process management
platforms, complex event processing streams, and operational applications.
The grand promise of predictive analytics—still largely unrealized in most companies—is that it will become ubiquitous, guiding all decisions, transactions, and applications. For the technology to rise to that challenge, organizations must move toward a comprehensive advanced analytics strategy that integrates data mining, content analytics, and in-database analytics. Already, we’ve sketched out a vision of “Service-Oriented Analytics,” under which you break down silos among data mining and content analytics initiatives and leverage these pooled resources across all business processes.
You may agree that this is the right vision but have doubts about whether there is a practical, incremental roadmap for taking your company in that direction. In fact, there is, and it starts with re-assessing the core of most companies’ predictive analytics capability: your data mining tools. As you plan your predictive analytics initiatives, you should avoid the traditional approach of focusing on tactical, bottom-up project-specific requirements. You should also try not to shoehorn your requirements into the limited feature set of whatever modeling tool you currently happen to use.
To become a fully predictive enterprise, you will need to take both a top-down and bottom-up approach to your data mining initiatives. From the top down, it’s all about building and integrating alternate models of how your business environment is likely to evolve internally and externally. In our recent report on advanced analytics, Boris Evelson, Leslie Owens, and I sketched out the many business processes that can be enriched by predictive analytics.
So how do you instrument your company to become more
predictive? For starters, assess whether your analytics tools support the
following capabilities for developing, validating, and deploying predictive models:
- Model multiple business scenarios: You should be able to build complex models of multiple, linked
business scenarios across different business, process, and subject-area
domains, using such key features as strategy maps, ensemble modeling, and
champion-challenger modeling.
- Incorporate multiple information types into models: You should be able to develop models
against multiple information types, including unstructured content and
real-time event streams, while leveraging state-of-the-art algorithms in
sentiment analysis and social network analysis.
- Leverage multiple statistical algorithms and approaches in models: You should be able to develop models
using the widest, most sophisticated range of statistical and mathematical
algorithms and approaches, including regression, constraint-based
optimization, neural networks, genetic algorithms, and support vector
machines.
- Apply multiple metrics of model quality and fitness: You should be able to score and validate
model quality using multiple metrics and approaches, including quality
scores, lift charts, goodness-of-fit charts, comparative model evaluation,
and auto best-model selection.
- Employ multiple variable discovery and assessment approaches: You should be able to build and
validate models using various approaches for variable discovery,
profiling, and selection, including decision trees, feature selection,
clustering, association rules, affinity analysis, and outlier analysis.
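
To make the model-quality and champion-challenger items in the list above a little more concrete, here is a minimal Python sketch. It computes lift (the response rate among the highest-scored cases divided by the overall response rate) and keeps the current champion model unless a challenger shows strictly higher lift. The function names `lift_at` and `pick_champion`, the 20% cutoff, and the idea of passing pre-computed scores are all illustrative assumptions, not features of any particular data mining tool.

```python
from typing import Dict, Sequence


def lift_at(scores: Sequence[float], labels: Sequence[int], fraction: float = 0.2) -> float:
    """Response rate in the top-scored `fraction` of cases, divided by the overall rate."""
    if len(scores) != len(labels) or len(scores) == 0:
        raise ValueError("scores and labels must be non-empty and the same length")
    overall_rate = sum(labels) / len(labels)
    if overall_rate == 0:
        raise ValueError("no positive outcomes, so lift is undefined")
    # Rank cases from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    top_rate = sum(label for _, label in ranked[:top_n]) / top_n
    return top_rate / overall_rate


def pick_champion(
    candidates: Dict[str, Sequence[float]],
    labels: Sequence[int],
    champion: str,
    fraction: float = 0.2,
) -> str:
    """Return the champion's name unless a challenger has strictly higher lift."""
    best_name = champion
    best_lift = lift_at(candidates[champion], labels, fraction)
    for name, scores in candidates.items():
        if name == champion:
            continue
        challenger_lift = lift_at(scores, labels, fraction)
        if challenger_lift > best_lift:
            best_name, best_lift = name, challenger_lift
    return best_name
```

In practice you would compare models on held-out data and on more than one metric; a single lift figure is used here only to keep the sketch short.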
How is this different from predictive analytics as usual? Traditionally,
most predictive modeling specialists focus on the latter three capabilities:
statistical algorithms and approaches, model quality and fitness, and variable
discovery and assessment. Most models are built in narrowly scoped business or
subject domains—such as customer analytics for marketing campaign
management—and only against structured data sources (such as relational
tables). Traditionally, few predictive analytics projects have entailed modeling
of multiple business scenarios across diverse domains (such as sales,
marketing, customer service, manufacturing, and supply chain), though in the
real world these business processes are often quite interconnected. Also, many data
mining initiatives fail to incorporate information from unstructured sources (such
as text in call-center logs), though this content may be as important as what
comes from relational databases and other structured sources.
It’s very important to build multi-scenario predictive models against
complex information sets, but becoming a fully predictive enterprise demands
much more. To instrument your organization for maximum predictive power, you
should also tool your advanced analytics to support the following capabilities:
- DW-integrated data preparation: To speed up and standardize the most
time-consuming predictive modeling project tasks, you should be able to
leverage your existing data warehouse, extract-transform-load (ETL), data quality,
and metadata tools to support a full range of data preparation features. These
features include the ability to discover, acquire, capture, profile, sample, collect,
collate, aggregate, deduplicate, transform, correct, augment, and load
analytical data sets.
- Deep application and middleware integration: To deliver models deeply into whatever heterogeneous
SOA-enabled platform you happen to use, your predictive analytics tool should deploy
on and/or integrate with a wide range of enterprise applications, middleware,
operating platforms, and hardware substrates. You should be able to deploy
models seamlessly into your data warehouse, business intelligence, online
analytical processing, data integration, complex event processing, data quality,
master data management, and business process management environments. And to
play well in your SOA, your predictive modeling tools should support application
programming interfaces, languages, tools, and approaches such as Web services,
Java, C++, and Visual Studio, as well as emerging languages such as
SQL-MapReduce and R.
- Consistent cross-domain model governance: To avoid fostering an unmanageable glut of
models, your predictive analytics solution should support a wide range
of tools, features, and interfaces for life-cycle governance of models
created in diverse tools. At the very least, your tools should enable model
check-in/check-out, change tracking, version control, and collaborative
development and validation of models. To realize this promise, the solution should
also support a full range of tools, standards, and interfaces for import and
embedding of models from other tools, as well as export and sharing of models to
other environments.
- Flexible model deployment: To execute modeling functions (such as
data preparation, regression, and scoring) on the widest range of data
warehouses and other platforms, your tools should support in-database or
embedded analytics. And to scale to the max, your predictive analytics tools should
deploy models to massively parallel data warehouses, software-as-a-service environments,
and cloud computing fabrics. Your
advanced analytics tools should also support development of application logic
in open frameworks (such as MapReduce and Hadoop) to enable convergence of data
mining and content analytics in the cloud.
- Rich interactive visualization: To deliver their precious
payload of actionable intelligence, your advanced analytics tools should support
interactive visualization of models, data, and results. Ideally, you should be
able to visualize all of this in your preferred business intelligence tool, or
in the predictive modeling vendor’s integrated visualization layer. Of course,
you have every right to expect the full range of visualization
techniques (histograms, box plots, heat maps, etc.) regardless of who provides
the visualization layer.
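
As a rough illustration of the in-database scoring idea under "Flexible model deployment" in the list above, the following Python sketch scores customers with a logistic regression inside a SQL query, so the rows are scored where they live instead of being exported to a separate modeling tool. It uses Python's standard `sqlite3` module as a stand-in for a data warehouse. The `customers` table, its columns, and the coefficient values are hypothetical, and a real warehouse would typically run the scoring function natively rather than through a Python callback.

```python
import math
import sqlite3

# Hypothetical coefficients from a logistic regression fitted elsewhere.
INTERCEPT = -2.0
COEF_RECENT_CALLS = 0.8
COEF_TENURE_YEARS = -0.3


def logistic(z: float) -> float:
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory table standing in for a warehouse table (made-up rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customers ("
        "customer_id INTEGER PRIMARY KEY, recent_calls REAL, tenure_years REAL)"
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?, ?)",
        [(1, 0.0, 5.0), (2, 4.0, 1.0), (3, 2.0, 2.0)],
    )
    return conn


def score_in_database(conn: sqlite3.Connection):
    """Score every row inside the SQL engine and return (customer_id, churn_score)."""
    # Register the scoring function so SQL statements can call it by name.
    conn.create_function("logistic", 1, logistic)
    query = """
        SELECT customer_id,
               logistic(? + ? * recent_calls + ? * tenure_years) AS churn_score
        FROM customers
        ORDER BY churn_score DESC
    """
    params = (INTERCEPT, COEF_RECENT_CALLS, COEF_TENURE_YEARS)
    return conn.execute(query, params).fetchall()


if __name__ == "__main__":
    for customer_id, churn_score in score_in_database(build_demo_db()):
        print(customer_id, round(churn_score, 3))
```

The design point is that only the model's coefficients travel to the data; the data set itself never leaves the database.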
As you can see, this
goes well beyond data mining as usual. Forrester has a slightly different
perspective on the development of the predictive analytics market than you’re
likely to get from other sources. We see robust, flexible, SOA-enabled data
mining tools as the centerpiece of advanced analytics for fully predictive
enterprises. The competitive stakes are too great for businesses to take the traditional
silo-mired approach when implementing this mission-critical technology.
What do you think?
