Let Big Data Predictive Analytics Rock Your World

I love predictive analytics. I mean, who wouldn't want to develop an application that could help you make smart business decisions, sell more stuff, make customers happy, and avert disasters? Predictive analytics can do all that, but it is not easy. In fact, it can range from hard to impossible depending on:

  • Causative data. The lifeblood of predictive analytics is data. Data can come from internal systems such as customer transactions or manufacturing defect records. It is often appropriate to include data from external sources such as industry market data, social networks, or statistics. Contrary to popular technology belief, it does not always need to be big data. It is far more important that the data contain variables that can be used to predict an effect. Having said that, the more data you have, the better your chance of finding cause and effect. Big data is no guarantee of success.
  • Data scientists. A data scientist is someone who can understand the desired business outcome, examine the data, and create hypotheses about how to establish predictive rules that enable business outcomes such as increasing eCommerce upsell, keeping a production line running, or eliminating stock-outs. Data scientists need skills in mathematics and statistics, and often domain knowledge. Check out the winning solution to the 2008 Netflix Prize: formulas galore. That is serious science. Fortunately, many predictive analytics solutions don't need to be nearly as rigorous.
  • Predictive analytics software. Data scientists must evaluate their hypotheses using software tools that can apply any combination of statistics and machine learning algorithms. IBM SPSS and SAS are two well-known analytics software tools used by data scientists. The R Project is a popular open source choice. I am currently evaluating 10 vendors for the Forrester Wave™ evaluating big data predictive analytics solutions, which is scheduled for publication in fall 2012. If the data is big, then you may need special processing platforms such as Hadoop or in-database analytics such as Oracle Exadata.
  • Operational application. If you are lucky enough to find predictive rules, then you need to embed them in your application. Your predictive analytics software might have a way to generate that code, as predictive analytics vendor KXEN does. It is important that the data needed by the predictive rules is readily available at run time. Predictive rules can also be enhanced with business rules management systems and complex event processing (CEP) platforms.
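The four ingredients above can be sketched end to end in a few lines. This is a hedged, deliberately toy illustration in plain Python: the customer segments, rates, and the 0.5 threshold are all invented for this example, and a real project would use tools like SPSS, SAS, or R rather than hand-rolled counting.

```python
# Toy sketch: learn a "predictive rule" from historical data (the data
# scientist step), then embed it in application logic (the operational step).
from collections import defaultdict

# Invented historical observations: (customer_segment, bought_accessory)
history = [
    ("camera_buyer", True), ("camera_buyer", True), ("camera_buyer", False),
    ("book_buyer", False), ("book_buyer", False), ("book_buyer", True),
]

def learn_upsell_rates(rows):
    """Estimate P(accessory purchase | segment) from past transactions."""
    counts = defaultdict(lambda: [0, 0])  # segment -> [purchases, total]
    for segment, bought in rows:
        counts[segment][0] += int(bought)
        counts[segment][1] += 1
    return {seg: hits / total for seg, (hits, total) in counts.items()}

rates = learn_upsell_rates(history)

def should_show_upsell(segment, threshold=0.5):
    """The 'operational application' step: apply the learned rule live."""
    return rates.get(segment, 0.0) >= threshold

print(should_show_upsell("camera_buyer"))  # True  (2/3 of past buyers said yes)
print(should_show_upsell("book_buyer"))    # False (only 1/3 did)
```

The point of the sketch is the separation of concerns: discovery produces a rule, and the application only needs the rule plus the data it scores on.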

Ambitious Application Developers Can Learn This

Application development teams can differentiate themselves from their code-only peers by using predictive analytics to develop smarter apps. The computer scientist in you will love the gnarly big data and machine learning algorithms.

Good stuff.




Predicting in complex systems is a pipe dream

Mike, thanks for a sceptical look at predictive analytics. Considering the huge expense and effort, I propose that it is not producing any business benefits.

Peter Drucker rightly said: 'Knowledge is different from all other resources. It makes itself constantly obsolete, so that today's advanced knowledge is tomorrow's ignorance.' Predictive analytics produces 'tomorrow's ignorance'!!

From a philosophical and scientific perspective it is impossible to predict anything in real world, uncontrolled complex system environments. The reason we manufacture in expensive factories and we spend an incredible amount on preventive airplane maintenance and strictly control pilot flight procedures is to remove the uncertainty inherent in complex adaptive systems. That ability somehow creates the illusion that we are able to predict stuff from past data.

The truth is we don't. When things go wrong we do not improve our predictions, we remove the environment that creates uncertainty.

All that is impossible in complex, adaptive systems with emergent phenomena - aka the economy, or doing business. The complexity of interactions cannot be causally analyzed and thus cannot be predicted. The idea that the DIKW (data, information, knowledge, wisdom) chain proposed by Russell Ackoff can be used to predict is a philosophical faux pas. He warned of such folly.

Yes, even to gather data we already use precursor knowledge (the data scientist's) to do so, and that knowledge in fact rests purely on assumptions. Data accuracy is no more than a guess. What you are measuring with the data is just how well the scientist's ASSUMPTIONS match the measurement process.

Data gathering NEVER provides cause-and-effect chains. More data produces more noise. It is no more than correlation, and the slightest change in the system will change how the data relate. Data cleansing and preparation do not improve the data but destroy them. Data are good for understanding what happened, but not what will happen. Trends are large-scale patterns that follow a Gauss curve; they cannot predict what will happen in one particular event, and the complexity voids the rule that was inferred. Rule inference only works in controlled lab environments!!!

Why is this the case? Mostly because there are NO RULES in nature, just complex interaction patterns. The formulae we discover only work in simplified environments, as in classical physics, but not for complex systems, where no decomposition is possible.

You point to the next problem: to utilize the predictive rule, you need the same data model available for execution as the one you recorded. Prediction might offer a certain probability that from one data pattern you might step to another when you take a certain action, but no one knows what other information is in play that makes this no more than a Gauss curve distribution. You have no idea where on the slope you actually are, and thus how good the match might be. You need to multiply that uncertainty by the predictive rule's probability, and then multiply in the probability that your assumptions and your measurement processes are accurate.

So pattern matching is the only mechanism that can actually capture such repeating data structures. It is only usable if the capture and execution environments are identical. Knowledge is not represented in terms of a rule, but only as patterns that are connected through a probably relevant action or event. If we see patterns, events, and actions repeatedly, we can assume some correlation.

As the context and the outside world change by the day, the predictive rule becomes less probable with every day it is not updated. This is what Peter Drucker meant. The only way knowledge stays relevant is to continuously relearn it. This is what we do with the ISIS Papyrus User-Trained Agent, which continuously interprets and updates knowledge patterns. It does not even need a knowledge engineer, because it uses the business architecture definition and the application data models to gather patterns, events, and actions. It is really strange that no one understands the uniqueness and real-world practicality of that approach.

Predictive big-data analytics are utterly useless but everyone jumps on it ...

If You Don't Have Predictive Capability You Are at a Disadvantage


Your main argument against predictive analytics seems to be that the real world changes fast, which leads to predictive rules becoming obsolete as soon as they are discovered. It is true that "past performance is no guarantee of future results" in many domains such as economics, brand loyalty, stock performance, and others. But I vehemently disagree that the pursuit of predictive models is "utterly useless," as you say in your conclusion.

1. Predictive rules do not have to be static. For example, KXEN re-runs its analysis on new data to update the model on a specified periodicity. Firms that do predictive analytics do not create one set of rules and then stick with it forever. Data scientists can constantly do discovery on data to create new models and optimize old ones.
2. Your platform may "continuously interpret and update knowledge patterns," but that approach is limited in the same way you say predictive analytics is: by the need for causative data. The adaptive or predictive capabilities of any system are limited by the causative data available, no matter how frequently you monitor it or how much "big data" you have.
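Point 1 can be illustrated with a minimal sketch. The weekly batches and the trivial conversion-rate "model" here are invented for illustration; real tools refit regression or machine-learning models on the same kind of schedule.

```python
# Scheduled retraining: instead of one static rule, refit the model each
# time a new batch of data arrives, so the rule tracks a drifting world.

def fit(batch):
    """'Train' a trivial model: the observed conversion rate in the window."""
    return sum(batch) / len(batch)

# Invented stream of weekly outcomes (1 = customer converted).
weekly_batches = [
    [1, 0, 1, 1, 0],   # week 1: 60% conversion
    [1, 1, 1, 0, 1],   # week 2: behavior shifts up
    [0, 0, 1, 0, 0],   # week 3: behavior shifts back down
]

# Each scheduled re-analysis replaces the old rule with a fresh one.
models = [fit(batch) for batch in weekly_batches]
print(models)  # [0.6, 0.8, 0.2] - each week's rule reflects the freshest data
```

The same loop structure applies whether `fit` is a one-line average or a full model-training run: obsolescence is countered by refitting, not by freezing the rule.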

Analytics is science. And, what is science? Constant discovery.


Predictive analytics principles vs. patterns

Hi Mike, thanks for the reply. Let me clarify. My main arguments are that correlation is not causation; that the past never predicts the future; that statistics over complex, multi-dimensional state spaces are worthless without a probability distribution; that the capture models are assumptions; that the data are inaccurate and not time-aligned with events and actions; and that execution is flawed. Human decision making does not use causation but emotionally weighted patterns. That the real world changes continuously is just the icing on the cake.

One has to not only reanalyze the data, but also reanalyze the model and the causation assumptions. One has to analyze the related human decision making. Then the capture model and the execution model have to be aligned. Then the changes to the new data introduced by the predictions themselves have to be considered. You measure what you think you know, not what really happens. You might measure a correlation and thus be coaxed into believing that you are actually influencing reality, when all you measure is your own actions.

With our approach we do not have causative data, as we do not assume causation. We just correlate events, actions, and patterns in the state space. If there is no repeating pattern, we do not assume anything. If data elements are always the same, or always different, they are irrelevant. If the users do not agree with the pattern match, it gets ignored. If there are actions and we don't see a repeating pattern, then we simply do not have those data. We do not care why ...

We use the execution data model in the state space. If there is no pattern it means that the users are taking actions based on information that is outside the scope of the current state space. The user has the option to add data to the state space at any point in time. There is no causation assumption. The user says: 'I use this data to decide.' Not why and how. The agent then searches for patterns without the need for a data scientist.

We do not monitor more or less frequently. We check pattern matches when actions or events happen. We can sequence before and after action/event patterns into probable outcome trees. Nothing of the kind can be done by current predictive analytics as it is not real-time.

Hope that makes my comment more understandable. Regards, Max

Maybe it's me...

Hi Max,

You seem to know what you are talking about, so I am confused by your statements... Of course, no one who is professionally involved in data mining / predictive analytics will confuse probability or correlation with causality. I would even say that the distinction is the whole justification for the existence of the profession! Clients and novices might mistake predictive models for crystal balls, but professionals should explain the difference.

No, we cannot truly predict the outcome of an individual case or interaction, but this does not mean that predictive models are useless. Predictive models are used to support decision-making and to automate and optimize high-volume operational 'decisions' - not on the basis of certainty in each individual case, but on the basis of statistically significant patterns and 'the law of large numbers.' That is, if I were betting, I would put my money on the outcome that is statistically most likely. Extensive empirical evidence tells us that if I keep doing that, I will win more (or lose less ;-) than if I go against this or place bets at random.
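That betting argument is easy to simulate. Everything here is invented for illustration: a hypothetical model that estimates outcome A at a 70% probability, and two betting strategies scored against the same simulated outcomes.

```python
# Simulate betting with a model vs. betting at random over many trials.
import random

random.seed(42)           # fixed seed so the run is reproducible
P_A = 0.7                 # the (invented) model says outcome A happens 70% of the time

# 10,000 simulated individual cases; each one is unpredictable on its own.
outcomes = [random.random() < P_A for _ in range(10_000)]

# Strategy 1: always bet on the statistically most likely outcome (A).
model_wins = sum(outcomes)

# Strategy 2: bet at random, guessing A half the time.
random_wins = sum(o == (random.random() < 0.5) for o in outcomes)

print(model_wins > random_wins)  # betting with the model wins more often
```

No single case is predictable, yet across thousands of cases the model strategy reliably beats the random one, which is exactly the operational-decision argument.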

However, in an individual case I might very well have a good reason to go against the model's suggestion. I can make this judgment call if I am knowledgeable about the process and if I understand the assumptions and input that the model is based on. If I am right often enough, the model will learn this (new) pattern and it will be incorporated in a new and improved model. Continuously retraining predictive models is a prerequisite for a sound deployment strategy.
Periodically retraining the model is critical, but so is continuing to look for new, potentially relevant data sources to incorporate into your training data set.

In the end, a well-built predictive model delivers not only very valuable support in decision-making; a very important 'by-product' is the insight gained during the iterative process of developing and continuously improving the model.

Lastly, your statement quoted below is not completely true:
"We check pattern matches when actions or events happen. We can sequence before and after action/event patterns into probable outcome trees. Nothing of the kind can be done by current predictive analytics as it is not real-time."

These days we do have the technology to apply near real-time scoring of complex events using a combination of streaming data analytics, business rules, and deployed predictive models. Application of this technology is increasingly frequent, e.g. in customer interactions (churn prevention, cross-sell / up-sell) or insurance claims processing.

Maybe I misunderstood your texts; either way, I find this an interesting discussion!

Regards, Mando

Statistics ... observations versus predictions

Mando, thanks for your extensive reply. I apologize, but I respectfully yet seriously doubt your claim that real-time learning and prediction are in use today.

To be precise, the Law of Large Numbers relates ONLY to Bernoulli trials where all possible outcomes are known and Black Swans are assumed to be irrelevant. As we have seen in financial systems, that assumption is simply wrong. Financial systems are complex adaptive systems, and if anything, the Poisson probability distribution (or the Law of Small Numbers, per Bortkiewicz) must be used to come to any kind of sensible prediction, such as customer complaints received per hour or the likely value of claims over a year. It describes the probability distribution of events that happen rarely in a large opportunity space. Clearly, the insurance business is built on this principle, so I am not doubting the principle itself. Insurers survive nevertheless through large risk percentages, umbrella insurance, and by spreading their bets across a number of very different state spaces. With enough reserves, income can be adjusted to make up for wrong predictions. It is still a betting game ...
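For readers who want the Poisson point concrete, here is a minimal sketch of the distribution mentioned above; the rate of two complaints per hour is an invented example figure.

```python
# Poisson distribution: probability of k rare events in a fixed interval,
# given only the average rate lam. Pure standard library.
import math

def poisson_pmf(k, lam):
    """P(X = k) for a Poisson-distributed count with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)

LAM = 2.0  # invented example: an average of 2 complaints per hour
for k in range(5):
    print(f"P({k} complaints in an hour) = {poisson_pmf(k, LAM):.3f}")
```

The distribution needs only the average rate, not a full enumeration of outcomes, which is why it suits rare events in a large opportunity space such as claims or complaints.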

The crash was caused by risk assessment of the various financial institutions who either did not use probability distributions at all or very different ones as they used different models. The Black-Scholes model of future value is one of these completely ridiculous models that must be seen as nothing but self-fulfilling prophecy. As you utilize the model to predict and act, you are changing the conditions that the model was built on in a completely unknown way. Trying to keep up with the changes that you are causing yourself is kind of silly, like a dog chasing its tail.

What I am doubting is the use of this concept to replace human judgment based on experience. Experience is powerful and fast emotional pattern matching, and it works extremely efficiently in areas that humans can sensibly judge. It is an utter fallacy to assume that human judgment is improved by providing statistical probabilities based on some model that the decision makers do not understand. Rather than considering what a single customer really wants, they are all treated as a number and given the same bland, average, crappy service.

After I bought a camera, Amazon keeps trying to sell me more of the same rather than understanding that I already bought one. I might have bought it as a present and have no interest in photography at all. All of that is ignored by those distributions. The result: crappy and still expensive service.

Let's leave it at that. Go on to try and automate humanity with statistical models and see where it leads us. My prediction? Nowhere nice ...

You have explored 4 pillars

You have explored four pillars of predictive analytics: data, data scientists, PA software, and the operational application. How about the 'data model'? PA software tools are only as good as the individual (data scientist or analyst) who formulates the data model.

Predictive models exploit patterns found in historical and transactional data to identify risks and opportunities. You mine the past (the rear view) and season it with present information in order to look at the 'road ahead' and visualize scenarios. Models can capture complex relationships across data sets, conditions, and factors for risk or opportunity assessment, thus guiding decision making. The capability to create a model (based on mined data, current information, and identified patterns) and to continuously 're-train' it is another pillar for PA.

Shaloo Shalini

Exploit past patterns

Imagine this: you are driving a car and all you do is look into the rear-view mirror and check your speed on the dashboard. How far are you going to get? You will run off the road at the first turn. Nor does the mirror tell you anything about the risks beyond that turn.

You are closer with the (abstract) data model. Neither past nor current data tell you where the road is going or behind which turn a truck is blocking it. Yes, we anticipate problems in an abstract way, and it is our past abstract experience that says a piece of road we can't see may hold a surprise, so we go slow. Assuming it is pretty safe because in 97% of cases nothing blocks the road behind that turn leads to the disasters we see in financial markets right now.

The only working models are the human experience models in our heads. They are not perfect, but they are better than anything else, because there is no perfect prediction in complex systems. We need to increase real-time perception in business, much as a car does with a navigation system, traffic warnings, infrared sensors, and systems such as ABS and collision warning.

Mike, Something people are

Something people are missing is the new opportunity big data creates to monetize data that has been around, and wasted, for years. With the right means you can productize and monetize data... but what are you really selling? What is really giving value? The data? The information? The intelligence you can infer?
Your view on predictive analytics is, for me, a way of adding value to the data... Is that what a big data product is made of?
