Whenever I think about big data, I can't help but think of beer – I have Dr. Eric Brewer to thank for that. Let me explain.
I've been doing a lot of big data inquiries and advisory consulting recently. For the most part, folks are just trying to figure out what it is. As I said in a previous post, the name is a misnomer – it is not just about big volume. In my upcoming report for CIOs, Expand Your Digital Horizon With Big Data, Boris Evelson and I present a definition of big data:
Big data: techniques and technologies that make handling data at extreme scale economical.
You may be less than impressed with the overly simplistic definition, but there is more to it than meets the eye. In the figure, Boris and I illustrate the four V's of extreme scale:
The point of this graphic is that if you have only high volume or only high velocity, then big data may not be appropriate. As the characteristics accumulate, however, big data becomes attractive on cost. The two main drivers are volume and velocity, while variety and variability shift the curve. In other words, big data makes extreme scale more economical, and more economical means more people do it, leading to more solutions, etc.
So what does this have to do with beer? I've given my four V's spiel to lots of people, but a few aren't satisfied, so I've been resorting to the CAP Theorem, which Dr. Brewer presented at a conference back in 2000. I'll let you read the link for the details, but the theorem (later proven by researchers at MIT) goes something like this: