Problems don’t care how you solve them. The only thing that matters is that you do indeed solve them, using any tools or approaches at your disposal.
When people speak of “Big Data,” they’re referring to problems that are best addressed by amassing very large data sets and applying advanced analytics to produce “Eureka!” moments. Which approach gets us to those moments, whether a Hadoop cloud, an enterprise data warehouse (EDW), or something else, is secondary.
It’s no accident that Big Data mania has also stimulated a vogue for “data scientists.” Many of the core applications of Hadoop are scientific problems in linguistics, medicine, astronomy, genetics, psychology, physics, chemistry, mathematics, and artificial intelligence. In fact, Yahoo’s scientists not only had a predominant role in developing Hadoop but, as exploratory problem-solvers, are also active participants in Yahoo’s efforts to evolve Hadoop into an even more powerful scientific cloud platform.
The problems that are best suited to Hadoop and other Big Data platforms are scientific in nature. What they have in common is a need for analytical platforms and tools that can rapidly scale out to the petabyte level and support the following core features:
- Detailed, interactive, multivariate statistical analysis
- Aggregation, correlation, and analysis of historical and current data
- Modeling and simulation, what-if analysis, and forecasting of alternate future states
- Semantic mining of unstructured data, streaming information, and multimedia