5 Reasons Hadoop Is Kicking Can And Taking Names

Hadoop’s momentum is unstoppable as its open source roots grow wildly into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze, and share big data. Forrester believes that Hadoop will become must-have infrastructure for large enterprises. If you have lots of data, there is a sweet spot for Hadoop in your organization. Here are five reasons firms should adopt Hadoop today:

  1. Build a data lake with the Hadoop file system (HDFS). Firms leave potentially valuable data on the cutting-room floor. A core component of Hadoop is its distributed file system, which can store both very large files and very large numbers of files, and which scales linearly across three, 10, or 1,000 commodity nodes. Firms can use Hadoop data lakes to break down data silos across the enterprise and commingle data from CRM, ERP, clickstreams, system logs, mobile GPS, and just about any other structured or unstructured source that might contain previously undiscovered insights. Why limit yourself to wading in multiple kiddie pools when you can dive for treasure chests at the bottom of the data lake?
  2. Enjoy cheap, quick processing with MapReduce. You’ve poured all of your data into the lake; now you have to process it. Hadoop MapReduce is a distributed data processing framework that brings the processing to the data and runs it in a highly parallel fashion. Instead of serially reading data from files, MapReduce pushes the processing out to the individual Hadoop nodes where the data resides. The result: Large amounts of data can be processed in parallel in minutes or hours rather than in days. Now you know why Hadoop’s origins stem from monstrous data processing use cases at Google and Yahoo.
  3. Data scientists can wrangle big data faster. Data scientists can find success when they run algorithms on massive amounts of data instead of on much smaller samples. Hadoop’s HDFS combined with MapReduce makes it an ideal platform for running advanced analytics, such as machine learning algorithms that find predictive models. There is even an Apache project called Mahout that offers a growing library of algorithms optimized to run on Hadoop.
  4. Even the POC can make you money. Forrester has talked with many early adopters of Hadoop who use terms like “wildly successful” to describe the results of their Hadoop proof of concept (POC). Hadoop’s applicability is not limited to specific industries. Financial institutions, government, manufacturing, oil exploration, eCommerce, and media all have lots of data: big data. POC use cases range from offloading traditional business intelligence workloads from a data warehouse to Hadoop, to using advanced analytics in the data lake to predict customer behavior, find patterns in unstructured text, and decode the human genome.
  5. The future of Hadoop is real-time and transactional. Hadoop is immature compared to established data management technologies, but it is a lot more mature than you think. The Hadoop open source community and commercial vendors are innovating like gangbusters to make Hadoop an enterprise staple. On October 15, 2013, Apache released Hadoop 2.x with a hearty list of new features such as YARN, which improves Hadoop’s processing efficiency and workload flexibility. In the meantime, the key commercial vendors are focusing on the fast SQL access, real-time streaming, and manageability features that enterprises demand. The groundwork is being laid for an eruption in data management technologies as Hadoop sneaks its way into the transactional database market. Adopting Hadoop now for analytical processing will ensure that you are ready.
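
To make reason 2 a little more concrete, here is a minimal sketch of the MapReduce idea in plain Python. It does not use Hadoop or any Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are hypothetical and chosen only for illustration. Each "block" stands in for a chunk of a file stored on a different node, and the map step is applied to each block independently, which is what lets a real cluster run it in parallel where the data lives.

```python
from collections import defaultdict
from itertools import chain


def map_phase(block):
    """Emit (word, 1) pairs for one block of text, as a mapper would on its local node."""
    return [(word.lower(), 1) for line in block for word in line.split()]


def shuffle(pairs):
    """Group values by key, standing in for the framework's shuffle step."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts for each key, as a reducer would."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(blocks):
    # Each block is mapped independently of the others. Here the calls run
    # one after another; on a cluster they would run in parallel on the
    # nodes that hold the blocks.
    mapped = chain.from_iterable(map_phase(block) for block in blocks)
    return reduce_phase(shuffle(mapped))


# Hypothetical input: two "blocks" standing in for file splits on two nodes.
blocks = [
    ["big data big insights"],
    ["data lake", "big lake"],
]
counts = word_count(blocks)
print(counts)
```

Because no mapper needs to see another mapper's block, adding nodes adds processing capacity without changing the logic, which is the property the list above describes.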

Stay tuned for the Forrester Wave™ of enterprise Hadoop solutions (expected publication: January 2014), in which we evaluate Hadoop solutions from 10 leading commercial vendors.

This blog post was coauthored with David Murphy, Research Associate.

Comments

Great stuff, Mike. I love hearing about strong POC's, because it really is very encouraging. The folks that "get it" are reaping the benefits, while the pessimistic lag behind. And as we continue to improve our tools, it's more important to understand the use cases and best practices conceptually. The tools will take care of the rest.
