Enterprises agree that speedy deployment of big data Hadoop platforms has been critical to their success, especially as use cases expand and proliferate. However, deploying Hadoop systems is often difficult, especially when supporting complex workloads and dealing with hundreds of terabytes or petabytes of data. Architects need a considerable amount of time and effort to install, tune, and optimize Hadoop. Hadoop-optimized systems (aka appliances) make on-premises deployments virtually instant and blazing fast to boot. Unlike generic hardware infrastructure, Hadoop-optimized systems combine preconfigured, integrated hardware and software components to deliver optimal performance and support various big data workloads. They also support one or more of the major distros such as Cloudera, Hortonworks, IBM BigInsights, and MapR. As a result, organizations spend less time installing, tuning, troubleshooting, patching, upgrading, and dealing with integration- and scale-related issues.
Choose From Among 8 Hadoop-Optimized Systems Vendors
Most enterprises aren't fully exploiting real-time streaming data that flows from IoT devices and mobile, web, and enterprise apps. Streaming analytics is essential for real-time insights and bringing real-time context to apps. Don't dismiss streaming analytics as a form of "traditional analytics" used for postmortem analysis. Far from it: streaming analytics analyzes data as it arrives, while it can still be put to good use to make applications of all kinds (including IoT) contextual and smarter. Forrester defines streaming analytics as:
Software that can filter, aggregate, enrich, and analyze a high throughput of data from multiple, disparate live data sources and in any data format to identify simple and complex patterns to provide applications with context to detect opportune situations, automate immediate actions, and dynamically adapt.
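To make the definition concrete, here is a minimal sketch of the filter, enrich, and analyze steps as a chain of Python generators. It illustrates the idea only and is not how any evaluated product works. The sensor name, the location lookup, the window size, and the threshold are all hypothetical, and the rolling window assumes readings from a single sensor.

```python
from collections import deque


def filter_valid(events):
    """Filter: drop events that carry no temperature reading."""
    for event in events:
        if event.get("temp_c") is not None:
            yield event


def enrich(events, locations):
    """Enrich: attach a location looked up from reference data."""
    for event in events:
        yield {**event, "location": locations.get(event["sensor"], "unknown")}


def detect_spikes(events, window=3, threshold=5.0):
    """Analyze: yield events that exceed the mean of the previous
    `window` readings by more than `threshold` degrees."""
    recent = deque(maxlen=window)
    for event in events:
        if len(recent) == window and event["temp_c"] - sum(recent) / window > threshold:
            yield event
        recent.append(event["temp_c"])


# Hypothetical live feed from one sensor; None stands in for a dropped reading.
feed = [
    {"sensor": "s1", "temp_c": 20.0},
    {"sensor": "s1", "temp_c": 21.0},
    {"sensor": "s1", "temp_c": None},
    {"sensor": "s1", "temp_c": 20.0},
    {"sensor": "s1", "temp_c": 30.0},
    {"sensor": "s1", "temp_c": 21.0},
]
locations = {"s1": "boiler room"}  # hypothetical reference data

alerts = list(detect_spikes(enrich(filter_valid(feed), locations)))
for alert in alerts:
    print(alert)
```

Because each stage is a generator, events flow through one at a time rather than being collected first, which is the property that separates streaming analysis from after-the-fact batch analysis.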
Forrester Wave™: Big Data Streaming Analytics, Q1 2016
To help enterprises understand what commercial and open source options are available, Rowan Curran and I evaluated 15 streaming analytics vendors using Forrester's Wave methodology. Forrester clients can read the full report to understand the market category and see the detailed criteria, scores, and ranking of the vendors. Here is a summary of the 15 vendors' solutions we evaluated, listed in alphabetical order:
Most apps are dead boring. Sensors can help add some zing. Sensors are data collectors that measure physical properties of the real world such as location, pressure, humidity, touch, voice, and much more. You can find sensors just about anywhere these days, most obviously in mobile devices that have accelerometers, GPS, microphones, and more. There is also the Internet of Things (IoT), which refers to the proliferation of Internet-connected and accessible sensors expanding into every corner of humanity. But most applications barely use them. Data from sensors can help make your apps predictive to impress customers, make workers more efficient, and boost your career as an application developer.
Hadoop’s momentum is unstoppable as its open source roots grow wildly into enterprises. Its refreshingly unique approach to data management is transforming how companies store, process, analyze, and share big data. Forrester believes that Hadoop will become must-have infrastructure for large enterprises. If you have lots of data, there is a sweet spot for Hadoop in your organization. Here are five reasons firms should adopt Hadoop today:
Build a data lake with the Hadoop Distributed File System (HDFS). Firms leave potentially valuable data on the cutting-room floor. A core component of Hadoop is its distributed file system, which can store both huge files and huge numbers of files and scales linearly across three, 10, or 1,000 commodity nodes. Firms can use Hadoop data lakes to break down data silos across the enterprise and commingle data from CRM, ERP, clickstreams, system logs, mobile GPS, and just about any other structured or unstructured data that might contain previously undiscovered insights. Why limit yourself to wading in multiple kiddie pools when you can dive for treasure chests at the bottom of the data lake?
Enjoy cheap, quick processing with MapReduce. You’ve poured all of your data into the lake, and now you have to process it. Hadoop MapReduce is a distributed data processing framework that brings the processing to the data in a highly parallel fashion. Instead of serially reading data from files, MapReduce pushes the processing out to the individual Hadoop nodes where the data resides. The result: Large amounts of data can be processed in parallel in minutes or hours rather than in days. Now you know why Hadoop’s origins stem from monstrous data processing use cases at Google and Yahoo.
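The map, shuffle, and reduce steps can be sketched in a few lines of plain Python. This is a single-process illustration of the programming model, not Hadoop's API: the function names and sample data are hypothetical, and each string stands in for a block of data that would live on a separate node.

```python
from collections import defaultdict
from itertools import chain


def map_phase(chunk):
    """Map: emit a (word, 1) pair for every word in one chunk of text."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the values collected for each key."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    # On a real cluster each chunk would be mapped on the node that stores it;
    # here the map calls simply run one after another in a single process.
    mapped = chain.from_iterable(map_phase(chunk) for chunk in chunks)
    return reduce_phase(shuffle(mapped))


# Hypothetical data: each string represents a block held on a different node.
chunks = ["big data big insights", "data lake", "big lake"]
counts = word_count(chunks)
print(counts)
```

The map calls share no state, so they can run on different machines at once. That independence is what lets a framework push the work out to the nodes that already hold the data.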
The Obama 2012 campaign famously used big data predictive analytics to influence individual voters. They hired more than 50 analytics experts, including data scientists, to predict which voters would be positively persuaded by political campaign contact such as a call, door knock, flyer, or TV ad. Uplift modeling (aka persuasion modeling) is one of the hottest forms of predictive analytics, for obvious reasons: most organizations wish to persuade people to do something, such as buy! In this special episode of Forrester TechnoPolitics, Mike interviews Eric Siegel, Ph.D., author of Predictive Analytics, to find out: 1) What exactly is uplift modeling? and 2) How did the Obama 2012 campaign use it to persuade voters? (< 4 minutes)
William Shakespeare wrote that “What’s past is prologue.” Big data surely builds on our rich past of using data to understand our world, our customers, and ourselves. Now the world is flush and getting flusher in big data from cloud, mobile, and the Internet of things. What does it mean for enterprises? In a word: opportunity. Firms have taken to big data. Here are my four predictions for key enterprise big data themes in 2013:
Firms will realize that “big data” means all of their data. Big data is the frontier of a firm’s ability to store, process, and access (SPA) all of the data it needs to operate effectively, make decisions, reduce risks, and create better customer experiences. The key word in the definition of big data is frontier. Many think that big data is only about data stored in Hadoop. Not true. Big data is not defined by how it is stored. It can and will continue to reside in all kinds of data architectures, including enterprise data warehouses, application databases, file systems, cloud storage, Hadoop, and others. By the way, some predict the end of the data warehouse — but that’s nonsense. If anything, all forms of data technology will evolve and be necessary to handle the frontier of big data. In 2013, all data is big data.
Rowan Curran, Research Associate and TechnoPolitics producer, hosts this episode to ask me (your regular host) about The Pragmatic Definition Of Big Data. Listen (5 mins) to hear the genesis of this new definition of big data and why it is pragmatic and actionable for both business and IT professionals.
Podcast: The Pragmatic Definition Of Big Data Explained (5 mins)
A key role of IT operations is to keep a complex portfolio of applications running and performing. "Traditional monitoring dashboards generate lots of pretty charts and graphs but don't really tell IT operations professionals a whole lot," says Forrester Principal Analyst Glenn O'Donnell. Big data analytics will change that because sophisticated algorithms can "look for the little tremors that tell us something big is about to happen."
High Availability And Performance Are Top Goals For IT Ops
Asked what 5 nines (99.999%) of availability means, Glenn replies immediately, "5 nines of availability is 26 seconds of downtime per month." He adds, "If you want to capture just one 26-second event, you have to be polling every 13 seconds." Glenn knows his stuff. Listen to find out from Glenn how big data has a big place in the future of IT operations.
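Glenn's figures are easy to check. The short calculation below assumes a 30-day month and applies his rule of thumb of polling at half the length of the event you want to catch. The function name is hypothetical.

```python
def allowed_downtime_seconds(availability_pct, period_seconds):
    """Downtime budget, in seconds, for a given availability percentage."""
    return period_seconds * (1 - availability_pct / 100)


# Assumption: a 30-day month, which is 2,592,000 seconds.
SECONDS_PER_30_DAY_MONTH = 30 * 24 * 60 * 60

downtime = allowed_downtime_seconds(99.999, SECONDS_PER_30_DAY_MONTH)

# Rule of thumb from the quote: poll at half the duration of the event.
polling_interval = downtime / 2

print(f"Allowed downtime per month: {downtime:.2f} s")
print(f"Polling interval: {polling_interval:.2f} s")
```

Under these assumptions the budget comes to about 25.92 seconds a month and the polling interval to about 12.96 seconds, matching the rounded 26 and 13 in the quote.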
Every year the Center For Digital Strategies at Tuck chooses a technology topic to "provide MBA candidates and the Tuck and Dartmouth communities with insights into how changes in technology affect individuals, impact enterprises and reshape industries." This academic year the topic is "Big Data: The Information Explosion That Will Reshape Our World". I had the honor and privilege to kick off the series about big data at the Tuck School of Business at Dartmouth. I am thrilled that our future business leaders are considering how big data can help companies, communities, and government make smarter decisions and provide better customer experiences. The combination of big data and predictive analytics is already changing the world. Below is the edited video of my talk on big data predictive analytics at Tuck in Hanover, NH.