Joining in on the spirit of all the 2013 predictions, it seems we shouldn't leave data quality out of the mix. Data quality may not be as sexy as big data has been this past year. The technology is mature and reliable. The concept is easy to understand. It is also one of the few areas in data management with a recognized and adopted framework for measuring success. (Read Malcolm Chisholm's blog on data quality dimensions.) However, maturity shouldn't breed complacency. Data quality still matters, a lot.
Yet judgment day is here, and data quality is at a crossroads. Its maturity in both technology and practice is steeped in an old way of thinking about and managing data. Data quality technology is firmly seated in the world of data warehousing and ETL. While that world remains a significant portion of the enterprise data management landscape, the adoption of in-memory platforms, Hadoop, data virtualization, streams, and the like in business-critical applications and processes means that more and more data is bypassing the traditional platform.
The options for managing data quality are expanding, but not necessarily in a way that ensures data can be trusted or complies with data policies. Where data quality tools have provided value is in offering a workbench to centrally monitor, create, and manage data quality processes and rules. They created sanity where ETL spaghetti created chaos and uncertainty. Today, this value proposition has diminished as data virtualization, Hadoop processes, and data appliances create and persist new data quality silos. Worse, these silos often lack the monitoring and measurement needed to govern data. In the end, do we have data quality? Or are we back where we started?