Data Quality Reboot Series for Big Data, Part 3: Risky Data, Risky Business?

When you last pulled up a chair to this blog, we talked about data quality persistence and disposability for big data. The other side of the coin is this: should you even do data quality on big data at all?

So this post is dedicated to stepping outside the comfort zone once again and into the world of chaos. Not only may you not want to persist your data quality transformations, you may not want to cleanse the data at all.

Current thinking: Purge poor data from your environment. Put the word “risk” in the same sentence as data quality and watch the hackles go up on data quality professionals. It is like using salt in your coffee instead of sugar. However, the biggest challenge I see many data quality professionals face is getting lost in all the data because they feel they must remove every risk that bad data poses to the business. In the world of big data, you are clearly not going to be able to cleanse all of that data. A best practice is to identify the critical data elements that have the most impact on the business and focus your efforts there. Problem solved.

Not so fast. Even scoping the data quality effort may not be the right way to go. The time and effort it takes, along with how accessible the data is, may not meet the business's need to get information quickly. In that case the business has decided to take the risk, focusing on direction rather than precision.
