How to Repair Your Data - Thomas C. Redman - Harvard Business Review

No matter what, do not underestimate the data quality problem, nor the effort required to solve it. You must get in front of data quality.

Data warehousing is hard.  Building a model that works for the business users, with data quality that truly delivers "one version of the truth," takes dedication and a team that truly understands the business.

Address preexisting issues.

 There are some problems that have been created already, and you have no choice but to address these before you use the data in any serious way. This is time-consuming, expensive, and demanding work. You must make sure you understand the provenance of all data, what they truly mean, and how good they are. In parallel, you must clean the data.

We are currently doing this in my organization.  In fact, we are going to rebuild the entire data model.  Sometimes it's easier to start from scratch than to figure out what is wrong with the current model.  Of course, our model doesn't serve the business very well, which made the rebuild decision quite easy.

Prevent the problems that haven't happened yet.
...build controls (such as calibrating test equipment) into data collection; identify and eliminate the root causes of error;
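
To make "controls in data collection" concrete, here is a minimal sketch in Python of checking each record at the point of entry, so bad records are set aside with a reason instead of flowing into the warehouse. Everything in it is an assumption for illustration: the Reading record, its fields, the validate and collect helpers, and the range limits are hypothetical and do not come from the article.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Reading:
    # Hypothetical record shape, for illustration only.
    sensor_id: str
    taken_on: date
    value: float


def validate(reading, today, low=-50.0, high=150.0):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if not reading.sensor_id.strip():
        problems.append("missing sensor_id")
    if reading.taken_on > today:
        problems.append("date is in the future")
    # Written this way so that NaN also fails the range check.
    if not (low <= reading.value <= high):
        problems.append("value out of range")
    return problems


def collect(readings, today):
    """Split incoming records into accepted ones and rejected ones with reasons."""
    accepted, rejected = [], []
    for reading in readings:
        problems = validate(reading, today)
        if problems:
            rejected.append((reading, problems))
        else:
            accepted.append(reading)
    return accepted, rejected
```

The rejected list keeps the reasons next to each record, which is what you need when you go looking for the root cause of an error rather than just patching it.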

Data warehousing efforts also fail because, most of the time, it is the end users who find the errors.  When this occurs, getting the organization to trust the data becomes a challenge.  There is always the question of whether the data is right.  Proactively fix the data so that end users can trust it, and they will spend more time discussing strategy instead of fighting over data quality.
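
One way to find errors before the end users do is to run a simple quality report on every batch before it is published. The sketch below is only an illustration under my own assumptions: rows are plain Python dicts, and the quality_report and safe_to_publish helpers, along with the choice of checks (duplicate keys and missing required fields), are hypothetical rather than a standard method.

```python
def quality_report(rows, key, required):
    """Summarize duplicate keys and missing required fields in a batch of dict rows."""
    seen, duplicates = set(), set()
    missing = {field: 0 for field in required}
    for row in rows:
        k = row.get(key)
        if k in seen:
            duplicates.add(k)
        seen.add(k)
        for field in required:
            # Treat None and the empty string as missing; 0 is a valid value.
            if row.get(field) in (None, ""):
                missing[field] += 1
    return {
        "rows": len(rows),
        "duplicate_keys": sorted(duplicates, key=str),
        "missing": missing,
    }


def safe_to_publish(report):
    """Hold the batch back if any duplicate key or missing required field was found."""
    return not report["duplicate_keys"] and not any(report["missing"].values())
```

For example, quality_report(rows, "id", ["name"]) on a batch in which two rows share an id would list that id under duplicate_keys, and safe_to_publish would return False until it is fixed.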

 

Source: http://blogs.hbr.org/cs/2012/09/how_to_rep...