Clean data is a Utopian concept

Blog Posts: Mu Sigma
Published On: 29 September 2011

There’s a germophobe in every crowd. It’s usually the person with the anti-bacterial lotion.

But did you know there’s usually a germophobe in every data center? That’s right. It’s the person always harping about “clean data.”

Yes, clean data is important. And it’s certainly preferable to dirty data. But some people take it too far. I’m here to tell you that clean data is a Utopian concept. Why, you ask? Because once a database grows past the point where a single owner and editor is practical, it’s impossible to keep data quality at 100 percent. Much of the data is probably subject to frequent change. And to make matters worse, it’s often self-entered by your customers or prospects – all of whom are people (I presume). And people make mistakes.

Yes, it’s possible to clean up your data. There are ways to perform a data warehouse extreme makeover, with de-duping and lots of other tricks.
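
To make the de-duping idea concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions: the record fields (`name`, `email`), the helper names (`normalize`, `dedupe`) and the sample customers are all hypothetical, and real de-duping tools do far more than this.

```python
def normalize(record):
    """Build a comparison key from a self-entered record (hypothetical fields)."""
    # Collapse repeated whitespace and ignore case in the name.
    name = " ".join(record["name"].lower().split())
    # Trim stray spaces and ignore case in the email address.
    email = record["email"].strip().lower()
    return (name, email)


def dedupe(records):
    """Keep the first record seen for each normalized key."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Made-up sample data: the first two rows are the same person typed two ways.
customers = [
    {"name": "Ann Lee", "email": "ann@example.com"},
    {"name": "ann  lee ", "email": " Ann@Example.com"},
    {"name": "Bob Ray", "email": "bob@example.com"},
]

cleaned = dedupe(customers)
print(len(cleaned))  # prints 2
```

Notice what this does not catch: if the same customer shows up as “Anne Lee” with a different email address, the keys no longer match and the duplicate stays in. A trick like this shrinks the mess, but it doesn’t make it go away.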

But don’t ever expect that your data warehouse will be 100 percent clean. Because, like my kitchen at home, the very second you complete the cleansing, it gets dirty again.

Thoughts? Does anyone believe they have a truly clean data warehouse? I’d love to hear from you – please post a comment below.
