5. The war on error
The difference between good data and bad can be as small as a single dot. Penny Quirk, principal consulting manager at Robbins-Gioia's Records and Information Optimization Practice Area, says she once consulted on a major data integration project where everything seemed to go fine. Six months later someone opened a data table and found rows of symbols but no data.
“It was a character coding error,” says Quirk. “They used ellipses in some fields, and wherever someone had entered two dots instead of three it triggered the whole line of data to go corrupt.”
The company had to painstakingly re-create the database from a backup, searching for the ellipses, then replacing them with the actual data.
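A check like the one Quirk's client lacked is easy to run before import. The sketch below is a hypothetical pre-load scan, not their actual tooling: it flags any field containing a stray two-dot sequence that is not part of a full three-dot ellipsis, so suspect rows can be fixed before they corrupt the load.

```python
import re

# Matches exactly two consecutive dots: not preceded or followed by another dot,
# so a proper three-dot ellipsis ("...") passes cleanly.
TWO_DOTS = re.compile(r"(?<!\.)\.\.(?!\.)")

def flag_suspect_fields(rows):
    """Return (row_index, field_index, value) for every suspect field."""
    suspects = []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if TWO_DOTS.search(value):
                suspects.append((i, j, value))
    return suspects

rows = [
    ["order pending...", "ok"],
    ["shipment delayed..", "ok"],  # two dots instead of three: the killer typo
]
print(flag_suspect_fields(rows))  # → [(1, 0, 'shipment delayed..')]
```

Running a scan like this against the source data, before migration rather than six months after, turns a painstaking restore-and-repair job into a short cleanup pass.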
More often than not, though, the problem goes deeper than simple data entry errors or garbage in/garbage out. Most organizations fail to adequately plan when moving data between different operating systems or upgrading from older versions of SQL, says Quirk. They'll do it too quickly, using whatever resources are available now, with the hope of cleaning it up later. (A bad idea, she adds.) Worse, their test environments and production environments may not match, or they may test using a small subset of data, only to have big problems arise later with the data they didn't test.
“Organizations making dramatic changes in technology without putting forth the necessary time and effort to manage the data reconciliation, integration, and conversions can become victims of bad data,” Quirk says. “As data is moved from one source to another, the number of chances for it to become bad is astronomical.”
Quirk's advice? Don't expect IT departments to validate your data set. Get the power users who work with the data to help plan and test the integration. Before you decide to consolidate, look at all your data fields and identify the applications that may be pulling data from them. When possible, test with all your data, not just a subset, because even the tiniest errors can send you and your data into a world of pain.
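That "all your data, not just a subset" advice can be put into practice with a full-pass validation step. The sketch below is illustrative, not any particular migration tool; the schema and type checks are assumed for the example. The point is that it checks every row and collects every failure, instead of spot-checking a sample and hoping the rest looks the same.

```python
def check_row(row, schema):
    """Return a list of problems with one row; an empty list means clean."""
    problems = []
    for field, expected_type in schema.items():
        value = row.get(field)
        if value is None:
            problems.append(f"missing field {field!r}")
        elif not isinstance(value, expected_type):
            problems.append(f"{field!r}: expected {expected_type.__name__}, "
                            f"got {type(value).__name__}")
    return problems

def validate_all(rows, schema):
    """Check the full data set, not a subset, and report every bad row."""
    return {i: probs for i, row in enumerate(rows)
            if (probs := check_row(row, schema))}

schema = {"id": int, "amount": float}
rows = [
    {"id": 1, "amount": 9.99},
    {"id": "2", "amount": 9.99},  # the kind of error a small sample can miss
]
print(validate_all(rows, schema))  # → {1: ["'id': expected int, got str"]}
```

A sampled test would have a good chance of drawing only the clean rows; the full pass is slower but finds the one bad record before it goes to production.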
One final horror story illustrates just how big a small error can become.
Peter Teuten, president and CTO of Keane Business Risk Management Solutions, tells of a client that created an application to determine whether corrupt files were circulating in their systems. If the amount of corrupted data exceeded a certain threshold, the company would know to implement data protection processes.
The problem? They accidentally inverted the rule set for the data protection system; the more corrupt data it found, the better their systems appeared in the reports.
“The network was eventually infiltrated by a worm, which corrupted their files,” says Teuten. “They had to rebuild most of them from scratch, which cost them millions of dollars. All from a very simple configuration and management error -- two numbers were reversed.”
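Whether the actual bug transposed two numbers, as Teuten describes, or flipped a comparison, the effect is the same, and it is easy to reconstruct hypothetically. In the sketch below (an assumed threshold and assumed function names, not the client's real system), the intent was to alert once corruption exceeds a threshold, but the inverted rule reports the opposite:

```python
CORRUPTION_THRESHOLD = 0.05  # alert if >5% of files are corrupt (assumed value)

def status_buggy(corrupt_ratio):
    # Inverted rule: the more corruption found, the healthier the report looks.
    return "HEALTHY" if corrupt_ratio > CORRUPTION_THRESHOLD else "ALERT"

def status_fixed(corrupt_ratio):
    # Intended rule: corruption above the threshold triggers the alert.
    return "ALERT" if corrupt_ratio > CORRUPTION_THRESHOLD else "HEALTHY"

print(status_buggy(0.40))  # → HEALTHY  (worm damage goes unnoticed)
print(status_fixed(0.40))  # → ALERT
```

Note that the inverted version passes a casual glance: it compares the same numbers against the same threshold and produces plausible-looking reports. Only a test that feeds it known-bad data would catch it.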
If that doesn't scare you into approaching your next data management project with caution, nothing will.