
InfoWorld

The perils of dirty data

How important is data cleansing and validation? Read these tales of horror, and beware


What constitutes the “right level” will vary depending on how you use the data. “In the direct mail industry, getting 70 to 80 percent of your data correct is probably good enough,” he adds. “In the pharmaceutical industry, you want to be at 99 percent or better. But no company really wants, needs, or will pay for perfect data; it's just too expensive. The issue always is, how will it be used and at what point is it good enough?”

2. Dead men cast no votes
Data cleansing can be a matter of life and death -- literally. PR specialist Nancy Kirk was volunteering in the congressional elections of 2006, calling registered voters to get them to the polls, when she noticed something odd: Three out of ten voters she dialed were deceased and thus ineligible to vote (except in certain precincts in Chicago).

The problem of having data that is literally dead is not uncommon in the commercial world, and it has real consequences for the living.

Jim Keyser, president of The Keane Organization's Investor Retention and Communication Solutions division, has spent the past year rolling out an investor data quality program for Keane's clients, which include major insurance companies, mutual funds, and Fortune 500 firms.

Keyser says his team often finds that 8 to 15 percent of clients' data records contain anomalies such as mistyped Social Security numbers or outdated addresses. About one in five of those anomalies is a shareholder who has been dead for more than five years. In one case, a client had an “active” account for a shareholder who last drew breath more than 72 years ago.

“This isn't client negligence, it's just a naturally occurring problem,” Keyser says. Private companies go public, change names, and get acquired or spun off, and their shareholder data follows along, often for decades.

But the consequences can be greater than just money wasted on unnecessary mail. The biggest concerns are fraud and identity theft. Some stranger could be cashing the late shareholder's dividend checks, the rightful heirs could be denied their inheritance, or confidential company info could leak out.

The solution? Software such as Keane's Score application can identify data anomalies across different systems and flag them for review. But all companies must exercise due diligence, have good internal controls, and scrutinize their data on a regular basis, says Keyser.
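
As a rough illustration of what "flagging anomalies for review" can mean in practice, here is a minimal Python sketch. It is not how Keane's Score application works (the article does not describe its internals); the record fields, the five-year staleness threshold, and the idea of a date of death supplied by some external death index are all assumptions made for the example.

```python
import re
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Formatting check only: three digits, two digits, four digits.
SSN_PATTERN = re.compile(r"\d{3}-\d{2}-\d{4}")


@dataclass
class ShareholderRecord:
    # Hypothetical record layout, invented for this sketch.
    name: str
    ssn: str
    address_verified: date                # last date the address was confirmed
    date_of_death: Optional[date] = None  # assumed to come from an external death index


def find_anomalies(record: ShareholderRecord, today: date,
                   stale_after_years: int = 5) -> List[str]:
    """Return a list of flags for a human reviewer; an empty list means no flags."""
    flags = []
    if not SSN_PATTERN.fullmatch(record.ssn):
        flags.append("malformed_ssn")
    # Approximate a year as 365 days; precise enough for a review queue.
    if (today - record.address_verified).days > stale_after_years * 365:
        flags.append("stale_address")
    if record.date_of_death is not None:
        flags.append("holder_deceased")
    return flags
```

The function only flags records; it never changes or deletes them, which matches the article's point that anomalies should be surfaced for review rather than resolved automatically.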

“Virtually every business has this problem to some degree,” he says. “From a risk management point of view, the best practice is to make sure you're keeping it in check. Understanding how this natural phenomenon impacts you is a good first step.”

3. Duped by duplicates
User error is bad. User ingenuity can be worse. Take the case of the major insurance carrier that kept most of its customer data within a mainframe application from the 1970s. Data entry operators were instructed to first search the database for existing records before entering new ones, but the search function was so slow and inaccurate that most operators gave up and entered the records from scratch.
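
A search-before-insert step does not have to be slow. The sketch below shows one common approach, assuming customer records can be matched on a normalized name plus a five-digit postal code; the class, the field names, and the matching rule are hypothetical and are not taken from the carrier's system described above.

```python
import re
from typing import Dict, Tuple


def match_key(name: str, postal_code: str) -> str:
    """Build a lookup key that ignores case, punctuation, and name order."""
    tokens = re.findall(r"[a-z0-9]+", name.lower())
    # Sorting the tokens makes "Smith, John" and "John Smith" produce the same key.
    return " ".join(sorted(tokens)) + "|" + postal_code.strip()[:5]


class CustomerIndex:
    """In-memory index that returns an existing record instead of adding a duplicate."""

    def __init__(self) -> None:
        self._by_key: Dict[str, dict] = {}

    def add_or_find(self, name: str, postal_code: str) -> Tuple[dict, bool]:
        """Return (record, created); created is False when a match already exists."""
        key = match_key(name, postal_code)
        existing = self._by_key.get(key)
        if existing is not None:
            return existing, False
        record = {"name": name, "postal_code": postal_code}
        self._by_key[key] = record
        return record, True
```

Because the lookup is a single dictionary access, the check costs the operator no noticeable time. An exact key like this will still miss misspellings, so a real system would likely add fuzzy matching and a manual review step on top of it.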

Dan Tynan is contributing editor at InfoWorld.
Copyright © 2008. All rights reserved.