Most enterprises view big data as a very good thing. With the right analytics, it can generate meaning and business value. But as with many things, there can be too much of a good thing, say a number of information governance (IG) experts.
Their message is that enterprises need to do more than protect their data from theft or infection -- they need to get rid of some of it, for both economic and legal reasons.
Dumping data goes by a variety of names so far, including defensible disposition, defensible deletion and active expiration. Barry Murphy, cofounder and principal analyst at eDJ Group, prefers defensible deletion (DD).
More important than the label, however, is the principle behind it, Murphy wrote in a post in eDiscovery Journal: "Companies can reduce costs and decrease risks by proactively getting rid of unnecessary information."
Murphy told CSO Online that the cost of storage, both on-premises and in the cloud, does continue to decrease. "One could argue that the decreasing cost of storage combined with lower-cost information processing platforms like Hadoop makes keeping information in perpetuity economically viable," he said. "But the rate at which information grows is faster than the rate at which the cost of storage decreases. So much corporate information is either duplicate or unnecessary that the cost of retaining it is greater than that of getting rid of it."
Jim McGann, vice president of marketing for Index Engines, said in an interview with Government Technology last year that in the past five years he had seen organizations taking steps to "clean up the 'data lake' that has been generated."
The motivation is legal as well as economic, he said. Until about 15 years ago, organizations could save anything and easily hide the content that could become a liability, but he said that won't work these days. "Lawyers and judges are more tech savvy and they won't accept excuses about complexity and cost issues anymore," he said.
Barry Murphy agrees. "The cost and risk of eDiscovery can poke a giant hole in any economic assessment of information management costs," he said.
The rules governing electronic information differ from those for paper documents, since electronic information usually includes metadata, which can be important as evidence. One example is the date and time a document was written, which can be valuable in a copyright case.