Signs of a deluge: Brace for stringent storage requirements

The new administration's policies, if implemented, will affect storage and storage management in several areas

In a recent speech, President Obama called for all U.S. residents to have electronic health records within five years. If implemented, this would affect storage and storage management in several areas: increasing demand for storage at medical offices of every size, converting existing paper records, ensuring the privacy of the data, ensuring that standard data formats allow providers to share data, and increasing requirements for records retention. While many of these issues are potential fodder for this column, I'll start with retention.

Data retention software typically has a policy engine that lets you define one or more storage retention policies and an archiving process that takes the data and applies the retention policy, keeping the data in its original state for the specified period, then deleting it. Many data retention and archiving systems offer flexible policies and write-once, read-many (WORM) archives that retain the data for the stated period. They may also offer auditing programs to show what data has been accessed, and by which users.
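To make the policy-engine idea concrete, here is a minimal Python sketch of a retention check. It is an illustration only, not how any particular product works: the names `RetentionPolicy`, `ArchivedRecord`, `is_expired` and `purge_expired` are hypothetical, and the sketch assumes retention is measured in days from the date a record was archived, with `None` meaning "keep indefinitely."

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class RetentionPolicy:
    """Hypothetical policy: how long records of one kind must be kept."""
    name: str
    retain_days: Optional[int]  # None means keep indefinitely


@dataclass(frozen=True)
class ArchivedRecord:
    """Hypothetical archive entry; the data itself is never modified."""
    record_id: str
    archived_on: date
    policy: RetentionPolicy


def is_expired(record: ArchivedRecord, today: date) -> bool:
    """Return True once the record's retention period has fully elapsed."""
    if record.policy.retain_days is None:
        return False
    expires_on = record.archived_on + timedelta(days=record.policy.retain_days)
    return today >= expires_on


def purge_expired(
    records: List[ArchivedRecord], today: date
) -> Tuple[List[ArchivedRecord], List[ArchivedRecord]]:
    """Split records into (kept, deleted) according to their policies."""
    kept = [r for r in records if not is_expired(r, today)]
    deleted = [r for r in records if is_expired(r, today)]
    return kept, deleted


# Example policies (assumed values): e-mail kept seven years, medical kept forever.
EMAIL_POLICY = RetentionPolicy("corporate-email", retain_days=7 * 365)
MEDICAL_POLICY = RetentionPolicy("medical-record", retain_days=None)
```

A scheduled job would call `purge_expired` with the current date and then remove whatever lands in the deleted list. A real system would also write each deletion to an audit log.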


However, medical records can create some specific challenges. While it is standard to hold onto corporate e-mails for three to seven years and then delete them, medical records often need to be retained for the life of the patient -- or longer if the patient has children. This involves not only making the data as compact as possible so that the overall record size doesn't spiral out of control (digital X-rays and MRI data can be sizable), but also ensuring that the data remains readable for 50 years or more.

The only media we currently have that have been around for 50 years are paper and film. Even data from 25 years ago can be difficult to read -- TIFF files might be readable, but try opening a XYWrite file from 1984 or a PC File database. How the files are stored becomes a big issue too -- if you had a 5.25-inch floppy with XYWrite files on it, how would you read them? Backup tapes from 1984, even if the data is readable at all, would be in the proprietary format of the backup application they were written with -- got a copy of that?

Some of these issues can be solved by not taking the data offline -- leave it in second- or third-tier storage. Do your best to ensure that formats are open and run a refresh every 10 years or so, opening files and saving the data in newer formats if necessary. Other safeguards, such as encryption and the hashes used to verify that data hasn't been modified, become problematic over time as well -- hashes that are secure now will be easy to crack in 20 years, so how do you prove that data hasn't been accessed or modified? What this all boils down to is that it won't be possible to put a system in place and leave it, as you would with a filing cabinet. It'll take regular review and oversight by storage administrators to ensure that data is kept viable, secure and accessible.
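One way to approach the aging-hash problem during a periodic refresh is sketched below in Python, using only the standard `hashlib` module. This is an assumption about how such a refresh might work, not a description of any archiving product: the helper names `fingerprint`, `verify` and `reseal` are hypothetical, and the choice of SHA-256, SHA3-256 and BLAKE2b is only an example. The idea is to record digests from more than one algorithm and, while the old digests are still trusted, add a digest from a newer algorithm.

```python
import hashlib
from typing import Dict, Iterable


def fingerprint(
    data: bytes, algorithms: Iterable[str] = ("sha256", "sha3_256")
) -> Dict[str, str]:
    """Record one hex digest per algorithm for the archived bytes."""
    return {name: hashlib.new(name, data).hexdigest() for name in algorithms}


def verify(data: bytes, recorded: Dict[str, str]) -> bool:
    """Return True only if every recorded digest still matches the data."""
    if not recorded:
        return False  # nothing recorded means nothing can be proven
    return all(
        hashlib.new(name, data).hexdigest() == digest
        for name, digest in recorded.items()
    )


def reseal(data: bytes, recorded: Dict[str, str], new_algorithm: str) -> Dict[str, str]:
    """During a refresh, confirm the old digests, then add a newer one."""
    if not verify(data, recorded):
        raise ValueError("data no longer matches its recorded digests")
    updated = dict(recorded)
    updated[new_algorithm] = hashlib.new(new_algorithm, data).hexdigest()
    return updated
```

During each refresh cycle an administrator could call `reseal(data, recorded, "blake2b")`, or whatever algorithm is considered strong at that point, and store the updated digests separately from the data. Digests like these can only show whether data was modified. Showing who accessed it still depends on audit logs.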