In my post last week, I described some of the basic challenges in ensuring that data you delete actually stays deleted. In the context of personal computers and removable drives, these concepts can be confusing for users but are usually fairly well understood by IT pros. But IT pros are often confused when deleting data in the context of storage virtualization in their data centers.
Virtualizing storage has been enormously popular for several years. It's no wonder, either: By abstracting the underlying storage medium from how it's presented to storage users, you can pull off really cool tricks. Thin provisioning, snapshots, SSD wear-leveling, and automated storage tiering are all possible thanks to storage virtualization.
However, all this progress has come at a cost to data security. You can no longer simply overwrite a disk with random garbage and assume that anything that had been on that disk has been effectively obscured, as you can on your PC. Instead, there are almost certainly leftover bits and pieces of that data floating around on your storage device.
If you want to be reasonably sure that someone won't come across sensitive data by accident, you can succeed without too much difficulty. But if you're looking for an iron-clad guarantee that sensitive data will never see the light of day, you'll find it can get substantially more complicated and in fact almost impossible without committing to a mammoth undertaking.
Imagine you're in IT at a medium-size accounting firm. Your data center infrastructure consists of a few VMware vSphere virtualization hosts coupled with a Dell EqualLogic SAN. You use Veeam's Backup and Replication to back all that up daily to an ExaGrid NAS and monthly to tape archives. Maybe you use products from Citrix Systems, Hewlett-Packard, Microsoft, and/or NetApp -- it doesn't matter, as the issues are the same in this common storage scenario no matter what products you use.
The SAN. You're still not done. Now you have the SAN to worry about. In this case, the EqualLogic SAN is charged with storing the VMFS volume that you just painstakingly cleaned. Because you're concerned with maintaining uptime and being able to support the partners with fast restores when something is deleted by mistake, you've configured periodic snapshots to be taken of the volumes. As with VMware snapshots, the EqualLogic SAN uses free pool space to store blocks of data as they are changed -- marking the old blocks as part of the most recent snapshot.
Overwriting the active disk with zeroes has only obfuscated one version of the disk. There are still many versions of that disk (and your target files) sitting on the SAN in perfectly good shape, ready to be restored at a moment's notice. As with the VMware snapshots, you can't just delete the snapshots because doing so won't actually overwrite the blocks on the disk. Plus, if you did overwrite those blocks, you might break your disaster-recovery SLAs by hobbling your ability to restore quickly.
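To make the snapshot problem concrete, here is a minimal toy model in Python. It is only a sketch under stated assumptions: the `ToyVolume` class and its methods are hypothetical names invented for illustration, and the block-mapping scheme is a simplification, not a description of how EqualLogic (or any other SAN) is actually implemented. It shows the behavior described above: once a snapshot exists, a write to the active volume lands in free pool space, the old block stays intact, and deleting the snapshot discards the pointer without scrubbing the block.

```python
class ToyVolume:
    """Toy snapshotting volume (hypothetical; not any vendor's real design)."""

    def __init__(self, num_blocks):
        self.pool = {}        # physical block id -> bytes actually stored
        self.next_id = 0      # next free physical block id
        self.active = {}      # logical block number -> physical block id
        self.snapshots = []   # each snapshot is a frozen copy of the block map
        for lbn in range(num_blocks):
            self.write(lbn, b"\x00")

    def _allocate(self, data):
        # Take a fresh physical block from free pool space.
        pid = self.next_id
        self.next_id += 1
        self.pool[pid] = data
        return pid

    def write(self, lbn, data):
        pid = self.active.get(lbn)
        # A block is "pinned" if any snapshot still points at it.
        pinned = pid is not None and any(s.get(lbn) == pid for s in self.snapshots)
        if pid is None or pinned:
            # Redirect the write to free space; the old block is left untouched.
            self.active[lbn] = self._allocate(data)
        else:
            # No snapshot references this block, so overwrite it in place.
            self.pool[pid] = data

    def read(self, lbn):
        return self.pool[self.active[lbn]]

    def snapshot(self):
        self.snapshots.append(dict(self.active))

    def delete_snapshots(self):
        # Only the block maps are dropped; no physical block is overwritten.
        self.snapshots.clear()

    def raw_pool_contains(self, needle):
        # Scan every physical block, referenced or not.
        return any(needle in data for data in self.pool.values())


vol = ToyVolume(4)
vol.write(2, b"SECRET payroll")
vol.snapshot()
vol.write(2, b"\x00" * 14)          # "secure" overwrite of the active volume

active_clean = b"SECRET" not in vol.read(2)
still_in_snapshot = vol.pool[vol.snapshots[0][2]]

vol.delete_snapshots()
residue = vol.raw_pool_contains(b"SECRET")

print("active volume looks clean:", active_clean)
print("snapshot copy before deletion:", still_in_snapshot)
print("data still in pool after deleting snapshots:", residue)
```

In this toy model the active volume reads back as zeroes, yet the original bytes survive both in the snapshot and, after the snapshot is deleted, in an unreferenced block that nothing has overwritten.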
It gets even worse. Even if you find a way to delete the snapshots and clean all the free space on the SAN, you're still not done. Now you have to track down all the backups you've made since that data was created and securely delete them too. That could mean destroying tapes and trying to securely delete data from the ExaGrid appliance.
Even assuming you could find ways to do all this, you'd be left having spent countless hours performing secure deletions in the file server's file system, the hypervisor's file system, and the SAN's file system. You also would have to delete every SAN snapshot you've ever taken of that volume (which might encompass a large number of other VMs) and destroy nearly every backup you've ever made -- all just to ensure a handful of files were well and truly deleted beyond a shadow of a doubt.
There is one method that works if you start with it
Of course, people very, very rarely go through this kind of trouble to delete data. However, that doesn't keep them from promising they've done so -- a promise usually made out of simple ignorance. The cold, hard truth is that it is nearly impossible (and certainly impractical) to securely delete unencrypted data from an environment that uses multiple layers of storage virtualization without physically destroying the environment -- that is, physically shredding drives.