Not all that long ago, thin provisioning became a feature that storage vendors were tripping over themselves to offer in their products. Since then, it has become a de facto standard that you'll find in just about any virtualized storage array worth its salt.
Unfortunately, the huge potential of thin provisioning is often overshadowed by the complications of using it in production. That's a shame.
At its simplest, thin provisioning presents more storage to a given consumer than you actually reserve or allocate behind the scenes. This can be done on a SAN, but you'll also see it in various virtualization hypervisors, and you can potentially do both at the same time.
Let's say you have a virtualized file server that you've configured with a thin-provisioned 500GB virtual disk. That 500GB virtual disk is in turn sitting on a 1TB SAN volume that has also been thin provisioned on the storage array. In theory, you'll consume only as much space on your SAN as has actually been written into that 500GB virtual disk. If there's 250GB of data on the file server, then only that much needs to be consumed on the SAN -- significantly better than having that entire 1TB tied up.
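The core idea can be sketched in a few lines of Python. This is a toy model of the concept, not any vendor's API: the volume reports its full provisioned size, but the array only backs blocks on first write.

```python
# Toy model of a thin-provisioned volume: the consumer sees the full
# provisioned size, but the array allocates blocks lazily, on first write.

class ThinVolume:
    def __init__(self, provisioned_gb):
        self.provisioned_gb = provisioned_gb   # what the consumer sees
        self.allocated = set()                 # 1GB "blocks" actually backed

    def write(self, block):
        # Allocation happens only when a block is first touched
        self.allocated.add(block)

    @property
    def allocated_gb(self):
        return len(self.allocated)

san = ThinVolume(1024)        # 1TB SAN volume, thin provisioned
for block in range(250):      # the file server writes 250GB of data
    san.write(block)

print(san.provisioned_gb, san.allocated_gb)   # 1024 provisioned, 250 allocated
```

The consumer believes it has a terabyte; the array has committed only 250GB of real capacity.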
It's easy to see how thin provisioning can save you a bundle just by increasing utilization of expensive SAN storage. Instead of leaving that remaining 750GB of capacity stranded and unused, other consumers can make use of it -- or you could avoid purchasing in the first place.
In practice, though, it's rarely that tidy. Many file systems won't helpfully reuse deleted data blocks, instead opting to write to "fresh" space on the disk. If our file server sees a lot of data turnover with many writes and subsequent deletes (maybe an archiving system is being used), its thin-provisioned disk will quickly grow to the full 500GB even though the file system may still appear to have 250GB of free space. Even if you use a tool such as Microsoft's Sysinternals SDelete to overwrite all unallocated space with zeros, many SANs aren't smart enough to know that they can free the data and will instead keep the entire disk allocated.
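One way to picture the effect is a small simulation. The numbers and the always-write-to-fresh-space behavior here are simplified assumptions, not any real file system, but the shape of the problem is the same: logical usage stays flat while the SAN's allocation ratchets up.

```python
# Toy model (illustrative numbers) of a file system that writes new data
# to fresh blocks rather than reusing recently deleted ones.

PROVISIONED = 500      # GB presented by the thin virtual disk
allocated = set()      # GB "blocks" the SAN has had to back
live = set()           # blocks the file system considers in use
next_fresh = 0         # the file system's fresh-space write pointer

def write_gb(n):
    """Write n GB to fresh blocks; the SAN allocates each on first touch."""
    global next_fresh
    written = []
    for _ in range(n):
        block = next_fresh
        next_fresh += 1
        live.add(block)
        allocated.add(block)
        written.append(block)
    return written

# 250GB of baseline data, then archive churn: write 50GB, delete it, repeat
write_gb(250)
for _ in range(5):
    batch = write_gb(50)
    for block in batch:
        live.discard(block)    # deletion frees space logically, but the
                               # SAN never hears about it

print(len(live), len(allocated))   # 250 live, 500 allocated
```

After just five 50GB write-and-delete cycles, the thin disk is fully allocated on the SAN even though only 250GB of data actually survives.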
To compound the situation even further, let's say that our virtualized file server is being backed up by virtual machine-aware software (such as vRanger, Veeam, or esXpress). All of these tools make use of virtual machine snapshots to isolate the virtual machine's disk so that a consistent copy of it can be made. When you create a virtual machine snapshot, you're essentially telling the hypervisor to shunt subsequent disk writes into a separate snapshot file on the SAN volume. When you delete the snapshot, the hypervisor copies that snapshot data back into the main disk and deletes the snapshot file.
If the backup jobs take a significant amount of time to run, there may be enough turnover on the volumes during the backups that the snapshot file will grow substantially before the backup job completes and the snapshot gets deleted. From the SAN's perspective, all of the space required to store that snapshot also needs to be allocated and generally won't be freed afterward. Over time, our thin-provisioned 500GB volume with only 250GB of data in it could actually have caused well more than 500GB to be allocated on the SAN.
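A back-of-the-envelope model makes the math concrete. All of these figures are illustrative assumptions -- the actual delta size depends on your churn rate and backup window -- but they show how repeated snapshot cycles can push allocation past the disk's nominal size.

```python
# Illustrative model of nightly snapshot-based backups inflating
# allocation on a thin-provisioned SAN volume. Numbers are assumptions.

disk_size_gb = 500          # the thin-provisioned virtual disk
live_data_gb = 250          # what the file server actually holds
san_allocated_gb = live_data_gb

for night in range(5):                      # five nightly backup windows
    snapshot_delta_gb = 80                  # writes shunted into the snapshot
                                            # file while the backup job runs
    san_allocated_gb += snapshot_delta_gb   # the delta lands on fresh SAN blocks
    # Snapshot deleted: the delta merges back and logical usage is unchanged,
    # but the SAN typically never reclaims the blocks the delta occupied.

print(san_allocated_gb)     # 650GB backed for 250GB of data on a 500GB disk
```

Five backup windows later, a disk holding 250GB of data has caused 650GB of real SAN capacity to be consumed.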
This isn't much better than skipping thin provisioning completely and fully allocating everything from the get-go. If it requires that much more time to manage properly and doesn't really save you much, what's the point?
Exactly -- today, there generally isn't much point in using thin provisioning on production servers in most SAN environments. Given the huge potential to save big bucks on storage hardware if it all worked properly, this is a terrible waste. But fortunately, there's also some light at the end of the tunnel.