The performance of primary storage is more likely to affect the performance of your applications than the network, server CPUs, or the size and speed of server memory. That's because storage tends to be the biggest performance bottleneck, especially for applications that rely heavily on large databases.
That makes testing crucial. Before you buy, you need to know how well your applications perform on the specific storage hardware you're eyeing. As I noted last week, most vendors will provide a loaner for you to test-drive.
Unfortunately, testing storage is not always a straightforward process. It requires a solid understanding of how your applications use storage and how the storage device you're evaluating functions under the hood. Each situation is different; no single test or benchmark can give everyone the answer they're looking for, but you can take some basic evaluative steps to ensure your storage is up to the task.
Knowing what to test
The tests you run on your prospective storage hardware will largely depend upon what you're doing with it. Someone in search of storage for a video editing suite, for example, will have drastically different storage needs than someone who runs a large enterprise database. These tests fall into two familiar categories: throughput and random seek.
Raw data throughput is simply the amount of data you can move on or off storage hardware in a given period of time, usually expressed in MBps. Unfortunately for most enterprise applications, this figure is relatively meaningless. It also happens to be the most frequently quoted performance metric for marketing purposes, as well as the easiest to test and the easiest for hardware to excel at. It's no wonder throughput numbers lead to misconceptions about storage performance.
High levels of raw throughput accelerate the transfer of very large files, but most applications rarely -- if ever -- incur this kind of disk workload. Nonetheless, raw throughput tests can be very useful for validating the implementation of your storage network, whether Fibre Channel or iSCSI, though they do little to stress the disk subsystem itself.
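If you want a quick sanity check of sequential throughput before reaching for a full benchmark suite, something along these lines will do. Consider it a minimal sketch, not a substitute for a purpose-built tool: it assumes a multi-gigabyte test file already sits on the volume under evaluation (the path below is a placeholder), and that you clear the OS page cache -- or use a file much larger than server memory -- so you measure the storage rather than RAM.

    # Minimal sequential-read throughput check: read a large file in 1 MB
    # chunks and report MBps. This measures the path end to end, so clear
    # the page cache (or use a file much larger than RAM) before running.
    import time

    TEST_FILE = "/mnt/test_lun/bigfile.bin"   # placeholder: file on the array under test
    CHUNK = 1024 * 1024                       # 1 MB sequential reads

    total = 0
    start = time.time()
    with open(TEST_FILE, "rb", buffering=0) as f:
        while True:
            data = f.read(CHUNK)
            if not data:
                break
            total += len(data)
    elapsed = time.time() - start

    print(f"{total / (1024 * 1024) / elapsed:.1f} MBps sequential read")

Dedicated benchmark tools add queue depths, mixed block sizes, and steadier numbers, but even a crude test like this will quickly confirm whether a Fibre Channel or iSCSI path is delivering the bandwidth it should.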
Almost always, the more important metric is the number of small, randomized disk transactions that can be completed in a given period of time, expressed in I/O operations per second, or IOPS. This is the kind of workload most databases and email servers put on your disk. Poor transactional disk performance is the usual underlying reason for poor database performance, second only to poorly conceived application and database design -- but that's another story. The number of disk transactions that can be completed per second is a function of latency -- that is, the time a disk resource requires to serve each request.
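To see that latency relationship directly, you can issue random 4KB reads against a large test file and count how many complete per second. Again, this is a minimal sketch under simplifying assumptions -- a placeholder path, a single outstanding I/O, and no direct I/O, so a cold cache or a file much larger than RAM is essential:

    # Minimal random-read test: 4 KB reads at random offsets, reporting IOPS
    # and average per-operation latency. With one outstanding request at a
    # time, IOPS is simply the inverse of average latency.
    import os
    import random
    import time

    TEST_FILE = "/mnt/test_lun/bigfile.bin"   # placeholder: file on the array under test
    BLOCK = 4096                              # 4 KB, a typical database page size
    OPS = 5000

    size = os.path.getsize(TEST_FILE)
    fd = os.open(TEST_FILE, os.O_RDONLY)
    start = time.time()
    for _ in range(OPS):
        offset = random.randrange(0, size - BLOCK)
        os.pread(fd, BLOCK, offset)           # one random read at a time
    elapsed = time.time() - start
    os.close(fd)

    print(f"{OPS / elapsed:.0f} IOPS, {elapsed / OPS * 1000:.2f} ms average latency")

Real transactional benchmarks layer on concurrency, read/write mixes, and direct I/O, but even this crude version exposes the gulf between sequential MBps and random IOPS on the same hardware.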
Drilling into transactional performance
The type of disk you use and the intelligence of the array determine the transactional disk performance you can expect. The small and randomized nature of these workloads puts stress on storage hardware, because read/write heads run into physical limitations as they jump all over spinning mechanical disks.
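A back-of-the-envelope illustration, using ballpark figures rather than any vendor's spec sheet: a 15,000-rpm drive needs roughly 3.5ms to 4ms of average seek time plus about 2ms of rotational latency for each random request, or around 6ms per I/O. That works out to only about 170 random IOPS from a single spindle, even though the same drive can stream sequential data at well over 100MBps.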