Choosing a primary storage solution for your organization can be a complicated task. Perhaps the most important thing you can do to ensure that you end up with a well-designed and cost-effective solution is to have a solid understanding of the storage needs of your environment -- both present and future. If you fail to do this before you start to review the myriad storage solutions on the market today, you'll waste a lot of time and may end up with a solution that doesn't fit your needs.
Unless you are intentionally buying storage that will be dedicated to a single high-performance system, the best place to start is to look at literally every storage user in your environment. Highly redundant, high-performance primary storage is not cheap. So the best way to leverage the investment you're going to make is to ensure that it delivers the maximum benefit to the largest possible percentage of your infrastructure.
Here are some quick rules to follow and common pitfalls to avoid in the key areas of consideration.
Your current storage capacity requirements are probably the easiest thing to determine. Look at the disk space used by all of your servers, add that up, and voilà -- you have a number. But of course, it's not that simple. There are three core factors to consider that can cause your storage estimate to be inaccurate.
The third major capacity factor to consider is any planned usage of snapshots in the storage environment. Various storage vendors have implemented dramatically different snapshot technologies, so this is something you will need to revisit as you consider specific products. The amount of space that snapshots will use is typically related to the rate of data change on your storage volumes. Note that this is not the same as data growth -- data can and does frequently change without growing. Databases and e-mail servers are great examples of this type of turnover. A quick rule of thumb is to set aside between 50 and 100 percent of the actual size of your data for snapshots. How often you want to take snapshots, how long you want to keep them, the specific snapshot technology in use, and your data change rate will significantly affect this calculation.
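As a rough illustration, that rule of thumb can be expressed as a quick calculation. This is only a sketch of the 50-to-100-percent guideline above; the function name and parameters are my own, not from any vendor's sizing guide:

```python
def snapshot_reserve_gb(data_size_gb, low_rate=0.5, high_rate=1.0):
    """Rule-of-thumb snapshot reserve: 50 to 100 percent of live data size.

    Returns a (low, high) range in GB. The real number depends on snapshot
    frequency, retention period, the vendor's snapshot technology, and your
    actual data change rate -- treat this only as a starting estimate.
    """
    return (data_size_gb * low_rate, data_size_gb * high_rate)

# A 2TB (2,048GB) database volume would reserve roughly 1TB to 2TB:
low, high = snapshot_reserve_gb(2048)
print(low, high)  # 1024.0 2048.0
```

High-turnover volumes such as databases and mail stores belong at the top of that range; mostly static file shares can sit near the bottom.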
Judging the amount of performance that your disk architecture will need requires a good understanding of the various metrics used to judge disk performance. This topic can become extremely complex the deeper you delve into it, but there are some general rules that will usually allow you to make an accurate estimate of your needs.
Before you begin to consider the storage performance needs of your applications, the first thing you can do is to forget how important they are. That may sound ridiculous, but I have seen time and time again that organizations will unconsciously assume that because an application is "important," it requires more performance. Or, worse, because an application is "unimportant," they assume it will require less.
In the first case, you risk buying disk resources that you don't need. In the second case, you may end up with a poorly performing application. Important or not, nobody likes either of these situations. An application's relative importance within your organization may lead you to increase the amount of performance headroom you leave it to ensure you have room for unexpected demand spikes, but it shouldn't influence your initial analysis.
While it's certainly important to ensure that your chosen storage platform can meet these needs, focusing only upon maximum data throughput misses one of the most critical aspects of disk performance.
"Transactional throughput" refers to the total number of small, randomized disk transactions that your disk architecture can carry out in a given amount of time and is usually reflected in I/O per second (IOPS). In short, transactional performance is what will generally make or break your storage environment, not how much raw data you can move.
Most critical applications are based on structured data systems such as databases. These types of systems generally produce a significant amount of very small data transactions that are very rarely sequential. As such, your storage system's memory-based cache is not likely to be particularly effective, nor will you be using the bandwidth available in your storage interconnect to a great degree. The key factor in providing transactional throughput is the number and type of disks you have in your storage system. In these cases, the amount of time that it takes a drive head to seek data on one part of a disk platter and then skip over to the other side and get another piece makes a much larger difference than whether you have 4Gbps or 8Gbps FibreChannel. This type of load is why 15,000-rpm serial-attached disks and solid-state drives (SSDs) exist.
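To see why the number and type of disks dominates, it helps to run the back-of-the-envelope math. The sketch below uses common rough planning figures -- roughly 180 IOPS for a 15,000-rpm SAS drive, and a back-end write penalty of 4 for RAID 5 or 2 for RAID 10 -- which are illustrative assumptions, not measurements from any specific array:

```python
def disks_required(read_iops, write_iops, per_disk_iops, write_penalty):
    """Estimate how many spindles a random-I/O workload needs.

    Each logical write costs extra back-end I/Os depending on the RAID
    level (a penalty of 4 is commonly assumed for RAID 5, 2 for RAID 10).
    """
    backend_iops = read_iops + write_iops * write_penalty
    # Round up: you cannot buy a fraction of a disk.
    return -(-backend_iops // per_disk_iops)

# A 5,000 IOPS workload with a 70/30 read/write mix on RAID 5,
# assuming ~180 IOPS per 15,000-rpm SAS drive:
print(disks_required(3500, 1500, 180, 4))  # 53
```

Note that nothing in that calculation involves interconnect bandwidth -- which is exactly the point: for small random I/O, you run out of spindles long before you run out of FibreChannel.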
Because of this, it's absolutely critical to monitor the transactional disk load your applications exert on their storage. Make sure you monitor for long enough to capture the spikes generated by transient events such as month-end processing and backups. These are the events that will really push your storage platform and will often determine whether its implementation is judged a success.
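The monitoring itself can be as simple as periodically sampling the operating system's cumulative disk counters and converting the deltas into IOPS. Here's a minimal sketch of that arithmetic; the sample format is hypothetical, and in practice the raw counters would come from something like /proc/diskstats on Linux, iostat, or Windows performance counters:

```python
def iops_between(sample_a, sample_b):
    """Compute average read and write IOPS between two counter samples.

    Each sample is (timestamp_seconds, cumulative_reads, cumulative_writes)
    as reported by the OS since boot. Collect samples over weeks, not
    minutes, so spikes from month-end processing and backups are captured.
    """
    t0, r0, w0 = sample_a
    t1, r1, w1 = sample_b
    elapsed = t1 - t0
    return (r1 - r0) / elapsed, (w1 - w0) / elapsed

# Two samples taken 60 seconds apart:
read_iops, write_iops = iops_between((0, 1_000_000, 500_000),
                                     (60, 1_090_000, 530_000))
print(read_iops, write_iops)  # 1500.0 500.0
```

Averaging over long intervals hides the very spikes you care about, so keep the sampling interval short and retain the full history rather than a single average.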
If you have never done this kind of monitoring in your environment before, there's a good chance that your conception of each app's importance will not match the resources it consumes. I'd need a few more hands to count the number of times that I've seen an e-mail server generate as much or more transactional disk load than a critical line-of-business application.