Fixing the holes in solid-state drives

Vendors and standards take aim at the short life spans and limited reliability of SSDs

SSDs (solid-state drives) are getting a lot of attention as tier 0 drives in storage systems, providing a leap in performance over 15K-rpm hard drives, albeit at a higher price per gigabyte. Aside from the high cost, the downsides of SSDs in storage systems include limited endurance and reduced reliability in RAID configurations, issues that many vendors are working hard to ameliorate.

Endurance issues stem from the characteristics of the two primary types of flash memory available: single-level cell (SLC) and multilevel cell (MLC). Both have limited life spans in terms of the total number of writes per cell. To improve longevity, vendors use wear-leveling algorithms to distribute writes across all of the cells in an SSD, and overprovision drives with 20 to 30 percent more actual capacity than the devices report, mitigating the loss of worn-out cells.
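The two techniques work together, and a toy model makes the interaction concrete. The sketch below is illustrative only, not any vendor's actual algorithm: each write is directed to the least-worn surviving block, and the overprovisioned spare blocks keep the drive serviceable as cells wear out.

```python
# Toy model of wear leveling plus overprovisioning (illustrative only,
# not a real controller algorithm).

class WearLevelingSSD:
    def __init__(self, reported_blocks, overprovision=0.25, max_writes=10000):
        # Physical capacity exceeds the reported capacity by the
        # overprovisioning factor (20-30 percent in the article).
        physical = int(reported_blocks * (1 + overprovision))
        self.reported_blocks = reported_blocks
        self.max_writes = max_writes          # write endurance per block
        self.wear = [0] * physical            # writes endured by each block

    def live_blocks(self):
        # Blocks that have not yet exceeded their write endurance.
        return [i for i, w in enumerate(self.wear) if w < self.max_writes]

    def write(self):
        # Direct each write to the least-worn healthy block, spreading
        # wear evenly instead of burning out "hot" logical addresses.
        live = self.live_blocks()
        if len(live) < self.reported_blocks:
            raise IOError("drive exhausted: too few healthy blocks remain")
        target = min(live, key=lambda i: self.wear[i])
        self.wear[target] += 1
        return target
```

After many writes, the wear counts stay within one write of each other across all physical blocks, and the drive keeps its reported capacity until the spare pool is used up.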


Reliability is a different issue, stemming mostly from the fact that many storage systems use SSDs in RAID sets, where drives receiving identical workloads can reach end of life, and fail, nearly simultaneously.

Ironically, for SSDs, the best approach may be to create partitions that span multiple drives, instead of striping data across multiple devices. Because SSDs don't suffer from the mechanical failures common to hard drives, writing differing amounts of data to each drive in a set, so that drives reach end of life at different times, may produce better system reliability than RAID.
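The difference between the two layouts can be pictured with a toy model (mine, not from the article): spanning concatenates drives so writes fill one drive before touching the next, while striping rotates every chunk across all drives, wearing them in lockstep.

```python
# Illustrative toy model: which drive receives a given logical block
# under spanning (concatenation) versus striping.

def spanned_target(block, blocks_per_drive):
    # Spanning: logical blocks fill drive 0 completely, then drive 1,
    # and so on, so drives accumulate wear at different times.
    return block // blocks_per_drive

def striped_target(block, num_drives):
    # Striping: consecutive blocks rotate across all drives, so every
    # drive receives nearly identical wear and ages in lockstep.
    return block % num_drives
```

For 150 writes, spanning fills drive 0 (100 blocks) and only half-fills drive 1, while striping across four drives gives each drive almost exactly the same wear, which is precisely what makes simultaneous end-of-life failures likely.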

Also, because SSDs offer far better seek and transfer times than hard drives, writing to multiple spindles (drives, actually, since there aren't spindles in SSDs) doesn't produce much of a performance gain over a single drive.

Many first-generation flash drives are not optimized for performance, especially in storage systems. In particular, they don't use cache efficiently. When writing to flash, many small writes consume far more bandwidth and wear the drive faster than one large write. The cache on many first-generation controllers should be able to optimize this, but instead it passes the OS's small writes straight through, because the controller firmware is adapted from standard hard drive controllers rather than written specifically for SSDs. SSDs coming soon from Seagate, Intel, and other enterprise vendors will be optimized to provide longer lives and better performance in storage systems.
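The cache optimization the article describes is write coalescing. The following sketch is a hypothetical illustration, not any controller's firmware: small host writes are buffered and flushed to flash as one large write, so the flash endures one program operation instead of many.

```python
# Hypothetical sketch of write coalescing in a controller cache
# (illustrative only): buffer small host writes, flush as one
# large write once a threshold is reached.

class CoalescingCache:
    def __init__(self, flush_threshold=4096):
        self.flush_threshold = flush_threshold
        self.buffer = bytearray()
        self.flash_writes = 0     # physical program operations (wear proxy)

    def write(self, data: bytes):
        # Accumulate small host writes instead of passing each to flash.
        self.buffer.extend(data)
        if len(self.buffer) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One large program operation replaces many small ones.
        if self.buffer:
            self.flash_writes += 1
            self.buffer.clear()
```

With a 4KB threshold, sixteen 256-byte host writes cost the flash a single program operation; a pass-through controller of the kind the article criticizes would issue sixteen.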

To realize the speed potential of SSDs, storage vendors will also be moving to 6Gbps SAS, which can best take advantage of the lower seek times and higher transfer rates of flash drives. In addition, two standards bodies, the Storage Networking Industry Association and the Joint Electron Device Engineering Council, are creating standards for SSDs that address both endurance and performance.

Until prices come down, SSDs will probably continue to be used primarily as a replacement for cache buffers, increasing performance dramatically at a lower cost than high-priced cache memory. However, as enterprise-class SSDs become more widely available, prices should drop quickly; first-generation drives are already selling below cost in some cases. The new standards should help ensure that administrators are able to choose the best drives for their applications.


Copyright © 2008 IDG Communications, Inc.
