"Sharding," or splitting databases also helps Constant Contact scale easily, he says. "We can put a set of customers on Databases A, B and C, [which are] usually multiple instances of the same database with the same schema. We want them to be identical and on commodity hardware, to keep our operational costs low, so it's a non-event to roll out a new one. For 50,000 customers, we add two commodity database servers running MySQL," with no performance hit on other users, says Piesche.
Another vendor in this space is CommVault, which says its Simpana software platform cuts storage costs by up to 50 percent, administrative overhead by up to 80 percent and annual support costs by up to 35 percent by reducing the number of copies of data stored as well as the number of storage-related applications to buy and maintain.
Sanbolic claims its Melio5 data management platform provides high availability, application scale-out using shared-data server clusters and fast access to files of any size in a variety of workloads, and that it scales to more than 2,000 physical or virtual nodes and up to 65,000 storage devices. Its Latency Targeted Allocator allows the Melio platform to share server-side flash and SSDs within storage arrays, as well as conventional hard drives, across nodes. This eliminates single points of failure and hard-to-access data and application silos, says CEO and co-founder Momchil Michailov.
Some newer vendors package their software in the form of physical hardware with disks and processors. Gridstore's storage appliances virtualize storage controllers as well as data to eliminate single points of failure and provide faster, parallel data access from many servers. This allows the number of controllers to grow, tapping unused computing power to scale performance as well as capacity. However, it currently supports only Windows and file-based storage.
Another software-based approach to scalability is distributing "slices" of data over many physical databases. Cleversafe's dsNet technology, also sold as appliances, works best with more than a petabyte of storage, made up of objects larger than 50KB to 100KB. This is ideal, says President and CEO Chris Gladwin, for applications such as photo sharing over the Web.
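To make the idea of "slices" concrete, here is a deliberately simplified sketch. It does not reproduce Cleversafe's dispersal algorithm, which the article does not describe; it swaps in plain XOR parity, and the helper names `make_slices` and `rebuild` are invented for illustration. The object is cut into `k` slices plus one parity slice, each of which could sit on a different store, and any single lost slice can be rebuilt from the rest.

```python
from functools import reduce


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_slices(data: bytes, k: int) -> list:
    """Split data into k equal-length slices plus one XOR parity slice."""
    slice_len = -(-len(data) // k)  # ceiling division
    padded = data.ljust(slice_len * k, b"\0")
    slices = [padded[i * slice_len:(i + 1) * slice_len] for i in range(k)]
    parity = reduce(xor_bytes, slices)
    return slices + [parity]


def rebuild(slices: list, original_len: int) -> bytes:
    """Recover the object when at most one slice (marked None) is missing."""
    missing = [i for i, s in enumerate(slices) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can rebuild only one missing slice")
    if missing:
        present = [s for s in slices if s is not None]
        slices = list(slices)
        # XOR of all surviving slices equals the missing one.
        slices[missing[0]] = reduce(xor_bytes, present)
    # Drop the parity slice and the zero padding added by make_slices.
    return b"".join(slices[:-1])[:original_len]


photo = b"example object payload"
stored = make_slices(photo, k=4)   # five slices, one per hypothetical store
stored[2] = None                   # simulate one store going offline
recovered = rebuild(stored, len(photo))
```

A production dispersal scheme would tolerate several simultaneous losses; single parity is used here only because it fits in a few lines.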
As hard drives get bigger and faster, flash gets bigger and more reliable, and open-source storage stacks mature, some industry watchers see fundamental changes in how organizations cope with the data flood.
With the adoption of new nonvolatile memory technologies, the need for tiering data between solid state and spinning disk will diminish as new technologies become cost-competitive with higher-end Fibre Channel and SAS disks, predicts Shetti. Higher-capacity, lower-cost SATA disks will still have a role, but he says the complexity of packaging and different software interfaces will discourage users from mixing nonvolatile memory and SATA in the same system.
Within three to five years, flash drives will cost about the same as high-performance disk, says Hu Yoshida, CTO at Hitachi Data Systems. They are already at parity, he says, when the capacity of the hard drives is reduced by short-stroking (using only part of the disk capacity to speed performance by reducing the distance the read/write heads must travel to reach the data) and by writing data across multiple disks in RAID data protection configurations.
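The arithmetic behind Yoshida's parity claim can be sketched as follows. Every number here is made up for illustration and none comes from the article or from Hitachi; the function name `cost_per_usable_gb` is likewise hypothetical. The point is only that short-stroking and RAID each shrink the usable capacity, which raises the effective price of every gigabyte that remains.

```python
def cost_per_usable_gb(price: float, raw_gb: float,
                       stroke_fraction: float = 1.0,
                       raid_efficiency: float = 1.0) -> float:
    """Price divided by the capacity left after short-stroking and RAID.

    stroke_fraction: share of each disk actually used (1.0 = whole disk).
    raid_efficiency: share of capacity holding data rather than
                     redundancy (0.5 for simple mirroring).
    """
    usable_gb = raw_gb * stroke_fraction * raid_efficiency
    if usable_gb <= 0:
        raise ValueError("usable capacity must be positive")
    return price / usable_gb


# Illustrative inputs only: a 600GB disk at 200 currency units.
full_disk = cost_per_usable_gb(200.0, 600.0)
# Same disk short-stroked to a quarter of its capacity, then mirrored.
tuned_disk = cost_per_usable_gb(200.0, 600.0,
                                stroke_fraction=0.25, raid_efficiency=0.5)
penalty = tuned_disk / full_disk  # how many times dearer each usable GB is
```

With these assumed inputs, each usable gigabyte on the tuned disk costs eight times what it does on the untuned one, which is the kind of gap that could bring a pricier flash drive level with disk.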
Even commodity hard drives, however, will gain speed as vendors add more cache to them. Seagate expects such "hybrid" drives to make up most of its product line by the middle of the decade.