The new NAS: Fast, cheap, and scalable

New network attached storage technologies keep unstructured data under control

There are many reasons to complain about storage, but lack of variety is not one.

Never before has the market offered so many different NAS (network attached storage) solutions, ranging from consumer-grade desktop boxes to sophisticated clustered solutions. This incredible variety of NAS products is the market reaction to customer demand, fueled by relentless growth in the amount of data being stored digitally.

According to a study published in 2003 by the University of California at Berkeley, about 5 exabytes (that’s 5 million terabytes) of new data were created and stored in 2002. Another recent study, this one conducted by the Enterprise Strategy Group, predicts that the overall volume of data archived by government and corporations will grow to 27 exabytes by the year 2010.

Where is all that data being stored? Apart from a few exceptions, notably e-mail messages and transactional records, it doesn’t end up in a database. Most of the new information clogging our storage arteries is unstructured data created in a variety of formats, including images, sound snippets, data series, and, of course, office documents.

To keep up with this massive influx of unstructured data, storage vendors are reacting with products that offer more storage for the buck and place equal importance on serving files as on serving blocks of data. In fact, many vendors are combining those two equally important data-handling approaches in a single product.

Enter the world of unified storage. These next-generation solutions are capable of dishing out SAN volumes to attach to your servers for high transactional database performance while reading and writing files for a variety of clients. Only a few years ago, Network Appliance, the pioneer in this space, was unique in offering unified storage. Today various vendors have followed in NetApp’s footsteps, offering combined NAS plus SAN solutions targeting customers ranging from home users to Fortune 500 companies.

These new systems, in addition to clustered NAS solutions that offer improved performance and scalability, represent the best answer yet for enterprises looking to contend with the ever-increasing flow of unstructured data (see case study).

A new generation of NAS

“Our customers are dissatisfied with the cost and complexity of traditional storage systems,” says Brett Goodwin, vice president of marketing and business development at Isilon Systems. Goodwin adds that a company such as Kodak, which manages more than 1 billion digital images shared across 23 million online users, could never have solved its business problems given the performance and scalability limitations of traditional systems.

In its simplest form, a NAS system for corporate use consists of a standard x86 server running a modified flavor of Linux or Microsoft WSS (Windows Storage Server). The server can either share an enclosure with disk drives or reside in its own dedicated enclosure and add storage capacity with external modules. Resilient solutions add an additional server for active fail-over.

Unified storage solutions give customers more flexibility by offering a common capacity bucket for file systems and SAN volumes alike, making it easier to reach higher utilization on storage devices. These systems frequently handle SAN via iSCSI, a protocol that is a good complement to the traditional file serving protocols for Linux/Unix and Windows, although some products — such as the FAS line from NetApp — also include Fibre Channel connectivity. On the client side, ubiquitous iSCSI initiator software and Gigabit Ethernet NICs make for inexpensive SAN access.

Even these systems, however, can only scale up to a point. They may not satisfy customers who need to manage large files in big volumes quickly. For example, to work around a file system that can’t grow beyond 16TB, customers may have to break their files across multiple file systems, which increases complexity and vulnerability.
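
To make that ceiling concrete, here is a minimal sketch that counts how many separate file systems a data set would need. Only the 16TB limit comes from the text above; the function name and the 100TB archive are hypothetical, chosen purely for illustration.

```python
import math

FS_LIMIT_TB = 16  # per-file-system ceiling cited in the article


def file_systems_needed(dataset_tb: float, limit_tb: float = FS_LIMIT_TB) -> int:
    """Return the minimum number of file systems needed to hold dataset_tb."""
    if dataset_tb < 0 or limit_tb <= 0:
        raise ValueError("dataset_tb must be >= 0 and limit_tb must be > 0")
    return math.ceil(dataset_tb / limit_tb)


if __name__ == "__main__":
    # Hypothetical 100TB archive: each extra file system is one more
    # mount point, export, and backup job for an administrator to track.
    print(file_systems_needed(100))  # prints 7
```

Every file system beyond the first also forces a decision about which files live where, which is the added complexity the paragraph above describes.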

“Namespace aggregation and virtualization create the illusion of a single unified system but don’t solve underlying technical constraints,” Goodwin says.

Reaching out to the cluster

To meet the demands of customers such as Kodak, Isilon designed IQ, a clustered storage solution with ambitious goals. IQ is designed to support file systems 20 to 50 times larger and 15 to 20 times faster than most NAS solutions. Its OneFS file system is highly reliable and self-healing, with a symmetric, distributed architecture that makes managing thousands of terabytes easy. In addition, IQ offers different node types that favor either performance or capacity and that communicate over fast InfiniBand connections on the back end.

It’s worth noting that, although Isilon nodes mount SATA (serial ATA) drives, OneFS doesn’t rely on traditional hardware RAID and can recover quickly from two simultaneous failures. “With a 10-node cluster, we can rebuild a 500GB SATA drive in two and a half hours,” Goodwin says. “It could take up to 24 hours on a traditional NAS system.”
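
Goodwin's figures can be turned into average rebuild throughput with a little arithmetic. The sketch below assumes decimal units (1GB = 1,000MB) and a full 500GB rebuild in both cases; the function name is hypothetical, and the 24-hour figure is the quoted upper bound, not a measurement.

```python
def rebuild_rate_mb_per_s(drive_gb: float, hours: float) -> float:
    """Average rebuild throughput in MB/s, assuming 1GB = 1,000MB."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    return drive_gb * 1000 / (hours * 3600)


if __name__ == "__main__":
    cluster = rebuild_rate_mb_per_s(500, 2.5)     # 10-node cluster, per the quote
    traditional = rebuild_rate_mb_per_s(500, 24)  # quoted worst case, traditional NAS
    print(f"{cluster:.1f} MB/s vs {traditional:.1f} MB/s "
          f"({cluster / traditional:.1f}x faster)")
    # prints: 55.6 MB/s vs 5.8 MB/s (9.6x faster)
```

The ratio is simply 24 divided by 2.5, so the quoted numbers imply a rebuild nearly 10 times faster on the cluster.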

Moreover, Isilon IQ 4.0 clusters can grow in capacity to as much as 528TB simply by adding more nodes, a quick operation that doesn’t affect running applications (see “Scalable NAS: Just What the Doctor Ordered,” page 34). Files are striped across all nodes, which translates into faster performance as the cluster grows. It also speeds recovery operations because more boxes will be working to rebuild the missing files.
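
The general idea of striping a file across nodes can be sketched in a few lines. This is a simple round-robin layout written for illustration only: it is not OneFS's actual on-disk format, it includes no parity or other protection data, and the function names, node count, and chunk size are all assumptions.

```python
from typing import Dict, List


def stripe(data: bytes, node_count: int, chunk_size: int) -> Dict[int, List[bytes]]:
    """Split data into fixed-size chunks and assign them to nodes round-robin."""
    if node_count <= 0 or chunk_size <= 0:
        raise ValueError("node_count and chunk_size must be positive")
    layout: Dict[int, List[bytes]] = {node: [] for node in range(node_count)}
    for index, offset in enumerate(range(0, len(data), chunk_size)):
        # Chunk number `index` lands on node `index % node_count`.
        layout[index % node_count].append(data[offset:offset + chunk_size])
    return layout


def reassemble(layout: Dict[int, List[bytes]]) -> bytes:
    """Rebuild the original data by reading chunks back in round-robin order."""
    node_count = len(layout)
    total_chunks = sum(len(chunks) for chunks in layout.values())
    return b"".join(
        layout[i % node_count][i // node_count] for i in range(total_chunks)
    )


if __name__ == "__main__":
    payload = bytes(range(10))
    layout = stripe(payload, node_count=3, chunk_size=2)
    for node, chunks in layout.items():
        print(node, [list(chunk) for chunk in chunks])
    assert reassemble(layout) == payload
```

Because consecutive chunks sit on different nodes, a large sequential read can pull from every node at once, which is why adding nodes tends to raise throughput in a layout of this kind.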

Isilon’s approach isn’t the only way to handle clustered NAS, however, and it may not be the best way to address every storage scenario. For example, an Isilon system has excellent sequential performance because every node contributes to moving a large file quickly, but random-access performance may not be as good.

The ONStor Bobcat Series NAS Gateway is another kind of clustered solution that takes a completely different approach. “Our architecture excels with random performance,” says Jon Toor, ONStor’s vice president of marketing.

ONStor isolates file system management from the underlying storage. In fact, the Bobcat gateways can consolidate capacity from many popular storage arrays, which makes them a convenient way to put already-installed equipment to good use while improving NAS performance, scalability, and manageability.

Each Bobcat can create numerous virtual servers that isolate specific applications or performance targets. Moving virtual servers across gateways is transparent to the user and simplifies balancing an uneven load or optimizing performance for a demanding virtual server.

Not all NAS is created equal

It’s fair to say that clustered NAS may not be a major purchase target at present, but that hasn’t stopped many vendors from improving their NAS products in other ways. For example, earlier this year EMC released the MPFSi (Multipath File System over iSCSI), a solution that channels access to NFS and CIFS file systems over iSCSI.

MPFSi adds a layer of complexity to NAS because it requires Fibre Channel-to-iSCSI bridging on a Connectrix MDS switch, and both a proprietary agent and iSCSI initiator software must be loaded on each client. In return, EMC claims performance as much as four times that of plain NAS.

“[MPFSi is] not a clustered NAS approach, but it goes after the same problem,” says Brian Garrett, technical director of the Enterprise Strategy Group’s ESG Lab.

Similarly, Sun recently added the StorageTek 5320, a NAS model based on a new AMD architecture that performs “at least 50 percent higher than the previous 5310,” according to Sun, and pushes scalability up to 179TB.

Other vendors are either nurturing existing solutions based on parallel file systems (Exanet, IBM, Ibrix, Panasas, and SGI come to mind) or forming close partnerships with other vendors to bring that technology into their offerings, as Hewlett-Packard and Microsoft have done with Polyserve.

Will these and the other clustered NAS products mentioned earlier remain just niche solutions, indispensable to a few customers but largely ignored by the majority of companies? Perhaps the most interesting answer to that question comes from NetApp, a company that, ironically, is a frequent target of “beat the old-fashioned NAS” campaigns. (Remember the “NotApp” ad from Polyserve?)

“Customers’ needs, especially large, skilled customers, are clearly now at a stage where they need solutions that go beyond the scale of one or two boxes,” declares Rich Clifton, vice president and general manager for NetApp’s enterprise datacenter and applications business unit. “But they also need no compromise in the simplicity of management and the simplicity of deployment.”

NetApp’s forthcoming product, code-named GX, will be a new solution based on those criteria, according to Clifton.

“GX is a breakthrough architecture able to assemble a set of storage devices and allowing customers to treat them as one,” Clifton explains. “This is a system designed to be very scalable to very high node counts and very high IOPS [I/O operations per second].”

If NetApp can deliver on its promises, the release of GX, or whatever the official name of that product line turns out to be, could mark the beginning of a new era, one in which clustered NAS solutions are not just a class of sophisticated toys for scientists and researchers but a commonplace enterprise tool. If the expected massive increase in unstructured data is any indication, many companies are likely to add next-generation NAS to their toolboxes in the coming years.

Copyright © 2006 IDG Communications, Inc.