Over the last several weeks, I've delved into forgotten aspects of building an IP storage network and how to best leverage it with both NFS and iSCSI -- the two dominant IP storage protocols used in virtualization. Throughout that time, I've received a bunch of queries from readers, all united by one question: Which is better, NFS or iSCSI?
As with many hotly debated IT subjects, the choice between any two popular competing technologies is less about which is better overall and more about which is best for solving the challenge at hand. NFS and iSCSI are no different. Both have strengths and weaknesses depending on the situation. But the future of storage -- which will make geographically diverse storage clustering a reality -- may significantly factor into your choice of protocol.
File vs. block
As I mentioned in last week's post, NFS and iSCSI couldn't be much more different, either in their implementation or history. NFS was developed by Sun Microsystems in the early 1980s as a general-purpose file sharing protocol that allowed network clients to read and write files to a server across a network. iSCSI came along much later, in the early 2000s, as an IP-based alternative to Fibre Channel. Like Fibre Channel, it encapsulates block-level SCSI commands and ships them across a network.
The key difference is where the file system is implemented and managed. In file-level implementations such as NFS, the server or storage array hosts the file system, and clients read and write files into that file system. In block-level implementations such as iSCSI and Fibre Channel, the storage array offers up a collection of blocks to the client, which then formats that raw storage with whatever file system it decides to use.
Though this distinction has many ramifications, perhaps the most important is that in block-level protocols such as iSCSI (and Fibre Channel), the storage array generally isn't aware of what it is storing. All it knows is that it has allocated a collection of blocks and which iSCSI client(s) might have access to them. Conversely, in file-based protocols such as NFS, the storage array has full visibility into all of the application data stored on it -- whether that's general file sharing data or the files that might make up a collection of virtual machines.
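To make the visibility difference concrete, here is a toy sketch in Python. It is not how any real array works: the class and method names (BlockTarget, FileServer, what_the_array_knows) are invented purely for illustration, and it assumes nothing beyond the file-versus-block split described above.

```python
class BlockTarget:
    """Toy block-level array: it stores numbered blocks of opaque bytes."""

    def __init__(self, n_blocks: int, block_size: int = 512):
        self.block_size = block_size
        self.blocks = [bytes(block_size)] * n_blocks

    def write_block(self, lba: int, data: bytes) -> None:
        # The client's file system decides what goes where; the array just stores it.
        if len(data) != self.block_size:
            raise ValueError("writes must be exactly one block long")
        self.blocks[lba] = data

    def read_block(self, lba: int) -> bytes:
        return self.blocks[lba]

    def what_the_array_knows(self) -> dict:
        # No file names, no notion of which blocks belong to which VM.
        return {"allocated_blocks": len(self.blocks)}


class FileServer:
    """Toy file-level array: it hosts the file system and sees every file."""

    def __init__(self):
        self.files = {}

    def write_file(self, path: str, offset: int, data: bytes) -> None:
        buf = bytearray(self.files.get(path, b""))
        if len(buf) < offset:
            buf.extend(bytes(offset - len(buf)))  # zero-fill any gap
        buf[offset:offset + len(data)] = data
        self.files[path] = bytes(buf)

    def what_the_array_knows(self) -> dict:
        # The array can see each file and its size, so it can act per file.
        return {path: len(data) for path, data in self.files.items()}
```

The block target can report only how many blocks it has handed out, while the file server can enumerate every file it holds. That per-file view is what array-side features can build on.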
On the network
NFS and iSCSI are also significantly different from a networking perspective. With NFS, additional throughput and redundancy are achieved primarily through network-based link aggregation, along with careful balancing of storage connections over multiple array-side IP address aliases so that the load is actually spread across links. iSCSI, on the other hand, has built-in multipathing capabilities and, when used with vendors that provide support for it, can supply more advanced load balancing algorithms that balance storage traffic intelligently over many server and array-side storage paths.
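A rough Python sketch shows why the IP aliases matter for NFS. It assumes a link-aggregation policy that picks a link by hashing the source and destination IP pair (real switches and hosts offer several hash policies, so treat this one as hypothetical), and it models iSCSI multipathing as plain round-robin. The function names and addresses are made up for the example.

```python
import zlib
from collections import Counter
from itertools import cycle


def lag_link(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Pick an aggregated link from a hash of the address pair (assumed policy)."""
    return zlib.crc32(f"{src_ip}>{dst_ip}".encode()) % n_links


def nfs_over_lag(src_ip: str, dst_ips: list, n_links: int, n_ios: int) -> Counter:
    """Count I/Os per link when I/O is spread evenly over the given array IPs."""
    targets = cycle(dst_ips)
    return Counter(lag_link(src_ip, next(targets), n_links) for _ in range(n_ios))


def iscsi_round_robin(n_paths: int, n_ios: int) -> Counter:
    """Count I/Os per path when the initiator rotates through every path."""
    paths = cycle(range(n_paths))
    return Counter(next(paths) for _ in range(n_ios))


# One array-side IP: every I/O hashes to the same link, however many links exist.
single_ip = nfs_over_lag("10.0.0.10", ["10.0.0.50"], n_links=4, n_ios=1000)

# Several aliases: traffic can spread, but only as evenly as the hash allows.
aliases = [f"10.0.0.{n}" for n in range(50, 54)]
multi_ip = nfs_over_lag("10.0.0.10", aliases, n_links=4, n_ios=1000)

# Round-robin multipathing spreads I/O evenly across paths by construction.
multipath = iscsi_round_robin(n_paths=4, n_ios=1000)
```

Under these assumptions a single server-to-array address pair always lands on one link, which is why NFS designs lean on multiple aliases and careful placement, while the multipath case needs no such planning.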
In both cases, 10Gbps Ethernet makes multipathing far less important as a performance measure for the vast majority of organizations, for whom throughputs approaching 1GBps are simply unthinkable (at least today). However, iSCSI retains an edge over NFS in this area -- especially when aggregating multiple 1Gbps Ethernet links.
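The arithmetic behind those figures is simple enough to check. The snippet below converts nominal link speeds to decimal megabytes per second and deliberately ignores Ethernet, TCP/IP, and storage protocol overhead, so real-world numbers will come in lower. The function name is my own.

```python
def line_rate_mb_per_s(gbps: float) -> float:
    """Convert a nominal link speed in Gbit/s to MB/s (decimal), ignoring overhead."""
    return gbps * 1000 / 8  # 1 Gbit = 1000 Mbit, 8 bits per byte


one_gig = line_rate_mb_per_s(1)        # a single 1Gbps link
four_by_one_gig = 4 * one_gig          # best case for four aggregated 1Gbps links
ten_gig = line_rate_mb_per_s(10)       # a single 10Gbps link

print(one_gig, four_by_one_gig, ten_gig)  # 125.0 500.0 1250.0
```

A single 10Gbps link carries more than four perfectly balanced 1Gbps links combined, which is why balancing quality matters most on 1Gbps networks.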
From a network security standpoint, iSCSI also has an edge. In addition to the source-IP-based security restrictions that both NFS and iSCSI support, iSCSI has built-in support for bidirectional Challenge-Handshake Authentication Protocol (CHAP), which prevents unauthorized servers from attaching to storage resources and allows servers to validate the authenticity of the storage array they're connecting to.
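For readers who haven't looked inside CHAP, here is a minimal Python sketch of the exchange in both directions. It uses the MD5 form of CHAP defined in RFC 1994, where the response is a hash of an identifier byte, the shared secret, and a random challenge. The secrets and variable names are placeholders, and a real iSCSI login wraps this in its own negotiation that the sketch leaves out.

```python
import hashlib
import hmac
import os


def chap_response(identifier: int, secret: bytes, challenge: bytes) -> bytes:
    """Compute an MD5 CHAP response: hash(identifier byte + secret + challenge)."""
    return hashlib.md5(bytes([identifier]) + secret + challenge).digest()


def chap_verify(identifier: int, secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Recompute the expected response and compare it in constant time."""
    expected = chap_response(identifier, secret, challenge)
    return hmac.compare_digest(expected, response)


# Placeholder secrets, one per direction (assumptions for this example only).
INITIATOR_SECRET = b"initiator-secret-example"
TARGET_SECRET = b"target-secret-example"

# Direction 1: the array challenges the server before granting access.
challenge_1 = os.urandom(16)
server_reply = chap_response(1, INITIATOR_SECRET, challenge_1)
server_is_authorized = chap_verify(1, INITIATOR_SECRET, challenge_1, server_reply)

# Direction 2: the server challenges the array to confirm it is genuine.
challenge_2 = os.urandom(16)
array_reply = chap_response(2, TARGET_SECRET, challenge_2)
array_is_genuine = chap_verify(2, TARGET_SECRET, challenge_2, array_reply)

# A server that does not know the secret cannot produce a valid reply.
impostor_reply = chap_response(1, b"wrong-secret", challenge_1)
impostor_accepted = chap_verify(1, INITIATOR_SECRET, challenge_1, impostor_reply)
```

The secret itself never crosses the wire, only the challenge and the hashed response, and using a separate secret for each direction is what makes the check bidirectional.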
One common misconception about modern NFS implementations is that they are UDP/IP based. This often springs from the fact that NFS version 2 was -- except for a few custom implementations -- entirely UDP-based. While UDP is a relatively low-latency IP transport, it lacks the security and delivery assurance benefits that the stateful connection tracking present in TCP/IP offers. Starting with NFS version 3, TCP became a supported transport. This is what most NFS-based storage arrays and hypervisors, such as VMware vSphere, use today, putting NFS on par with the TCP/IP-based iSCSI.
Looking toward the future
Today, iSCSI would seem to be the clear winner -- at least from a networking perspective -- because it delivers better multipathing support and a higher degree of end-to-end security. Yet NFS retains a significant advantage when it's leveraged properly on the array side, because it gives the array visibility into what the virtualization stack is doing with its storage and lets the array participate intelligently in accelerating, snapshotting, and deduping that storage. It may be that those array-based intelligence benefits, combined with multipathing and security improvements coming to NFS client implementations when NFS 4.1 arrives, end up tipping the scales in NFS's favor over the long run.