We've all heard about how software-defined networking will let us run applications and custom code on our switches, but there haven't been many examples of exactly what that means in the real world.
In this week's New Tech Forum, Andrew Warfield, co-founder and CTO of Coho Data, takes us through how Coho uses SDN applications to get past bottlenecks imposed by flash storage and file-sharing protocols such as NFS. -- Paul Venezia
Data, storage, and SDN: An application example
A lot of the usual discussion around SDN is about how the introduction of more flexible networks is going to solve a bunch of very real problems from a network operations perspective. The rise of server virtualization has made provisioning and isolation of network resources harder than it already was, and SDN promises to make it better. Similarly, large organizations like Microsoft and Google are talking about the wins they're getting in terms of explicit, wide-area traffic management within large-scale enterprise networks.
What's really exciting about these discussions is the idea that new applications will emerge. Fine-grained, data path programmability in the network might actually change how we approach broader issues in application design, and emerging applications might integrate directly with network infrastructure. To date, though, there haven't been many clear examples of application use cases for the exciting new functionalities that SDN switching supposedly offers.
In this article, I'd like to tell you a story of a storage application use case. For the past two years, we've been working on an enterprise storage system that embeds SDN switching hardware directly within our platform. We've worked with OpenFlow and with chip-set APIs to manipulate TCAMs (Ternary Content Addressable Memory) and forwarding tables directly. Despite the fact that SDN hasn't been broadly deployed in many of our customer networks yet, we're able to distill concrete value out of today's SDN switch hardware by using it as an embedded component of our storage system. And it is paying off spectacularly.
Why the network hasn't (really) mattered until now
Let's start with some background. For the past two decades, vendors have built big boxes full of spinning disks, aggregating them together with techniques like RAID, then exporting some abstraction like a virtual block device or file system over the network.
From a performance perspective, these spinning disks are awful. If you access them sequentially, modern disks will offer you about 100MBps of data -- in other words, at best, a single disk can just about saturate a 1Gb link. Unfortunately, this never happens. Random I/O incurs seeks on disks and throughput falls through the floor. With random I/O, that same disk will deliver 2MBps or less. The broad deployment of virtualization has meant that enterprise storage systems are serving more concurrent workloads (lots of VMs) and more opaque workloads (virtual hard disks, instead of individual documents), a combination broadly known as the "I/O blender" effect.
As a result, in almost all situations, running a single fat pipe between the array and the network has been sufficient. The bottleneck has always been the disks.
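As a rough illustration of the arithmetic behind that claim, the sketch below converts the disk throughput figures quoted above into a fraction of a 1Gb link. The function name is hypothetical, and the calculation assumes decimal units and ignores protocol overhead.

```python
def link_utilization(device_mbytes_per_s: float, link_gbits_per_s: float) -> float:
    """Return the fraction of a link's capacity that a device's throughput fills."""
    device_mbits_per_s = device_mbytes_per_s * 8   # megabytes -> megabits
    link_mbits_per_s = link_gbits_per_s * 1000     # gigabits -> megabits (decimal units)
    return device_mbits_per_s / link_mbits_per_s


sequential = link_utilization(100, 1)  # one disk, sequential access, 1Gb link
random_io = link_utilization(2, 1)     # the same disk under random I/O

print(f"sequential: {sequential:.1%} of the link")
print(f"random I/O: {random_io:.1%} of the link")
```

Under these assumptions, a sequentially read disk fills about 80 percent of a 1Gb link, while the same disk under random I/O fills less than 2 percent of it.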
Suddenly, flash is a problem
We've had flash hardware in storage for about 10 years. Early flash was expensive and unreliable. It was great at random access, but otherwise performed a lot like disks, and that's exactly how it was treated in storage systems. Vendors replaced some of the spinning disks with SSDs and generally used those SSDs as a cache. It was business as usual: the SSDs and disks shared a pretty slow SAS or SATA bus, whose aggregate performance could still fit on a 10Gb connection.
Then everything changed. In 2010, we began to see the emergence of PCIe-based flash hardware. Flash devices moved off of the storage bus altogether and now share the same high-speed interconnect the NIC lives on. Today, a single enterprise PCIe flash card can saturate a 10Gb interface.
This is one of the motivating observations that we made a few years ago in starting our company: Storage was about to change fundamentally from a problem of aggregating low-performance disks in a single box into a challenge of exposing the performance capabilities of emerging solid-state memories as a naturally distributed system within enterprise networks. By placing individual PCIe flash devices as addressable entities directly connected to an SDN switch, our approach has been to promote a lot of the logic for presenting and addressing storage into the network itself.
How SDN solves storage problems
The initial customer environment that we are trying to address is a virtualized, NFS-based one. VMware, for instance, is deployed across a bunch of hosts and is configured to use a single, shared NFS server. How can we take advantage of SDN in order to allow expensive PCIe flash to be shared across all of these servers and avoid imposing a bottleneck on performance?
Problem 1: The single IP endpoint. We can't change the client software stack on a dominant piece of deployed software like ESX. Scalability and performance have to be solved in a way that supports legacy protocols. IP-based storage protocols like NFS bake in an assumption that the server lives at a single IP address. In the past, people have built special-purpose hardware to terminate and proxy NFS connections in order to cache or load-balance requests, but SDN allows us to go further.
The NFS server implementation in our system includes what is effectively a distributed TCP stack. When a new NFS connection is opened to the single configured IP address, an OpenFlow exception allows us to assign that connection to a lightly loaded node in our system. As the system runs, our stack is free to migrate that connection, interacting with the switch to redirect the flow across storage resources. As a result, we are able to offer the full width of connectivity through the switch as a path between storage clients and storage resources. This approach is similar to proposals to use OpenFlow as the basis of load balancing, with the difference that it is the application itself that is driving the placement and migration of connections in response to its own understanding of how those connections can best be served.
This decoupling of client connections from a specific storage controller at the end of the wire solves an immediate scalability problem that until now has needed either interface changes on the client (NFSv4 delegation or pNFS) or complex administration (carefully splitting a namespace across several controllers). It also allows us to treat stored data as a completely fluid resource: as client connections can be moved in response to load and access patterns, so can the underlying data. As a result, OpenFlow provides the flexibility to dynamically adapt and scale the system over time.
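The connection placement described under Problem 1 can be sketched as a toy model. This is not our implementation and it uses no real OpenFlow library: ToySwitch, ToyPlacer, the flow key, and the node names are all hypothetical stand-ins. It only illustrates the idea that a flow-table miss hands a new connection to the application, which picks a lightly loaded node and can later redirect the flow.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

# Hypothetical match key: (client IP, client TCP port). Every client targets
# the same single server IP, so the client side identifies the connection.
FlowKey = Tuple[str, int]


@dataclass
class ToySwitch:
    """Stand-in for an SDN switch: a flow table mapping flows to storage nodes."""
    flow_table: Dict[FlowKey, str] = field(default_factory=dict)

    def lookup(self, flow: FlowKey) -> Optional[str]:
        return self.flow_table.get(flow)

    def install(self, flow: FlowKey, node: str) -> None:
        self.flow_table[flow] = node


@dataclass
class ToyPlacer:
    """Stand-in for the storage application that drives connection placement."""
    switch: ToySwitch
    load: Dict[str, int]  # node name -> number of connections currently served

    def handle_packet(self, flow: FlowKey) -> str:
        node = self.switch.lookup(flow)
        if node is None:
            # Table miss (the "exception"): choose the least loaded node
            # and install a rule so later packets skip this path.
            node = min(self.load, key=self.load.get)
            self.switch.install(flow, node)
            self.load[node] += 1
        return node

    def migrate(self, flow: FlowKey, new_node: str) -> None:
        """Redirect an existing flow to another node and update the load counts."""
        old_node = self.switch.lookup(flow)
        if old_node is None:
            raise KeyError(f"unknown flow {flow}")
        if old_node != new_node:
            self.load[old_node] -= 1
            self.load[new_node] += 1
            self.switch.install(flow, new_node)


placer = ToyPlacer(ToySwitch(), {"node-a": 2, "node-b": 0, "node-c": 3})
flow = ("10.0.0.5", 40001)
first_node = placer.handle_packet(flow)   # miss: placed on the least loaded node
second_node = placer.handle_packet(flow)  # hit: served by the installed rule
placer.migrate(flow, "node-a")            # the application moves the connection
```

In this toy, the client never sees a change: it keeps talking to the one configured address while the rule in the switch decides which node actually serves the flow.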
Problem 2: High-performance multitenant isolation. I worked with one of my co-founders, Keir Fraser, to develop the Xen virtual machine monitor when we were graduate students at the University of Cambridge. When we were working on Xen, we spent a lot of time focused on the fact that a hypervisor really had a single job: isolated sharing. The hypervisor needed to take a server that was over-resourced for any single application and allow it to be safely shared among many concurrent tenants.
SDN is extending this isolated sharing for virtual machines out to the network. It allows network resources to be shared in isolation, and it allows entire distributed systems of VMs to be provisioned and managed as a unit. In this regard, the VMs involved are actually just a resource above the virtual network that connects them. By the same measure, storage resources and data itself can be another such isolated resource.
By virtualizing network-attached flash resources to isolated networks -- be they OpenFlow-defined, NSX-based end-system tunnels, or even (gulp) VLANs -- we benefit from the ability to take expensive and high-performance storage resources and map them directly to the systems that consume them. In storage, sharing resources this way has always required some form of central mediator, with the side effect of an inherent performance bottleneck.
As an example, binding virtualized networks to virtualized flash means that alongside a reliable and scalable NFS instance, an alternate tenant can have direct access to virtual flash resources and integrate them directly into their application stack. Isolation in this manner lets us deploy a storage system that both supports legacy protocols and allows new, more efficient presentations of data to be developed alongside, all on the same hardware.
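One way to picture this binding is as a table from isolated networks to the virtual flash resources mapped into them. The sketch below is a deliberately simplified illustration: the network identifiers, volume names, and can_reach helper are all hypothetical, and real isolation would be enforced in the switch rather than by a lookup like this.

```python
from typing import Dict, Set

# Hypothetical bindings: isolated network ID -> virtual flash resources mapped into it.
bindings: Dict[str, Set[str]] = {
    "vlan-100": {"nfs-export-0"},   # tenant using the legacy NFS presentation
    "overlay-7": {"raw-flash-3"},   # tenant with direct access to virtual flash
}


def can_reach(network_id: str, volume: str) -> bool:
    """Return True only if the volume is bound to the given isolated network."""
    return volume in bindings.get(network_id, set())


print(can_reach("vlan-100", "nfs-export-0"))
print(can_reach("vlan-100", "raw-flash-3"))
```

Both tenants share the same hardware, but each network sees only the resources bound to it, so there is no need for a central mediator in the data path.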
How will SDN and applications evolve?
I think we all expect SDN to result in significant change from an application perspective in data center networks. However, there seems to be an idea floating around that this will surface as some sort of SDN "app store," where you download and install exciting new types of functionality for your network. As we start to see customer networks adopt and deploy SDN, our products will be able to more broadly integrate and achieve higher degrees of data-center-wide performance management.
Through the coming years, I really hope the SDN community will continue to evolve standards quickly, and systems will stay implementation-focused around rough consensus and running code. Most of all, I hope that everyone -- from the people implementing OpenFlow controllers and clients to chip-set vendors that are building spectacularly cool data path functionalities -- continues to think about applications. There is an unfortunate tendency in building standards to avoid exposing features that you may regret and have to support later.
My sense is that applications will make SDN succeed. The more functionality you can give application developers, the more they will make SDN work for everyone.
New Tech Forum provides a means to explore and discuss emerging enterprise technology in unprecedented depth and breadth. The selection is subjective, based on our pick of the technologies we believe to be important and of greatest interest to InfoWorld readers. InfoWorld does not accept marketing collateral for publication and reserves the right to edit all contributed content. Send all enquiries to email@example.com.