Earlier this month, Red Hat announced it had acquired Gluster, developer of the GlusterFS open source file system and the Gluster Storage Platform software stack. In so doing, Red Hat set itself up as a one-stop shop for those looking to deploy big data solutions such as Apache Hadoop. But it also bought a file system that has serious potential for cloud-based deployments. If you haven't heard of Gluster yet, here's a quick look at what makes it different than most other scale-out NAS solutions.
A quick tour of Gluster
In Gluster's own words, GlusterFS is "a scalable open source clustered file system that offers a global namespace, distributed front-end, and scales to hundreds of petabytes without difficulty." That's a big claim, but GlusterFS is built to solve big problems -- really big problems. In fact, Gluster's maximum capacity is somewhere in the neighborhood of 72 brontobytes (yeah, that's a real word).
Perhaps the most important detail to know right off the bat about GlusterFS is that it accomplishes absolutely massive scale-out NAS without one thing that pretty much everyone in the big data space uses: metadata. Metadata is the data that describes where a given file or block is located in a distributed file system; it's also the Achilles' heel of most scale-out NAS solutions.
In some cases, such as Hadoop's native HDFS, metadata constitutes a dangerous single point of failure. In others, it's a barrier to truly linear performance scalability, because all nodes must continuously stay in contact with the server(s) that hold the metadata for the entire cluster -- which almost always results in additional latency and storage hardware that sits idle waiting for metadata requests to be fulfilled.
For applications that require the ability to survive the loss of a node, Gluster can also be deployed in a distributed replica mode that resembles file-level RAID10. In this model, files are distributed over pairs of nodes that are synchronously mirrored. An individual node can be lost and replaced without file availability being impacted.
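To make that concrete, here's a rough sketch of what creating a two-way distributed replica volume looks like from the Gluster command line. The hostnames, brick paths, and volume name are hypothetical; the key detail is `replica 2`, which pairs up the bricks so each file lands on two mirrored nodes.

```shell
# Create a distributed volume with two-way replication across four nodes.
# Bricks are paired in the order listed: server1/server2 mirror each
# other, as do server3/server4. All names here are hypothetical.
gluster volume create mirrored-vol replica 2 transport tcp \
    server1:/export/brick1 server2:/export/brick1 \
    server3:/export/brick1 server4:/export/brick1

# Bring the volume online so clients can mount it.
gluster volume start mirrored-vol
```

Because replication happens at the file level, losing one node of a pair leaves every file on that pair still fully readable from its mirror.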
Finally, Gluster supports a striping mode that operates more like standard block-level RAID0. This mode is generally recommended only in situations where very large files (typically exceeding 50GB) are being stored and where the performance of multiple nodes is required. It is the only mode that will ever divide a file and spread it over multiple nodes -- all other modes operate at the file level. Unfortunately, mirroring is not supported in combination with striping, so high availability must be built into the hardware if this mode is to be used.
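A striped volume is created much the same way, swapping `replica` for `stripe`. Again, the node names and paths below are hypothetical placeholders -- this is a sketch of the CLI shape, not a recommended layout.

```shell
# Create a striped volume across four nodes; each large file is cut
# into chunks spread over all four bricks. Note the absence of any
# "replica" option: this volume has no built-in redundancy, so
# availability must come from the underlying hardware.
gluster volume create striped-vol stripe 4 transport tcp \
    server1:/export/brick1 server2:/export/brick1 \
    server3:/export/brick1 server4:/export/brick1

gluster volume start striped-vol
```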
Although you can't mix storage modes within the same Gluster cluster, it is possible to run multiple logical clusters on the same set of hardware. Thus, you could potentially run a distributed replica cluster in parallel with a striped cluster on the same physical hardware.
In addition to distributed replication within a Gluster cluster, it's also possible to implement N-way geo-replication between clusters. This can be used to protect against the failure of an entire site or to allow easy migration of applications from one site to another. Gluster geo-replication is very flexible, allowing replication topologies that include an arbitrary number of intermediate replicas (Site A to Site B, Site B to Sites C and D, for example).
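Setting up geo-replication is similarly terse. The sketch below assumes a hypothetical local volume and a remote target reachable over SSH; the remote host and path are placeholders.

```shell
# Start asynchronous geo-replication of a local volume to a remote
# site over SSH. Volume name, remote host, and remote path are all
# hypothetical.
gluster volume geo-replication mirrored-vol \
    ssh://root@site-b.example.com:/data/replica start

# Check how far along the replication session is.
gluster volume geo-replication mirrored-vol \
    ssh://root@site-b.example.com:/data/replica status
```

Because geo-replication is asynchronous, it tolerates WAN latency far better than a stretched cluster would -- which is exactly why it's the preferred tool for site-to-site protection.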
It is possible to stretch a Gluster cluster across physical sites, but due to the synchronous nature of intracluster distributed replication, doing so would demand large amounts of WAN bandwidth and very low latency to achieve reasonable performance. In practice, then, a single Gluster cluster would generally be limited to a single site or metro area.
When Gluster is accessed through standard protocols such as NFS or CIFS rather than the native client, clients attach to only a single node at a time, so read and write requests must be shuffled between that node and the nodes actually storing the data -- a situation that can result in substantially decreased performance compared to using the native client. As a result, deployments using these protocols usually require a separate back-end network dedicated to handling the internode traffic necessary to respond to client requests.
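The difference between the two access paths shows up right at mount time. A sketch, with hypothetical server and volume names:

```shell
# Native GlusterFS client: the client learns the volume layout and
# talks to every node directly, so no back-end relaying is needed.
mount -t glusterfs server1:/mirrored-vol /mnt/gluster

# NFS instead: the client attaches to server1 only, and server1 must
# relay I/O to and from the other nodes behind the scenes.
mount -t nfs -o vers=3 server1:/mirrored-vol /mnt/gluster
```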
Gluster is managed through a combination of a Web GUI that ships with the bare-metal Gluster Storage Platform and a set of command-line tools that are available with the stand-alone GlusterFS distribution. As such, it's best managed by those already familiar with Linux system administration. For someone with some Linux chops, it's amazingly simple to use, requiring only a few quick commands to make fairly large changes such as adding a new node or creating a new volume. In fact, the well-known Internet radio company Pandora deployed a 250TB Gluster-based storage back end for its service and has only a single admin dedicated to managing it. If you have some Linux skills and an hour or two, you can implement Gluster. How many other clustered file systems can you say that about?
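The "few quick commands" in question really are few. Here's a rough sketch of growing a cluster by one replica pair -- hostnames, volume name, and brick paths are hypothetical:

```shell
# Join two new machines to the trusted pool.
gluster peer probe server5
gluster peer probe server6

# Add their bricks to an existing two-way replicated volume.
# For a replica-2 volume, bricks must be added in mirrored pairs.
gluster volume add-brick mirrored-vol \
    server5:/export/brick1 server6:/export/brick1

# Redistribute existing data onto the new bricks.
gluster volume rebalance mirrored-vol start
```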
Applicability in the cloud
Aside from its obvious applicability in building a storage back end to support a cloud environment, Gluster has some neat applications within existing public cloud infrastructures. One of the challenges in building a highly available storage system using a cloud infrastructure like Amazon EC2 is that you really need to bring your own disaster recovery plan. While Amazon in particular offers great reliability for its S3 object-based storage platform, it cannot offer the same service levels for its EBS (Elastic Block Store) product that backs most EC2 compute instances. Additionally, EBS volumes are limited to 1TB in size, which can make it difficult to work with large datasets.