Q&A: Gluster provides a platform for open source storage virtualization

Gluster upgrades open source clustered storage with virtual machine support and data storage management

Storage startup Gluster has become the latest vendor to join a recent wave of emerging companies bringing enterprise storage management features to industry-standard hardware and commodity-based clustered NAS systems. The company says its platform answers the question of scale by supporting hundreds of petabytes in a single volume.

Gluster came out of stealth mode back in 2007 when it announced GlusterFS, a general-purpose distributed file system for clustered NAS based on open source code. The technology was improved upon last May when the company announced the release of version 2 of GlusterFS. It added intelligent data placement to help optimize resources and reduce bottlenecks, introduced efficiencies into the file system for handling small files, and added optimizations for cloud storage.


The company's latest announcement centers on the Gluster Storage Platform, which builds on the existing management features with a new Web-based management interface, a new software delivery model, and support for high availability through replicated data and self-healing, with error detection and correction within files.

The company has also introduced support for virtual machines, which is what really grabbed my attention. According to the company, replicated virtual machines can keep operating in the event of a hardware failure, with recovery performed in the background without requiring a restart or blocking I/O to the live VM. To find out more, I spoke with Jack O'Brien, senior director of marketing at Gluster.

InfoWorld: Tell readers a little bit about Gluster and the problems that the company is focused on.

Gluster: Gluster provides an open source clustered storage platform that installs on industry standard hardware. We are focused on simplifying the task of storing massive amounts of unstructured file data, and we do this with a solution that scales horizontally to deliver required capacity and performance with a unified global namespace.

New and emerging workloads are straining traditional storage solutions. To address this, Gluster has developed a flexible architecture that can be configured to support a very broad range of applications, whether they require performance for small files, large files, random access, or sequential access. By virtualizing storage resources into a single pool, Gluster provides an ideal storage solution to complement virtual machine environments. Our solution is based on open source software paired with commodity hardware, providing compelling cost savings compared to existing proprietary offerings.

InfoWorld: You mentioned massive amounts of unstructured file data. We're hearing more and more about the explosion of unstructured file data lately. How are datacenter managers trying to deal with this issue?

Gluster: This is a big challenge driven by the growth of Web computing, applications that rely on "big data," the exponential growth of multimedia, and the adoption of virtualization technologies. Many times the approach is to throw hardware at the problem since disks are perceived as inexpensive. While the cost per gigabyte of disk drives continues to fall, the systems and software to manage them are still expensive. This is compounded by the fact that scaling existing systems is expensive or sometimes not even possible in the face of existing limitations.

InfoWorld: So what is the latest around the Gluster Storage Platform?

Gluster: The main focus is on ease of use and making it simple for customers to deploy petabyte scale clustered storage. Gluster Storage Platform integrates the file system, an operating system layer, and a Web-based management interface and installer. Installation is a simple process that enables customers to deploy a few hundred terabytes of clustered storage in two steps and just a few mouse clicks. File system features have also been added and enhanced, the most significant of which are optimization for VM environments and providing always-on availability for VMs.

InfoWorld: One of the interesting things I find about this is its open source model. How does it benefit customers?

Gluster: The open source business model and products have matured to the point where enterprises are running mission-critical areas of their business on open source.

The primary benefits are from lower costs and the flexibility provided from reduced vendor lock-in. The nature of the open source subscription model inherently demands that the vendor provide ongoing value over the course of the relationship with the customer vs. the more transaction-oriented process of software licensing. It is common for us to work with companies with a mandate to adopt open source products. The reluctance of the past has nearly faded completely.

InfoWorld: And since I'm all about virtualization, I have to ask you, what's unique about how Gluster addresses the problems of providing storage for virtual machine environments?

Gluster: Gluster excels at storing and managing file data, and virtual machine images are just that -- files. Managing VM environments presents several specific challenges that the Gluster Storage Platform addresses.

First is the complexity of managing storage for hundreds or even thousands of VM images. In a typical environment, an administrator would provision storage and I/O every time a VM is created. This requires management of many individual volumes and LUN connections. Gluster provides a unified global namespace that virtualizes the underlying storage resources. That single volume can support thousands of VMs while automatically distributing data and load balancing the I/O.

A second major challenge is ensuring good performance in an environment where many VMs are simultaneously accessing data. This often results in I/O bottlenecks and poor performance. With the Gluster Storage Platform, data access is scaled horizontally across multiple storage nodes with automatic I/O scheduling and load balancing to eliminate choke points.

A third issue is the current cost of storage systems and the common need for multiple storage silos, required to deliver the necessary application performance and reliability. Gluster provides a solution that combines open source software with commodity hardware for compelling cost savings. Additionally, the Gluster file system includes replication to survive hardware failures and self-healing to ensure VMs are always on.

Again, I'd like to thank Jack O'Brien from Gluster for taking time out to speak with me about this latest announcement.

Copyright © 2009 IDG Communications, Inc.
