System compensates for power-supply shortcoming with great manageability and resilience
High-speed access to large files, such as those related to scientific research, doesn’t play nice with traditional NAS systems. They're usually just too limited in their throughput and file-system capacity.
High-end file-serving solutions from companies such as Ibrix and Isilon can do that heavy lifting much better. These products are like heavy-duty pickup trucks of the storage world: They facilitate quick and steady delivery of large file payloads.
Unfortunately, products in that space come at a price that can challenge many budgets. In January, however, Isilon announced the IQ 200, a new entry-level version of the company's clustered storage systems. The IQ 200 promises good throughput and scalability up to 1,000TB while running the same operating system and applications as its larger siblings.
The IQ 200 is based on 1U nodes, each equipped with four 500GB SATA drives on the front. A bezel completely covers the four drives, and it hosts ready and alert LEDs, which indicate normal use or an error condition.
An IQ 200 cluster can host from three to 24 nodes, each with a nominal capacity of 2TB. Each of the four nodes I received for evaluation ran the Isilon OneFS 4.5 OS, which according to the vendor can store as much as 1,000TB of files in a single system.
The back of the unit has the typical connectors you would expect on any server. Unlike the larger Isilon products, the IQ 200 doesn’t offer Infiniband connectivity in the back end. Instead, each node has a dedicated "internal" GigE (Gigabit Ethernet) port for connecting to other nodes and a separate "external" port.
The external port should ideally be on a subnet where only application servers are connected, but OneFS has several configuration options, including merging back-end and front-end access on a single port. For my evaluation I used separated and dedicated subnets for servers and cluster nodes.
After connecting via serial port to a node, I had access to a very easy-to-use, wizard-driven, command-line management tool that guided me through configuring the two subnets, setting which DNS server and Active Directory domain I wanted to use, and assigning a name to my cluster.
After a quick reboot, I pointed my browser to the IP address of that node, which opened the GUI management console. To add more nodes, I simply clicked on Cluster Management / Add Node. A window opened listing my three remaining nodes and their MAC (media access control) addresses.
I selected two more of the nodes and in seconds, my three-node cluster was ready. I left the fourth unit in standby. Setting up an IQ 200 cluster is that easy, helped along by the easy-to-reach, context-sensitive online help available in both the GUI and the command-line tool.
Natively -- that is, without installing agents on application servers -- OneFS supports a variety of client access protocols, including CIFS, NFS, FTP, and HTTP. Appropriately, however, they are not active unless you start them. After starting Windows file sharing in domain-access mode, I made a separate directory for each Windows Server 2003 machine under the generic default root, IFS.
Next I connected each server to its own directory by mapping a new network drive in Windows Explorer; nothing new to learn there. Then I launched a script on each machine to populate the new drive with files.
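The review does not reproduce that population script. As a rough idea of what such a script might look like, here is a minimal Python sketch; the function name `populate`, the file naming scheme, and the default counts and sizes are all assumptions for illustration, not what was actually run in the test.

```python
import os
from pathlib import Path


def populate(target_dir, file_count=10, file_size=1024 * 1024, chunk_size=64 * 1024):
    """Fill target_dir with file_count files of file_size random bytes each.

    Hypothetical helper: names and defaults are illustrative assumptions.
    Returns the list of paths that were written.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(file_count):
        path = target / f"testfile_{i:05d}.bin"
        remaining = file_size
        with open(path, "wb") as fh:
            # Write in chunks so large files never need to fit in memory.
            while remaining > 0:
                n = min(chunk_size, remaining)
                fh.write(os.urandom(n))
                remaining -= n
        written.append(path)
    return written
```

On a server with the share mapped as, say, drive Z, a call such as `populate("Z:/", file_count=1000, file_size=100 * 1024 * 1024)` would write roughly 100GB of test data. Random bytes are used so that the payload cannot be trivially compressed or deduplicated along the way.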
OneFS has many powerful features that you won’t find in conventional NAS systems. For example, the file system automatically balances the space used across its nodes, so that each one has a similar amount of free space. To improve performance and reliability, files are automatically striped across nodes and volumes in the cluster. In addition, you can increase resilience by raising the parity level, up to a maximum of four, which lets your files survive multiple hardware failures.
By default these additional parity levels apply to the whole cluster, and they cost performance and capacity, but OneFS has an advanced setting that lets you choose different protection levels for the whole cluster, for each folder, and for each file.
OneFS offers an unlimited number of snapshots for each cluster, driven by a very flexible schedule. For example, I was able to set different timing for each server’s folder and to position snapshots only one minute apart, which minimizes the window for data loss.
OneFS also offers a synchronous replication option that can add disaster recovery capability to its excellent local recovery features, although with a single cluster I wasn’t able to try it.
Conveniently, all the knobs to control these applications are inside the two management interfaces, both the CLI and the GUI. The UIs also deliver tools to monitor the conditions of the hardware, access the alerts log, and read the status of components inside the enclosure, down to the SMART (Self-Monitoring, Analysis, and Reporting Technology) info for each drive.
I also loved the comprehensive set of performance monitoring tools, which allow you to analyze what’s happening on the whole cluster, on a single node, or on one drive.
Although I had planned to test how well the IQ 200's data protection features worked, I did not have to simulate a failure -- one happened naturally. A node refused to power back on after I had loaded all my test files and shut down the system for the day. The culprit was the power supply, which is a nonredundant component on the IQ 200.
I still had a spare node in standby, so I added it to the cluster and crossed my fingers while the system began healing itself. It took close to nine hours to redistribute almost 2TB of data, but all my files were there intact. In fact, I kept a movie clip running all that time and never noticed a hiccup.
It’s important to understand that by default, OneFS assigns a low priority to those housekeeping activities, although you can raise the priority to speed up the healing process. At the default setting, I had normal access to my files throughout, with only occasional hesitations when opening a file or changing a drive mapping on a server.
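To put that rebuild time in perspective, the average redistribution rate can be estimated from the two figures above. The sketch below assumes decimal units (1TB = 10^12 bytes) and rounds "almost 2TB" and "close to nine hours" up to exactly 2TB and nine hours, so the result is an approximation rather than a measured number.

```python
def average_rate_mb_per_s(data_tb, hours):
    """Average transfer rate in MB/s, using decimal units (1 TB = 1e12 bytes)."""
    bytes_total = data_tb * 1e12
    seconds = hours * 3600
    return bytes_total / seconds / 1e6


# Rounded figures from the rebuild described above (assumed exact here).
rate = average_rate_mb_per_s(2.0, 9.0)
print(f"Average rebuild rate: {rate:.1f} MB/s")
```

That works out to roughly 62MB per second, or about half of what a single GigE link can carry, which is consistent with the rebuild running as a low-priority background task.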
How fast is an IQ 200 cluster when not rebuilding? That varies, depending on how many nodes you have. More nodes add more spindles and more GigE ports, hence more bandwidth. The level of protection you choose also has an impact on performance, because it adds I/O operations to build the additional data redundancy.
In my tests, using Iometer and setting a 32K read-only script on five servers, I saw an overall cluster throughput of about 2.7Gbps, which is very close to the theoretical limit of the three GigE ports facing the servers. Changing the Iometer script to perform one sequential write every four IOs produced noticeably slower performance, just below 0.7Gbps.
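The "theoretical limit" here is simply the sum of the front-end links: three GigE ports at a nominal 1Gbps each. A quick sketch of the arithmetic follows; the function name is illustrative, and protocol overhead is ignored, so real-world ceilings sit somewhat below the nominal figure.

```python
GIGE_GBPS = 1.0  # nominal line rate of one Gigabit Ethernet port


def link_utilization(measured_gbps, ports, port_gbps=GIGE_GBPS):
    """Fraction of the aggregate nominal bandwidth that was actually used."""
    return measured_gbps / (ports * port_gbps)


read_util = link_utilization(2.7, 3)   # read-only test
mixed_util = link_utilization(0.7, 3)  # one write in every four I/Os

print(f"Read-only: {read_util:.0%} of nominal bandwidth")
print(f"Mixed:     {mixed_util:.0%} of nominal bandwidth")
```

The read-only run used about 90 percent of the nominal 3Gbps, while the mixed workload used under a quarter of it, which shows how much the extra protection I/O on writes costs a three-node cluster.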
It’s interesting to note that Isilon also offers SmartConnect, an optional load balancing application that will automatically spread servers’ traffic across the nodes. But with only five servers in my test bed, I did not use SmartConnect.
Despite the misadventure with the dead node, and the absence of a dual power supply, I still like the IQ 200. OneFS has features that put conventional NAS systems to shame, and the cluster capability adds a level of scalability in performance and capacity that few other solutions in that price range can claim. Just buy an extra node for as long as Isilon continues to ship units without a redundant power supply. It's a good idea regardless, and the nodes don't cost that much.
Overall Score (100%)
|Isilon IQ 200||8.0||9.0||9.0||9.0||7.0||9.0|