Once the domain of monolithic, multimillion-dollar supercomputers from Cray and IBM, HPC (high-performance computing) is now firmly within reach of today’s enterprise, thanks to the affordable computing power of clustered standards-based Linux and Microsoft servers running commodity Intel Xeon and AMD Opteron processors. Many early movers are in fact already capitalizing on in-house HPC, assembling and managing small-scale clusters on their own.
Yet building the hardware and software for an HPC environment remains a complex, highly specialized undertaking. As such, few organizations outside university engineering and research departments and specialized vertical markets such as oil and gas exploration, bioscience, and financial research have heeded the call. No longer borrowing time on others’ massive HPC architectures, these pioneers, however, are fast proving the potential of small-scale, do-it-yourself clustering in enterprise settings. And as the case is made for few-node clusters, expect organizations beyond these niches to begin tapping the competitive edge of in-house HPC.
The four case studies assembled here illustrate the pain and complexity of building a successful HPC environment, including the sensitive hardware and software dependencies that affect performance and reliability, as well as the painstaking work that goes into parallelizing serial apps to work successfully in a clustered environment.
Worth noting is that, although specialized high-performance, low-latency interconnects such as Myrinet, InfiniBand, and Quadrics are often touted as de facto solutions for interprocess HPC communications, three of the four organizations profiled found commodity Gigabit Ethernet adequate for their purposes -- and much less expensive. One in fact took every measure possible to avoid message passing and cutting-edge interconnects in order to enhance reliability.
New to the HPC market, Microsoft Windows Compute Cluster Server 2003 proved appealing to two organizations looking to integrate their HPC cluster into an existing Microsoft environment. So far, results have been positive.
Finally, one organization found that delegating much of the hardware and software configuration to a specialized HPC hardware vendor/integrator made the whole process considerably easier.
BAE Systems tests and tests some more
When it comes to delivering advanced defense and aerospace systems, the argument in favor of developing an in-house HPC cluster is overwhelming. Perhaps it’s not surprising then to find that the technology and engineering services group at BAE Systems already has a fair amount of experience constructing HPC clusters from HP Alpha and Opteron-based Linux servers. Integrating previous HPC systems into the U.K.-based global defense company’s enterprise, however, has proved costly.
“We’ve found the TCO implications of maintaining two or more disparate systems -- such as Windows, Linux, and Unix -- to be too high, particularly in terms of support people,” says Jamil Appa, group leader of technology and engineering services at BAE. “We’re looking to provide a technical computing environment that integrates easily with the rest of our IT environment, including systems like Active Directory.”
The group is currently assessing two Microsoft Windows Compute Cluster Server 2003 clusters -- both of which have been in testing for several months now. Tools built on .Net 3.0’s Windows Workflow Foundation and Windows Communication Foundation have enabled BAE engineers to create an efficient workflow environment in which they can collaborate effectively during the design process, accessing relevant parts of the systems from their own customized views with tools relevant to their tasks. One test bed is a six-node cluster of HP ProLiant dual-core, dual-processor Opteron-based servers; the other is a 12-node mix of Opteron- and Woodcrest-based servers from Supermicro.
If there’s anything that BAE has learned from its testing, it’s that little changes can have big performance implications.
“We’re running our clusters with a whole variety of interconnects, including Gigabit Ethernet, Quadrics, and a Voltaire InfiniBand switch,” Appa says. “We’ve also been running both Microsoft and HP versions of MPI [Message Passing Interface]. We’ve found that all these elements have different sweet spots and behave differently depending on the application.” In the long run, this testing will enable the technology and engineering services group to provide other BAE business units looking to implement HPC with their own personal HPC “shopping lists.”
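The MPI stacks Appa benchmarks aren’t shown in the article, but the send-and-gather pattern such codes rely on can be sketched in miniature. The sketch below is illustrative only: it uses Python’s standard-library multiprocessing module as a stand-in for a real MPI implementation, with worker processes playing the role of MPI ranks.

```python
# Illustrative only: MPI codes (Microsoft MPI, HP MPI) follow a
# scatter/compute/gather pattern like this one, sketched here with
# Python's standard-library multiprocessing instead of a real MPI stack.
from multiprocessing import Process, Pipe

def worker(conn, chunk):
    # Each "rank" computes a partial result and sends it back,
    # much as an MPI process would with MPI_Send.
    partial = sum(x * x for x in chunk)
    conn.send(partial)
    conn.close()

def scatter_gather(data, n_workers=4):
    # Split the input, fan it out to worker processes, then gather
    # and combine the partial results on the "head" process.
    chunks = [data[i::n_workers] for i in range(n_workers)]
    pipes, procs = [], []
    for chunk in chunks:
        parent, child = Pipe()
        p = Process(target=worker, args=(child, chunk))
        p.start()
        pipes.append(parent)
        procs.append(p)
    total = sum(conn.recv() for conn in pipes)
    for p in procs:
        p.join()
    return total

if __name__ == "__main__":
    # Sum of squares of 0..999, computed across four processes.
    print(scatter_gather(list(range(1000))))
```

In a real MPI program the fan-out and fan-in would cross the cluster interconnect -- which is exactly why the choice of Gigabit Ethernet, Quadrics, or InfiniBand, and of MPI implementation, shows the application-dependent sweet spots Appa describes.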
As for interconnects, “depending on the application, the size of your cluster (preferably small), and the types of switches you use, Gigabit Ethernet really isn’t that bad,” Appa says. His group has been using Gigabit switches from HP, which “for our purposes, are very good.”
Appa has also tested several compilers, and he cautions not to skimp on these tools: “A $100 compiler might make your code run 20 percent slower than a top-end compiler, so you end up having to pay for a machine that is 20 percent larger. Which is more expensive?”
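Appa’s compiler trade-off can be made concrete with back-of-envelope arithmetic. The figures below are hypothetical, chosen only to illustrate his point: if slower code forces you to buy proportionally more hardware to hit the same throughput, the extra nodes can easily dwarf the price difference between compilers.

```python
# Back-of-envelope version of Appa's compiler trade-off.
# All dollar figures are hypothetical, for illustration only.
def extra_hardware_cost(cluster_cost, slowdown):
    """Cost of the extra nodes needed to offset a runtime slowdown.

    If code runs `slowdown` slower (0.20 for 20 percent), you need
    roughly that much more hardware for the same throughput,
    assuming near-linear scaling.
    """
    return cluster_cost * slowdown

cheap_compiler = 100       # hypothetical $100 compiler
good_compiler = 3_000      # hypothetical top-end compiler
cluster_cost = 50_000      # hypothetical small cluster

penalty = extra_hardware_cost(cluster_cost, 0.20)  # $10,000 in extra nodes
print(penalty > good_compiler - cheap_compiler)    # prints True
```

Under these assumptions the top-end compiler pays for itself more than three times over, which is Appa’s answer to “Which is more expensive?”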
Each of Appa’s configurations sits on three networks: one for message passing, one for accessing the file system, and one for management and submitting jobs. To access NAS, Appa uses iSCSI over Gigabit Ethernet, rather than FC (Fibre Channel), and has a high-performance parallel file system consisting of open source Lustre object storage technology. Why? “As clusters get larger and you have more cores running processes that are all reading one file on your file system, your file system really needs to scale or you’ll be in trouble,” Appa explains.
Meanwhile, Windows Compute Cluster has simplified both cluster management and user training -- which makes for additional benefits when it comes to freeing up staff for the more vital task of optimizing BAE apps. Although BAE’s software is already set up for HPC, Appa believes the whole process of parallelizing existing apps is reaching a turning point. “Our algorithms date back to the ’80s and do not make best use of multicore technologies,” he says. “We’re all going to have to reconsider how we write our algorithms or we’ll all suffer.”
Although each endeavor to bring HPC in-house will differ based on an enterprise’s clustering needs, BAE’s Appa has some sage advice for anyone considering the journey.
“You can’t assume that somebody will come along with a magic wand and give you the perfect HPC solution,” Appa says. “You really need to try everything out, especially if you have in-house code. There’s so much variation and change in HPC technology, and so much is code-dependent. You really have to understand the interaction between the hardware and software.”
Luckily, those attempting to bring HPC in-house will not be alone. “The HPC community itself is quite small and very open and willing to share valuable information,” Appa says.
Appa points out that Fluent has an excellent benchmarking site that demonstrates performance variations among various hardware and software combinations. In his case, the Microsoft Institute for High Performance Computing at the University of Southampton provided sound advice on what hardware worked and what didn’t, particularly during the beta phase.
Virginia Tech starts from scratch
At Virginia Tech’s Advanced Research Institute (ARI), constructing an HPC cluster for cancer research has been an educational experience for the electrical and computer engineering grad students involved.
Rather than make every aspect a learning experience, when it came to choosing an HPC platform, the students and professors decided to stick with what they already knew: Microsoft Windows.
“Our students had already been running MATLAB and all their other programs on Windows,” says Dr. Saifur Rahman, director of ARI. “We didn’t want to have to retrain them on Linux.” As was the case at BAE Systems, there were also obvious advantages to a cluster that could integrate easily with the rest of ARI’s Windows infrastructure, including Active Directory.
Microsoft had already approached Virginia Tech to be an early adopter of Windows Compute Cluster Server 2003, so Dr. Rahman and his team said yes and started looking for the right hardware. They vetted several vendors, but when they found out Microsoft was performing its own testing on Hewlett-Packard servers, they decided to go with HP. “We knew we’d need help from Microsoft to fix various bugs,” says Dr. Rahman, “and since all their experience was on HP servers, we felt we’d have the most success with HP.”
So with help from Microsoft and HP, ARI installed 16 HP ProLiant DL145 servers with dual-core 2.0GHz AMD Opteron 270 processors and 1GB of RAM each. On the same rack, ARI installed 1TB of HP FC storage. The rack also includes one head node, as well as an HP ProLiant DL385 G1 server with two dual-core 2.4GHz AMD64 processors and 4GB of RAM.
As did BAE Systems, ARI decided to stick with Gigabit Ethernet for its cluster interconnect, mainly because it was what the team knew. “There are other interconnects that are faster, but we’ve found that Gigabit Ethernet is pretty robust and works fine for our purposes,” Dr. Rahman says. And after some servers overheated, ARI placed the entire cluster in a 55-degree Fahrenheit chilled server room.
ARI found parallelizing MATLAB apps to be a significant challenge requiring a number of iterations. “The students would work on parallelizing the algorithms, then run case studies to verify the results they were getting with the clustered applications were similar to results they got when they ran one machine,” Dr. Rahman says.
At first, the results weren’t coinciding, and the students had to learn more about how to parallelize effectively and clean up what they had already coded. “We missed some important relationships at first,” Dr. Rahman says. With some help from The MathWorks, MATLAB’s developer, it took two graduate students about a month to get the app parallelization right.
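ARI’s verification loop -- run the same analysis serially and in parallel, then check that the answers agree -- can be sketched as follows. The analysis function is a hypothetical stand-in for ARI’s actual MATLAB code, and the sketch uses Python’s standard library rather than MATLAB’s parallel tooling.

```python
# A sketch of ARI's verification step: run the same analysis serially
# and in parallel, then confirm the results agree. `analyze` is a
# hypothetical stand-in for the real per-sample computation.
from concurrent.futures import ProcessPoolExecutor
import math

def analyze(sample):
    # Hypothetical per-sample analysis standing in for the MATLAB code.
    return math.fsum(math.sqrt(x) for x in sample)

def serial_run(samples):
    return [analyze(s) for s in samples]

def parallel_run(samples, workers=4):
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(analyze, samples))

def results_agree(a, b, tol=1e-9):
    # Floating-point summation order can differ across workers, so
    # compare with a tolerance rather than demanding exact equality.
    return all(math.isclose(x, y, rel_tol=tol) for x, y in zip(a, b))

if __name__ == "__main__":
    samples = [list(range(i, i + 100)) for i in range(0, 1000, 100)]
    print(results_agree(serial_run(samples), parallel_run(samples)))
```

The tolerance-based comparison reflects a practical lesson of parallelization: answers from a cluster may differ from single-machine answers in the last few bits without being wrong, so “similar results” has to be defined numerically before the comparison is meaningful.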
Dr. Rahman feels that the team’s diverse expertise was a large factor in the project’s success. One of the grad students had deep knowledge of molecular-level data quality, biomarkers, and the relevance of different data types; another offered a lot of hardware expertise; and the IT person had much experience interacting with vendors effectively. The MathWorks provided help in determining which toolboxes were relevant to the task.
“When we went to MATLAB, they were just getting started with HPC,” Dr. Rahman says. “I hope they will start to pay more attention, as it would be nice if they were all ready so we didn’t have to spend months on this.”
There were also hardware communications glitches.
“At first we had some problems controlling the servers as they talked to each other and the head node,” Dr. Rahman says. “Sometimes they wouldn’t respond. In other cases we wouldn’t see any data coming through.” Solving the problem took a lot of reconfiguring and reconnecting. “Perhaps we were giving the wrong commands at first. We’re not sure,” he adds. There were also problems with incorrect server and software license manager configurations.
Dr. Rahman says that managing the cluster has been relatively trouble-free with Windows Compute Cluster Server 2003. If he could do it all over again, he adds, he’d send his students to Microsoft for a longer time to learn more of what Microsoft itself has discovered about building clusters with HP servers. The use of HPC has enabled ARI researchers to dive much more deeply into molecular data, not only analyzing differences in relationships among disparate classes of subjects, but also revealing more subtle but important variations within each class.
Uptime counts for Merlin
Whereas most HPC implementations are the province of scientists and engineers hidden away in R&D departments, Merlin Securities’ HPC solution interfaces directly with its hedge fund customers. That’s why 24/7 uptime and security were key HPC design requirements for Merlin, right alongside performance.
“We had to be extremely risk-averse in designing our cluster and choosing its components,” says Mike Mettke, senior database administrator at Merlin.