Once the domain of monolithic, multimillion-dollar supercomputers from Cray and IBM, HPC (high-performance computing) is now firmly within reach of today’s enterprise, thanks to the affordable computing power of clustered standards-based Linux and Microsoft servers running commodity Intel Xeon and AMD Opteron processors. Many early movers are in fact already capitalizing on in-house HPC, assembling and managing small-scale clusters on their own.
Yet building the hardware and software for an HPC environment remains a complex, highly specialized undertaking. As such, few organizations outside university engineering and research departments and specialized vertical markets such as oil and gas exploration, bioscience, and financial research have heeded the call. No longer borrowing time on others’ massive HPC architectures, these pioneers are nonetheless fast proving the potential of small-scale, do-it-yourself clustering in enterprise settings. And as the case is made for few-node clusters, expect organizations beyond these niches to begin tapping the competitive edge of in-house HPC.
The four case studies assembled here illustrate the pain and complexity of building a successful HPC environment, including the sensitive hardware and software dependencies that affect performance and reliability, as well as the painstaking work that goes into parallelizing serial apps to work successfully in a clustered environment.
Worth noting is that, although specialized high-performance, low-latency interconnects such as Myrinet, InfiniBand, and Quadrics are often touted as de facto solutions for interprocess HPC communications, three of the four organizations profiled found commodity Gigabit Ethernet adequate for their purposes -- and much less expensive. One in fact took every measure possible to avoid message passing and cutting-edge interconnects in order to enhance reliability.
New to the HPC market, Microsoft Windows Compute Cluster Server 2003 proved appealing to two organizations looking to integrate their HPC cluster into an existing Microsoft environment. So far, results have been positive.
Finally, one organization found that delegating much of the hardware and software configuration to a specialized HPC hardware vendor/integrator made the whole process considerably easier.
BAE Systems tests and tests some more
When it comes to delivering advanced defense and aerospace systems, the argument in favor of developing an in-house HPC cluster is overwhelming. Perhaps it’s not surprising then to find that the technology and engineering services group at BAE Systems already has a fair amount of experience constructing HPC clusters from HP Alpha and Opteron-based Linux servers. Integrating previous HPC systems into the U.K.-based global defense company’s enterprise, however, has proved costly.
“We’ve found the TCO implications of maintaining two or more disparate systems -- such as Windows, Linux, and Unix -- to be too high, particularly in terms of support people,” says Jamil Appa, group leader of technology and engineering services at BAE. “We’re looking to provide a technical computing environment that integrates easily with the rest of our IT environment, including systems like Active Directory.”