A first look at Windows Compute Cluster Server

Microsoft is pushing high-performance computing aggressively, but just what does it have to offer?

It used to be that building a usable compute cluster took plenty of money, skills, and space in the datacenter. Although creating the actual applications that run on the cluster can still be difficult, nowadays building a Linux-based cluster is generally quite simple. Commercial and open source clustering packages abound, offering rich features, open protocols, and streamlined installs. No surprise, then, that Microsoft wants a piece of this potentially lucrative market.

I recently got a chance to test drive Windows CCS (Compute Cluster Server), currently in beta and scheduled for general release sometime in 2006. CCS is made up of several tools layered onto a standard Windows Server 2003 build. In fact, deploying a cluster node is identical to building a standard server and then applying the clustering package, which will be available for purchase separately.

As is usual for Microsoft, the new clustering tools leverage a number of existing Microsoft technologies. ICS (Internet Connection Sharing) on the cluster head provides NAT for the cluster nodes, which exist on a private network. RIS (Remote Installation Service) provides unattended cluster node installations from the head node. The cluster management console is a plug-in to Microsoft Management Console. All authentication is provided by Active Directory, which allows for quick integration into an AD network.

The backbone of CCS, however, is Argonne National Laboratory's MPICH2, an implementation of the MPI (Message Passing Interface) standard. Microsoft has done a significant amount of work to bring it to Windows and, more interestingly, has contributed that code back to the project. Kudos.
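MPI programs are typically written in C, with each process learning its rank within the communicator and exchanging messages with the others. As a rough, single-machine illustration of that rank-and-message model -- not CCS or MPICH2 code, just a sketch using Python's standard multiprocessing module, with pipes standing in for MPI_Send and MPI_Recv -- consider:

```python
from multiprocessing import Process, Pipe

def worker(rank, size, conn):
    """Mimic one MPI rank: compute a partial result, send it to rank 0."""
    result = rank * rank  # each rank computes its own piece of the work
    conn.send((rank, result))  # analogous to MPI_Send targeting rank 0
    conn.close()

def run_job(size=4):
    """Rank 0 launches the other ranks and gathers their results."""
    pipes, procs = [], []
    for rank in range(1, size):
        parent_end, child_end = Pipe()
        p = Process(target=worker, args=(rank, size, child_end))
        p.start()
        pipes.append(parent_end)
        procs.append(p)
    results = {0: 0}  # rank 0's own (trivial) contribution
    for conn in pipes:
        rank, value = conn.recv()  # analogous to MPI_Recv at the root
        results[rank] = value
    for p in procs:
        p.join()
    return results

if __name__ == "__main__":
    print(run_job())  # {0: 0, 1: 1, 2: 4, 3: 9}
```

In real MPI the runtime launches one copy of the same binary per node and supplies the rank; here the parent process plays that role.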

Setup and configuration of a CCS cluster uses a task-list approach, walking users through the necessary steps. As of now, this process is a bit too automated for my taste, leaving those with clustering experience wondering exactly what’s going on in the background -- and nowhere to look when troubleshooting. The current build of CCS is also quite raw in some places, such as the job scheduler, but then, it is still in beta.

I haven’t had too much time to work with CCS in the lab, but so far I’ve managed to build a basic distributed DSA (Digital Signature Algorithm) key generation app running across the cluster, depositing generated keys to a common network location. Much further testing and more complex code will be necessary to truly put CCS through its paces, however.
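The pattern behind that app is embarrassingly parallel: each node generates keys independently and deposits them to a shared location, so no inter-node communication is needed beyond job dispatch. A rough single-machine sketch of the deposit pattern follows; the actual DSA key generation and the network share are stand-ins here (a random hex token and a local temp directory), since the point is the fan-out-and-deposit structure, not the cryptography:

```python
import os
import secrets
import tempfile
from multiprocessing import Pool

def generate_and_deposit(args):
    """Stand-in for one node's work: 'generate' a key and deposit it.

    A real CCS job would run actual DSA key generation and write to a
    network share; a random token and a temp directory stand in here.
    """
    node_id, out_dir = args
    key = secrets.token_hex(32)  # placeholder for a generated DSA key pair
    path = os.path.join(out_dir, f"node{node_id}.key")
    with open(path, "w") as f:
        f.write(key)
    return path

def run_cluster(num_nodes, out_dir):
    """Fan the work out across processes, one per simulated node."""
    with Pool(num_nodes) as pool:
        return pool.map(generate_and_deposit,
                        [(i, out_dir) for i in range(num_nodes)])

if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as shared_dir:
        paths = run_cluster(4, shared_dir)
        print(f"deposited {len(paths)} keys")
```

Because the nodes never talk to each other, this kind of job is a gentle first test for any cluster scheduler; the harder MPI-style workloads are what the further testing mentioned above would exercise.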

To test the software, Microsoft provided a RocketCalc Saturn four-node personal cluster equipped with eight AMD Athlon64 2GHz CPUs and 8GB of RAM. The Saturn is a cool piece of hardware regardless of the OS, and it highlights the market Microsoft seems to be targeting: the minicluster. Instead of running one large cluster in the datacenter, it’s feasible to deploy something like the Saturn to an individual engineer’s cube. Could this style of clustering become the eventual core market for CCS? Time will tell.