This is part 2 of the response from Patrick O'Rourke, Lead Product Manager of the Windows Server Division at Microsoft, to my questions about Microsoft's new clustering product announcement.
2) Grid Meter: High performance computing is not just about raw horsepower. It's about moving massive amounts of data from point to point, interoperability with other systems, and presentation of data and computational resources as virtualized entities, so that processes that need to access them don't have to spend a lot of time figuring out how to do it. How does Windows Compute Cluster Server 2003 address these data and interoperability considerations?
O'Rourke: In terms of moving massive amounts of data, Windows CCS will scale well beyond the mainstream division and department clusters. We already have orders from customers for 1,000+ nodes. And just last week InformationWeek reported:
"The group of Dell machines, which runs Microsoft's new Windows Compute Cluster Server 2003 software, contains 896 processor cores and can perform 4.1 trillion computations per second. The system has a very good chance of making the closely watched Top 500 list of the world's most powerful supercomputers, according to Jack Dongarra, a computer science professor at Tennessee who maintains the list."
There are a number of items in terms of interoperability:
We have a partnership with Platform Computing (the leading vendor for job schedulers) and have done the joint work needed to ensure our respective job schedulers can communicate with one another. Microsoft customers not doing HPC today will find that seamless integration with Windows Compute Cluster Server 2003 provides "meta-scheduler" functionality for multi-cluster scheduling and bi-directional job forwarding. Existing Platform LSF customers will find that seamless integration of LSF with Windows Compute Cluster Server 2003 enables transparent job forwarding between workgroups and the data center.
Windows Compute Cluster Server 2003 can take advantage of the 64-bit version of Services for UNIX 3.5 in order to run UNIX or Linux applications.
Windows Compute Cluster Server 2003 uses Active Directory for authorization and authentication of cluster users. If the IT environment in which the cluster is being installed does not already include an AD domain, users can set up an AD domain controller on the head node of the cluster. Microsoft also provides tools to integrate AD with other systems such as NIS.
Microsoft's Message Passing Interface (MPI) includes open source code based on MPICH2, a widely used reference implementation. By basing our own MPI on this implementation, we have made it easier for ISVs and developers to port existing HPC codes to Windows Compute Cluster Server 2003. We are actively working with Argonne National Lab, including funding and training on security reviews per the Trustworthy Computing (TwC) initiative, and will contribute code back to open source.
In terms of custom development - which is big in financial services, government and academia - Visual Studio 2005 includes a parallel debugger and support for OpenMP.
Last, Windows CCS supports GigE, InfiniBand and Myrinet.
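For readers who have not seen OpenMP, which O'Rourke mentions above in connection with Visual Studio 2005, here is a minimal sketch of the kind of loop it parallelizes. It is my own illustration and not something Microsoft supplied; the function name sum_of_squares is hypothetical. I am assuming a compiler with OpenMP enabled (the /openmp switch in Visual C++, -fopenmp in GCC). Without that switch the pragma is ignored and the loop simply runs serially with the same result.

```c
/* Illustrative only: sum_of_squares is a made-up name for this sketch. */

/* Returns 0^2 + 1^2 + ... + (n-1)^2.
 * With OpenMP enabled, the loop iterations are divided among threads.
 * The reduction clause gives each thread a private partial sum and
 * combines the partial sums when the loop finishes. */
double sum_of_squares(long n)
{
    double total = 0.0;
    long i; /* a signed integer loop index, declared outside the loop for older compilers */

    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < n; i++) {
        total += (double)i * (double)i;
    }
    return total;
}
```

Calling sum_of_squares(11) adds the squares of 0 through 10. The appeal for custom development is that the serial code stays intact: the only change is the single pragma line above the loop.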