At first, the serial and parallel results didn’t match, and the students had to learn more about how to parallelize effectively and clean up what they had already coded. “We missed some important relationships at first,” Dr. Rahman says. With some help from MATLAB, it took two graduate students about a month to get the app parallelization right.
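The article doesn’t say why the results disagreed, but one common culprit when serial code is parallelized is that floating-point addition is not associative, so a chunk-by-chunk parallel reduction can round differently than a straight serial loop. The sketch below (illustrative Python, not the team’s MATLAB code) simulates a four-way split to show the effect:

```python
# Illustrative only: simulate why a parallel reduction can give a
# slightly different answer than a serial sum. Floating-point
# addition rounds at each step, so changing the order of additions
# (as splitting work across workers does) can change the result.
import random

random.seed(0)
data = [random.uniform(-1e9, 1e9) for _ in range(100_000)]

# Serial reduction: one running total, left to right.
serial = sum(data)

# Simulated 4-way parallel reduction: each "worker" sums its own
# chunk, then the partial sums are combined.
chunks = [data[i::4] for i in range(4)]
parallel = sum(sum(chunk) for chunk in chunks)

# The two totals are typically close but not bit-identical.
print(serial, parallel, abs(serial - parallel))
```

Differences like this are benign rounding noise, but they can mask (or mimic) real bugs, which is why parallelized code usually needs tolerance-based comparisons rather than exact-equality checks during validation.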
Dr. Rahman feels that the team’s diverse expertise was a large factor in the project’s success. One of the grad students had deep knowledge of molecular-level data quality, biomarkers, and the relevance of different data types; another offered extensive hardware expertise; and the IT person had long experience working effectively with vendors. MathWorks, the maker of MATLAB, helped determine which toolboxes were relevant to the task.
“When we went to MATLAB, they were just getting started with HPC,” Dr. Rahman says. “I hope they will start to pay more attention, as it would be nice if they were all ready so we didn’t have to spend months on this.”
There were also hardware communications glitches.
“At first we had some problems controlling the servers as they talked to each other and the head node,” Dr. Rahman says. “Sometimes they wouldn’t respond. In other cases we wouldn’t see any data coming through.” Solving the problem took a lot of reconfiguring and reconnecting. “Perhaps we were giving the wrong commands at first. We’re not sure,” he adds. There were also problems with incorrect server and software license manager configurations.
Dr. Rahman says that managing the cluster has been relatively trouble-free with Windows Compute Cluster Server 2003. If he could do it all over again, he adds, he would send his students to Microsoft for a longer stay to absorb more of what Microsoft itself has learned about building clusters with HP servers. The use of HPC has enabled ARI researchers to dive much more deeply into molecular data, not only analyzing differences in relationships among disparate classes of subjects, but also revealing subtler but important variations within each class.
Uptime counts for Merlin
Whereas most HPC implementations are the province of scientists and engineers hidden away in R&D departments, Merlin Securities’ HPC solution interfaces directly with its hedge fund customers. That’s why 24/7 uptime and security were key HPC design requirements for Merlin, right alongside performance.
“We had to be extremely risk-averse in designing our cluster and choosing its components,” says Mike Mettke, senior database administrator at Merlin.
A small prime brokerage firm serving the hedge fund industry, Merlin must contend with several larger competitors that benefit significantly from economies of scale. Morgan Stanley, Merrill Lynch, and Bear Stearns, for example, run large mainframes that analyze millions of trades at the end of the day and return reports via batch processing the next morning. Merlin stakes its competitive edge on using its HPC cluster to deliver trading information in real time and letting customers slice and dice data multiple ways to uncover valuable insights, such as a day’s analyst trading performance compared with other analysts, other market securities, and numerous market benchmarks. “We focus on helping clients explain not only what happened but why it happened,” says CTO Amr Mohamed.