Top500 shows growing inequality in supercomputing power

The top supercomputers are getting faster while development of midlevel systems is starting to stagnate

Supercomputing power is being concentrated in a smaller number of machines, according to the latest Top500 list of high-performance computers. Keepers of the list are uncertain how to parse that trend.

The first 17 entrants in the latest supercomputer ranking produce half of all the supercomputing power on the list, which totaled over 250 petaflop/s (quadrillions of calculations per second), noted Erich Strohmaier, an organizer of the twice-yearly Top500 ranking of the world's most powerful supercomputers. He was speaking at a Tuesday evening panel at the SC2013 supercomputer conference.


The first-place entrant alone, the Chinese Tianhe-2 system, brought in 33.86 petaflop/s.

"The list has become very top heavy in the last couple of years," Strohmaier said. "In the last five years, we have seen a drastic concentration of performance capabilities in large centers."

The organizers of the Top500, however, are unsure if the trend bodes ill for supercomputing in general. Could it signal a decline in supercomputing overall, or a concentration of supercomputing's investigative powers among fewer government agencies and large companies?

"We don't know what it actually means," said Horst Simon of Lawrence Berkeley National Laboratory, one of the organizers of the Top500. "But it is important to exhibit the trend and have a discussion."

To characterize the depth of this "anomaly," as he called the trend, Strohmaier used a measure of statistical dispersion called a Gini Coefficient, which quantifies how unevenly some resource is distributed. The Gini Coefficient, which is often used to measure the wealth distribution of nations, can range from 0, where the resource is spread evenly among all the holders, to 1, where one party holds all of the resources.

The list scored a Gini Coefficient of 0.6, which is quite high, Simon noted. By way of comparison, were the Top500 supercomputers a nation, it would have greater inequality in computation than all but a few countries have today in terms of wealth distribution. Simon jokingly called it "the rich-getting-richer phenomenon of supercomputing."
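For readers who want to see how such a score is computed, here is a minimal Python sketch of the textbook Gini formula for a list of non-negative values. The article does not say exactly how Strohmaier calculated his figure, so this is an illustration rather than his method, and the performance numbers in the usage example are hypothetical, not taken from the Top500 list.

```python
def gini(values):
    """Return the Gini coefficient of a list of non-negative numbers.

    0 means the resource is spread perfectly evenly. For a finite list of
    n holders, the maximum is (n - 1) / n, which approaches 1 as n grows.
    """
    xs = sorted(values)  # the formula below assumes ascending order
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        raise ValueError("need at least one positive value")
    # Standard closed form on sorted data, with ranks i = 1..n:
    #   G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Hypothetical performance figures in petaflop/s, for illustration only.
    hypothetical_rmax = [30.0, 15.0, 8.0, 4.0, 2.0, 1.0, 0.5, 0.5]
    print(f"Gini coefficient: {gini(hypothetical_rmax):.2f}")
```

An evenly spread list such as four identical systems scores 0, while a list in which a single system holds all the performance scores (n - 1) / n, which is 0.75 for four systems.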

Drilling further down into the metrics, Strohmaier found no major differences between the buying habits of governments and industry. Both parties are buying fewer midsized systems and concentrating their efforts on building fewer, larger systems.

The trend could be problematic because having fewer, larger systems might, over time, reduce the number of administrators and engineers skilled in running high-performance computers. On the other hand, it might not be a problem, since most of the largest systems are shared among many users, such as all the researchers from a nation's universities.

One member of the audience for the panel, Alfred Uhlherr, a research scientist for Australia's Commonwealth Scientific and Industrial Research Organization (CSIRO), suggested another possible cause.

A number of organizations he knows of, both governmental and industrial, decline to participate in the Top500, knowing that their systems would not rank that high on the list. Nations such as China, or companies such as IBM, can generate positive publicity for themselves by being positioned near the top of the list. For entrants that might appear in the bottom reaches of the list, the benefits of getting on the list may not be worth the effort.

Not helping in this regard is the sometimes laborious Linpack benchmark that supercomputers are required to run to be considered for the voluntary Top500.

For instance, the U.S. Department of Energy Lawrence Livermore National Laboratory's Sequoia machine, which ranked third on the current list with 17 petaflop/s, had to run Linpack for over 23 hours to get its results, noted Jack Dongarra, another one of the list's curators, and a co-creator of Linpack.

That night, Dongarra suggested that Linpack, created in the 1970s, is no longer the best metric to use to estimate supercomputer performance. He championed the use of a new metric he also helped to create, called the High Performance Conjugate Gradient (HPCG).

"In the 1990s, Linpack performance was more correlated with the kinds of applications that were run on the high performance systems. But over time that correlation has changed. There is mismatch now between what the benchmark is reporting and what we are seeing from applications," Dongarra said.

Nonetheless, many in attendance at the conference still find the Linpack-driven Top500 viable. CSIRO's Uhlherr said his organization still studies the list closely, not so much for the Linpack ratings, but to observe which industries, such as energy companies, are using supercomputers, as a way of ensuring Australia is staying competitive in these fields.

Joab Jackson covers enterprise software and general technology breaking news for The IDG News Service. Follow Joab on Twitter at @Joab_Jackson. Joab's e-mail address is Joab_Jackson@idg.com
