The Top500 list of the world's most powerful supercomputers passed a milestone Wednesday with the first system to achieve peak performance of 1 petaflop/s, or one quadrillion floating point operations per second.
The system, called Roadrunner, was built by IBM for the U.S. Department of Energy’s Los Alamos National Laboratory. It's based on an advanced version of the Cell processor used in Sony's PlayStation 3, and its performance far outstrips that of the previous fastest system, another IBM computer that topped out at 478.2 teraflop/s.
Erich Strohmaier, a computer scientist at Lawrence Berkeley National Laboratory, was one of the founding editors of the Top500 list back in 1993. He talked with IDG News Service about the performance gains the list has seen, the quad-core processors that are coming to dominate it, and mistakes that can creep in when the list is put together. Following is an edited transcript:
IDG News Service: Did you expect to see performance of a petaflop/s when you started this list?
Erich Strohmaier: No, 15 years ago the big question was whether all 500 systems together would amount to 1 teraflop -- and it was just above 1 teraflop, all 500 of them together.
IDGNS: Where does the performance of the IBM system come from? Is it mainly the Cell processor or advances somewhere else?
Strohmaier: For the Roadrunner it's a very dense package in terms of the computing power. The advanced Cell is important, with eight of those [cores] on a single processor ... but it's also because it's tightly integrated. It's a blade system so you get a lot of these in a rack.
IDGNS: Does that cut down on latency between the blades?
Strohmaier: Yes, you lose that latency, and you also need that kind of packaging to cut down on the power. Using the Cell is one way, but using these tightly integrated blade systems is another way to control power.
IDGNS: Does someone go around and audit these systems? How do you know the results are genuine?
Strohmaier: In the first place it's an honor system, but of course for the big systems we ask them to run the benchmark and we want to see the output files.
IDGNS: Have you ever caught anyone cheating?
Strohmaier: Not on the larger-scale systems, but there are always mistakes on the list. Big companies don't really know precisely how much [equipment] they've sold where, because they don't track sales by system, they track them by components. So they know they've shipped so many blades of a certain type to the U.K., but they don't know how they are configured at customer sites. So yes, there have been mistakes made.
The more common mistake is that there are still systems on the list even though they have been decommissioned, because companies don't usually tell us when they shut their systems down. The thing that keeps the list healthy is that we lose, over a typical six-month interval, about 200 to 220 systems. So if we made some mistakes they'll be out of the list very quickly. This time we had record turnover: we lost 300 systems.
IDGNS: What do you attribute that to?
Strohmaier: We've seen record turnovers a few times in the last three or four lists; that's a reflection of the market adopting the new quad-core processors. It's the dominant architecture in terms of how many cores are used, and it became that very quickly. Lots of these quad-cores are Intel Harpertown (the Xeon 5400 series); there are already more Harpertown systems on the list than Clovertown (the earlier Xeon 5300 series). It shows that our supercomputing community is ready to use those processors, and Linpack (the benchmark used to rank the supercomputers) can use a lot of features of the Harpertown and Clovertown quad-cores.
IDGNS: Intel seems to be increasingly dominant on the list. Is that because AMD's quad-core chips were delayed coming to market?
Strohmaier: Yes, I certainly agree with that. When AMD came out with their dual-core processors they had a head start compared to Intel and gained a larger share of the list. In the last year to a year-and-a-half that has reversed and Intel's share has increased more. One reason has been the delays in AMD's quad-cores; the other is that for Clovertown, Intel introduced four floating point operations per cycle per core. AMD was late doing that; they do it now with the new quad-core but they didn't do it with the dual-core. And the Linpack benchmark and applications similar to it can use this four-floating-point feature, so they show up better on the list.
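To illustrate why the number of floating point operations per cycle matters for a benchmark like Linpack, here is a minimal sketch of the usual theoretical-peak arithmetic (cores times clock rate times operations per cycle per core). The function name, the 2.5 GHz clock, and the two-operations-per-cycle figure for the older design are illustrative assumptions, not numbers from the interview or from any vendor specification.

```python
def peak_gflops(cores: int, clock_ghz: float, flops_per_cycle: int) -> float:
    """Theoretical peak in gigaflop/s: cores x clock (GHz) x flops per cycle per core."""
    return cores * clock_ghz * flops_per_cycle


# Hypothetical chips at the same illustrative 2.5 GHz clock (assumed values).
quad_core_4_wide = peak_gflops(cores=4, clock_ghz=2.5, flops_per_cycle=4)
dual_core_2_wide = peak_gflops(cores=2, clock_ghz=2.5, flops_per_cycle=2)

print(f"quad-core, 4 flops/cycle: {quad_core_4_wide} gigaflop/s")
print(f"dual-core, 2 flops/cycle: {dual_core_2_wide} gigaflop/s")
print(f"ratio: {quad_core_4_wide / dual_core_2_wide}")
```

Under these assumptions, doubling both the core count and the operations per cycle quadruples the theoretical peak per processor at the same clock rate. Measured Linpack results will be lower than the theoretical peak.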
IDGNS: Was it a scramble to get the results in on time? Some people wondered if Roadrunner would be ready.
Strohmaier: Yes, for Roadrunner it wasn't too much of a scramble but they submitted it in time. But they still haven't used the full machine. The machine is in 18 segments and they used only 17 of those, so they still have room to grow in terms of doing a new measurement and squeezing out a little more. It was amazing they managed to do the petaflop.
IDGNS: Why did you start the list? Is it just for fun or does it serve another purpose?
Strohmaier: It was fun, and also to get a handle on the market shares for supercomputers. My colleague Professor Hans Meuer started doing statistics in the late 1980s. That was the golden age of vector systems so it was easy to count supercomputers: you just counted the vector systems. Then in the early '90s when the first parallel systems were becoming important that method didn't work, so we scratched our heads and said 'What is the definition of a supercomputer?' We wanted a system that would scale over time because performance scales so quickly -- it's scaled 10,000-fold since we started the list. So we said, 'OK, let's pick a fixed number of computers that we know are supercomputers,' and there were 500 vector systems at the time, so that's why we picked the number 500.
IDGNS: Have you thought about making the list longer or shorter?
Strohmaier: We looked at growing it up to a thousand, but it would be a lot more work. We also looked at whether there would be a natural point to cut it off earlier. There is some indication that around 50 to 100 would be the cut-off point for really big scientific supercomputers, because after 50 to 100 you see a lot of commercial systems on the list and they tend to have different features; it's a bit of a different market. But we're happy to have 500 because what happens in the top 50 is largely driven by what happens in the 100 to 400 range, because that's where companies design products for and then they adjust them for the high end. There are very few companies who design purely for the high end.
IDGNS: Any predictions for the next list?
Strohmaier: I'm not sure we'll have a second petaflop system yet, but it will come. Quad-core systems are running strong; we'll see more of them. Aside from the petaflop system and the quad-core systems, the other interesting thing this time was that we included power consumption for the first time. We tried to learn from other efforts, we talked about this for a while, and now we have power consumption for half the systems on the list. We only list measured numbers, not peak numbers, because they can be very misleading.
IDGNS: Would you ever take power efficiency into consideration when you decide who has the best-performing system?
Strohmaier: No, we rank things by size, so you need something that grows with the size of the object. Density is a feature of the product but it doesn't tell you anything about its size.
What we envision is maybe having an adjustment to the performance-based ranking which takes these other features of a machine into consideration, like power consumption and memory utilization. But using only power efficiency is the wrong way to go. It's an additional piece of information that is becoming important but it doesn't define what a supercomputer is.
IDGNS: Would that become a secondary list that takes power and memory consumption into consideration?
Strohmaier: It would be secondary, yes. We'd always keep for tradition and for comparison the list of the biggest systems, but there might be a button on the home page where you could reorder by something like power or memory utilization, depending on what's important for you.
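The kind of secondary reordering described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `System` class, its field names, and every number in the sample data are hypothetical, and the Top500 project's actual data format and site behavior are not described in the interview. Systems with no measured power figure are simply left out of the efficiency view.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class System:
    name: str
    rmax_tflops: float         # measured Linpack performance, teraflop/s
    power_kw: Optional[float]  # measured power in kilowatts; None if not reported

    def mflops_per_watt(self) -> Optional[float]:
        """Measured megaflop/s per watt, or None when power was not reported."""
        if self.power_kw is None:
            return None
        # teraflop/s -> megaflop/s is x1e6; kilowatts -> watts is x1e3.
        return (self.rmax_tflops * 1e6) / (self.power_kw * 1e3)


# Hypothetical entries; none of these values come from the real list.
systems: List[System] = [
    System("System A", rmax_tflops=1000.0, power_kw=2500.0),
    System("System B", rmax_tflops=500.0, power_kw=1000.0),
    System("System C", rmax_tflops=100.0, power_kw=None),
]

# Primary ranking: by size (measured performance), largest first.
by_performance = sorted(systems, key=lambda s: s.rmax_tflops, reverse=True)

# Secondary view: by power efficiency, only for systems with measured power.
measured = [s for s in systems if s.power_kw is not None]
by_efficiency = sorted(measured, key=lambda s: s.mflops_per_watt(), reverse=True)

print([s.name for s in by_performance])
print([s.name for s in by_efficiency])
```

In this made-up data the largest system is not the most efficient one, which is the distinction drawn above between ranking by size and reordering by a feature such as power.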