"CPUs have a small number of cores, they are big, they are complex and they are brilliant at making a single task or a small number of tasks run fast. GPUs have hundreds of really tiny, power-efficient cores" that are optimized for throughput, Scott said.
A more distributed computing model is needed to scale performance, with transactions executed in parallel across CPUs and GPUs, Scott said. About 90 percent of the processing in Titan will be done on GPUs, with the residual serial code processed on CPUs, he said. Titan's estimated energy bill is US$9 million a year, while a CPU-only Titan at 20 petaflops would have had an energy bill of roughly $60 million a year, according to ORNL and Nvidia estimates.
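To put the two energy estimates side by side, the short Python sketch below works out the annual difference and the ratio between them. It uses only the two dollar figures quoted above; the function and variable names are illustrative and do not come from ORNL or Nvidia.

```python
# Annual energy bill estimates quoted in the article (ORNL and Nvidia), in US dollars.
HYBRID_TITAN_COST = 9_000_000      # Titan with CPUs and GPUs
CPU_ONLY_TITAN_COST = 60_000_000   # hypothetical CPU-only Titan at 20 petaflops


def energy_comparison(hybrid_cost, cpu_only_cost):
    """Return (annual savings, cost ratio) for the hybrid design versus CPU-only."""
    savings = cpu_only_cost - hybrid_cost
    ratio = cpu_only_cost / hybrid_cost
    return savings, ratio


savings, ratio = energy_comparison(HYBRID_TITAN_COST, CPU_ONLY_TITAN_COST)
print(f"Estimated annual savings: ${savings:,}")
print(f"CPU-only bill is about {ratio:.1f}x the hybrid bill")
```

On these estimates, the hybrid design would save about $51 million a year, with the CPU-only bill coming to roughly 6.7 times the hybrid one.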
Combining Tesla with a 64-bit ARM processor is a good idea, said Jim McGregor, principal analyst at Tirias Research. ARM processors with 64-bit addressing have a higher memory ceiling than current 32-bit ARM processors, which are limited to 4GB, not enough for supercomputing.
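The 4GB figure follows from the address width: with byte addressing, a 32-bit address can name 2^32 distinct bytes. The Python sketch below shows the arithmetic under that assumption. It gives the theoretical ceiling only, since real 64-bit chips typically wire up fewer than 64 address bits, and the helper name is illustrative.

```python
GIB = 2 ** 30  # bytes in one gibibyte


def address_ceiling_bytes(address_bits):
    """Bytes addressable with a flat, byte-addressed space of the given width."""
    return 2 ** address_bits


ceiling_32 = address_ceiling_bytes(32)
ceiling_64 = address_ceiling_bytes(64)

print(f"32-bit ceiling: {ceiling_32 // GIB} GiB")
print(f"64-bit theoretical ceiling: {ceiling_64 // GIB:,} GiB")
```

The 32-bit case comes out to 4 GiB, matching the limit cited above, while the theoretical 64-bit ceiling is about four billion times larger.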
"High performance on 32-bit just isn't happening," McGregor said. "It makes absolute sense if you have a 64-bit."
Future Tesla chips could be massively parallel, with a large number of CPU and GPU cores. The chips could be useful in hybrid computing models where some processing is done locally and some in the cloud, McGregor said. ARM cores offer a good balance of power and performance, handle data traffic efficiently, and could be used for data mining or financial transactions.
But faster x86 CPUs from Intel and AMD may be needed for complex scientific calculations, McGregor said.
Server architectures have changed over the years, and cores should be adopted "to meet the size of the data," McGregor said, adding that software support is also important.