How the cloud and big compute are remaking HPC

High-performance computing projects require massive quantities of compute resources. Pairing simulation and specialized hardware with the cloud powers the breakthroughs of the future.


Roughly 25 years ago, a few open source technologies combined to make a robust, commercial Internet that was finally ready to do business and take your money. Dubbed the LAMP stack (Linux, Apache HTTP Server, MySQL, and PHP/Perl/Python), this open source combination became the standard development stack for a generation of developers.

Don’t look now, but we may well be on the cusp of another LAMP stack moment.

This time, however, the focus isn’t on building a new, online way to peddle dog food. Instead, a technology renaissance is underway to tackle algorithmically complex, large-scale workloads that consume massive quantities of compute resources. Think COVID-19 vaccines, new supersonic jets, or autonomous vehicles. The science and engineering world is shipping faster and delivering new innovations at a pace never witnessed before.

How? Cloud. But not just cloud.

The dawn of ‘big compute’ or ‘deep tech’

Cloud is perhaps too facile a description for what is happening. We lack a clever shorthand for this transformation, like a LAMP stack for the Internet. Something has suddenly freed PhD types to innovate on computing engines of immense complexity, powering algorithmically driven workloads that are changing our lives in much deeper ways than an early Friendster ever promised to deliver.

“High-performance computing” (HPC) is the most common tag associated with these workloads. But that was before public clouds became viable platforms for these new applications. Scan the Top500 list of the world’s fastest supercomputers and you’ll see a growing number based on public clouds. This isn’t a coincidence: On-premises supercomputers and massive Linux clusters have been around for decades (preceding the commercial Internet), but this new trend—sometimes dubbed “big compute” or “deep tech”—depends heavily on cloud.

As consulting firm BCG puts it, “The increasing power and falling cost of computing and the rise of technology platforms are the most important contributors. Cloud computing is steadily improving performance and expanding breadth of use.”

But this new “stack” isn’t just about cloud. Instead, it depends on three megatrends in technology: the rapidly increasing breadth and depth of simulation software, specialized hardware, and cloud. These are the technology building blocks that every fast-moving research and science team is leveraging today, and they explain why hundreds of startups have emerged to shake up long-moribund industries that had consolidated a decade or more ago.

Helping engineers move faster

Just like the LAMP stack magical moment, today’s big compute/deep tech moment is all about enabling engineering productivity. Cloud is critical to this, though it’s not sufficient on its own.

Take aerospace, for example. To design a new supersonic jet, an aerospace engineer would traditionally depend on an on-premises HPC cluster to simulate all the necessary variables related to liftoff and landing. Startup aerospace companies, by contrast, went straight to the cloud, with elastic infrastructure that has enabled them to model and simulate applications without queuing up behind colleagues for highly specialized HPC hardware. Less time building and maintaining hardware. More time experimenting and engineering. That’s the beauty of the big compute cloud approach.

Couple that with a diverse array of simulation software that enables new innovations to be modeled before complex physical things are actually built and prototyped. Specialized hardware, as Moore’s Law runs out of gas, powers these algorithmically complicated simulations. And the cloud jail-breaks all of this from on-premises supercomputers and clusters, making it an order of magnitude easier to create and run models, iterate and improve, and run them again before moving to physical prototypes. (To be clear, much of this big compute/deep tech is about building physical things, not software.)

What’s tricky about this domain is the custom hardware and software configurations required to run these workloads, and the sophisticated workflows required to optimize their performance. These algorithmically intensive workloads demand increasingly specialized GPUs and other newer chip architectures. Companies that are paying expensive PhDs to design the next great turbine or jet propulsion secret sauce don’t want to bog them down by forcing them to learn how to stand up machines with the right simulation software and hardware combinations.

“Fifteen years ago, any company in this HPC domain differentiated itself based on how well it ran its hardware on-premises, and basically placed a bet that Moore’s Law would continue to deliver consistently better performance on x86 architectures year over year,” said Joris Poort, CEO at Rescale, in an interview. “Today what matters most is speed and flexibility—making sure that your PhDs are using the best simulation software for their work, freeing them from becoming specialists in specialized big compute infrastructure so they can ship new innovations faster.”

Specialized supercomputers

Will every company eventually use simulation and specialized hardware in the cloud? Probably not. Today this is the domain of rockets, propulsion, computational biology, transportation systems, and the upper 1% of the world’s hardest computational challenges. But while big compute is used to crack the geekiest of problems today, we will most certainly see a new wave of Netflixes that topple the Blockbusters of the world using this LAMP stack combination of cloud, simulation software, and specialized hardware.

Copyright © 2021 IDG Communications, Inc.