Raise Your Hand and Ask: What’s PGAS?


Note: Most people don’t want to be the uncool one to raise their hand and ask a question, but in many cases we really should. These occasional “Raise Your Hand and Ask” posts highlight cool “buzzwords” you may have heard. My aim isn’t just to explain what they mean (that you can look up), but also why they matter.

What does “PGAS” mean – and why should I care?

 The short answer: Partitioned Global Address Space. The longer answer: a way to program as if there is “shared memory” between threads of a program (when there is not) in a way that allows for good performance.

PGAS is hardly new; I’ve seen PGAS supporters promoting it for decades now. But I’m ready to predict that PGAS is poised to become more popular in the next few years, because the hardware it really needs is finally appearing. Either PGAS jumps in popularity in the next ten years, or we can declare it dead.

Imagine five computers sharing an array

Imagine computers connected by cables. Normally, a program runs on a single computer and uses only that computer’s memory; communication between computers is done with messages. The concept of PGAS is to give any computer a way to access memory on any other computer. Since the hardware will not really allow normal memory accesses to do this, the compiler of a PGAS language translates remote reads into request-return messages, and remote writes into messages that deliver the data.

[Diagram by the author: five computers connected by cables, sharing the array MYDATA]

A typical PGAS program might have an array MYDATA that is split across our five computers. It would look like an array MYDATA[X][Y], but the Y subscript would indicate which computer the data lives on. On computer #1, MYDATA[10][1] = MYDATA[10][2] + MYDATA[10][3] + MYDATA[10][4] + MYDATA[10][5] would read four numbers from the other computers, add them, and store the result in the local part of the array (which any computer can read).

Imagine that MYDATA is declared to have 500 elements: 100 in the first subscript times five computers. We can think of that as five copies of 100 elements each, or as 500 elements distributed across five computers. (PGAS does not dictate how we think of it or use it.)

PGAS is for clustered computers

My diagram shows computers that are fairly loosely connected. This is not really the target of PGAS systems, because the time for messages between the computers is very long (high latency). However, latency is the only obstacle, and it may diminish over time; PGAS will work, functionally, on any collection of connected computers. I personally like to think of it this way when programming, but I run on systems with much better performance than the one in my diagram.

The real domain for PGAS is what we call clusters, including the really big clusters known as supercomputers. The interconnects in clusters are racing to offer lower and lower latency. PGAS languages have become tolerable or better on today’s clusters, and many expect advances in the next few years to make PGAS languages viable for a broader class of applications than ever before.

PGAS encourages “shared memory programming”

It’s a double-edged sword that PGAS recreates the familiar and preferred “shared memory” programming model. Shared memory programming is perceived as easier to learn and program in, but shared mutable state is really the root of many parallel programming bugs and performance issues. Today, virtually all distributed-memory applications on supercomputers pass messages via a library called MPI (Message Passing Interface). But the pressure to support PGAS is high, and even the latest MPI standard includes one-sided communication extensions that are really PGAS as well.

Try it yourself

C and C++ programmers can look to UPC (Unified Parallel C) as an extension for PGAS support. One quick way to get started is GNU UPC, an extension of gcc. The Fortran 2008 standard offers “coarrays,” which are currently supported by the Intel, GNU, and Cray compilers. I have published some examples of PGAS code online to show what it looks like in more detail.

PGAS is definitely worth another look, as it’s poised to gain in popularity in the upcoming years. After years of theoretical interest, three factors make PGAS finally practical for widespread use: (1) large numbers of cores connected coherently, including the Intel Xeon Phi products; (2) low-latency interconnects to bridge between processor/memory systems; (3) growing software and standards support for PGAS. PGAS won’t take over the world, but it will take a seat as a key programming method for many applications in the future.
