OMEN Scaling on 222,720 cores
OMEN can require significant computational resources, and it uses them very efficiently. For the first time, we were able to run on 222,720 cores with 4-level MPI and hybrid MPI/OpenMP parallelism.
In April 2008 we obtained almost perfect scaling up to 32,768 cores on Ranger@TACC. At that time, the system software on that machine could not start more than 32,768 MPI processes, which kept us from scaling to higher core counts. We therefore began to develop a hybrid implementation in which the two lowest levels of parallelism, domain decomposition and energies, can be dispatched to parallel OpenMP threads.
In October 2008 we obtained almost perfect scaling of OMEN up to 59,904 cores. The scaling can be obtained with pure 4-level MPI parallelism (the system software has since been updated to allow this) as well as with hybrid MPI/OpenMP 5-level parallelism.
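The difference between the pure MPI layout and the hybrid layout can be illustrated with a small Python sketch. It is not OMEN code: the level sizes are made-up values, and the names `outer_a` and `outer_b` are placeholders for the two upper levels, which the text above does not name. Only the two lowest levels, energies and domain decomposition, come from the description above.

```python
from math import prod

# Hypothetical level sizes, outermost level first. Only "energy" and
# "domain" are named in the text; "outer_a" and "outer_b" are placeholders.
LEVELS = {"outer_a": 4, "outer_b": 8, "energy": 4, "domain": 4}
sizes = list(LEVELS.values())


def rank_to_coords(rank, level_sizes):
    """Split a flat worker index into one index per level.

    This is a mixed-radix decomposition in which the last (lowest)
    level varies fastest.
    """
    if not 0 <= rank < prod(level_sizes):
        raise ValueError("rank out of range")
    coords = []
    for size in reversed(level_sizes):
        rank, idx = divmod(rank, size)
        coords.append(idx)
    return tuple(reversed(coords))


def hybrid_layout(level_sizes, threaded_levels=2):
    """Return (mpi_processes, threads_per_process) when the lowest
    `threaded_levels` levels are handled by threads instead of MPI."""
    mpi_processes = prod(level_sizes[:-threaded_levels])
    threads_per_process = prod(level_sizes[-threaded_levels:])
    return mpi_processes, threads_per_process


total_workers = prod(sizes)             # pure MPI: one process per worker
mpi_processes, threads = hybrid_layout(sizes)

print("pure MPI processes:", total_workers)
print("hybrid:", mpi_processes, "processes x", threads, "threads")
print("worker 5 sits at", dict(zip(LEVELS, rank_to_coords(5, sizes))))
```

With these assumed sizes, the hybrid layout covers the same number of cores with far fewer MPI processes, which is the property that matters when the system software caps the number of processes that can be started.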
In January of 2009 another Track 2 system became available at the University of Tennessee and Oak Ridge. On that Cray XT4 machine we were able to scale to the available 65,536 cores.
In June of 2009 we were able to scale on the full Jaguar XT5 (dual quad-core AMD processors) at the Oak Ridge National Lab to 147,456 cores, delivering 500 TFlops.
In November of 2009 we were able to scale on the newly upgraded Jaguar XT5 (dual hex-core AMD processors) at the Oak Ridge National Lab to 222,720 cores, delivering 860 TFlops.
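As a rough reading of the two Jaguar figures, the delivered throughput can be divided by the core count. The short Python sketch below does only that arithmetic on the numbers quoted above. It makes no claim about peak performance or parallel efficiency, and the two runs used different processors, so the per-core values are not directly comparable.

```python
# (cores, delivered TFlops) as quoted in the text for the two Jaguar XT5 runs.
runs = {
    "June 2009": (147_456, 500.0),
    "November 2009": (222_720, 860.0),
}


def gflops_per_core(cores, tflops):
    """Delivered GFlops per core (1 TFlops = 1000 GFlops)."""
    return tflops * 1000.0 / cores


per_core = {name: gflops_per_core(*values) for name, values in runs.items()}

for name, value in per_core.items():
    print(f"{name}: {value:.2f} GFlops per core")
```

This gives roughly 3.39 GFlops per core for the June run and 3.86 GFlops per core for the November run.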
scaling to 222,720 cores on Jaguar@Oak Ridge
scaling to 147,456 cores on Jaguar@Oak Ridge
scaling to 65,536 cores on Kraken@UTK
scaling to 59,904 cores on Ranger@TACC
scaling to 32,768 cores on Ranger@TACC
scaling to 18k cores on Kraken and Ranger