July 11, 2019

Purdue and Lawrence Livermore researchers show how to reliably approximate computation on GPUs

Researchers in Purdue University's School of Electrical and Computer Engineering are working on ways to approximate computation on Graphics Processing Units (GPUs) so they run fast while keeping errors in the results within tolerable limits. Saurabh Bagchi, professor of electrical and computer engineering at Purdue, is collaborating with a team from the US Department of Energy’s Lawrence Livermore National Lab (LLNL) to solve this problem.
Computer science researcher Dr. Ignacio Laguna (center), of the US Department of Energy’s Lawrence Livermore National Laboratory (LLNL), receives the best paper award for the joint work between Purdue and LLNL at the International Supercomputing Conference (ISC), held June 16-20, 2019, in Frankfurt, Germany.

Computation on GPUs can be energy-hungry, which limits its use in settings such as Internet of Things (IoT) systems. One promising approach is approximate computing, in which parts of a computation are approximated so that they run faster or use less energy, at the cost of some error. Before this work, it was not known how to perform approximate computation on GPUs while bounding the resulting error.
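
One familiar form of this trade-off is running arithmetic at lower floating-point precision. The sketch below is an illustration only and is not taken from the project: it runs on the CPU in plain Python, emulates single precision by rounding every intermediate result with the standard `struct` module, and compares the error of a long summation in single and double precision. The helper names (`to_f32`, `sum_f32`, `sum_f64`) and the test data are assumptions made for this example.

```python
import struct


def to_f32(x: float) -> float:
    """Round a Python float (double precision) to the nearest single-precision value."""
    return struct.unpack("f", struct.pack("f", x))[0]


def sum_f64(values):
    """Accumulate in double precision (Python's native float)."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_f32(values):
    """Accumulate with every input and every partial sum rounded to single precision."""
    total = 0.0
    for v in values:
        total = to_f32(total + to_f32(v))
    return total


# Illustrative data: 100,000 copies of 0.1, whose exact sum is 10,000.
values = [0.1] * 100_000
exact = 10_000.0

err64 = abs(sum_f64(values) - exact)
err32 = abs(sum_f32(values) - exact)

print(f"double-precision error: {err64:.3e}")
print(f"single-precision error: {err32:.3e}")
```

The single-precision sum drifts much further from the exact value than the double-precision one. On hardware where lower-precision instructions execute faster, that larger error is the price paid for the speedup, and the question becomes how much of it an application can tolerate.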

"Being able to make systematic trade-offs in the accuracy versus reliability space becomes increasingly important as we seek to execute applications at large scales driven by increases in problem size," says Bagchi. "The opportunities for this in GPUs are high since they have instructions with different degrees of precision and these take different execution times."

The project team at LLNL is led by Dr. Ignacio Laguna.

“The demonstration of this approach on large GPU clusters at LLNL and on problems that we care about is hugely significant,” says Laguna.

The research is funded through a contract from LLNL and was recently highlighted in two conference presentations. The first was at the 33rd ISC High Performance conference, held June 16-20, 2019, in Frankfurt, which brought together over 3,500 researchers and commercial users. The paper, titled “GPUMixer: Performance-Driven Floating-Point Tuning for GPU Scientific Applications,” won the Hans Meuer Award, the conference's best paper award. The second part of the work was presented at the 33rd International Conference on Supercomputing (ICS), held June 26-28 at the Phoenix Convention Center as part of ACM’s Federated Computing Research Conference (FCRC), which was attended by over 2,800 people. The project team includes Purdue students Pradeep Kotipalli and Ranvijay Singh (now at NVIDIA) and research scientist Dr. Paul Wood (now at Johns Hopkins University).

The work presents a novel technique for estimating the loss of accuracy caused by a specific combination of precisions for the variables in a program. It does this through static analysis of the program. Then, at runtime, it efficiently searches the large space of possible precision assignments to determine the optimal one. Without this project's technical contribution, the search itself would take years because of the large number of variables and the number of precision combinations that are possible.
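
To give a sense of why the search space is so large, here is a toy sketch in Python. It is not the project's algorithm: the variable names, the per-variable error and time-saving estimates, and the assumption that errors simply add up are all hypothetical, chosen only for illustration. It tries every assignment of "single" or "double" precision to each variable and keeps the fastest one whose estimated error stays within a bound.

```python
from itertools import product

# Hypothetical estimates (NOT from the paper): the error introduced and the
# time saved when a given variable is demoted from double to single precision.
ESTIMATES = {
    "a": {"error": 1e-7, "time_saved": 3.0},
    "b": {"error": 5e-4, "time_saved": 4.0},
    "c": {"error": 2e-7, "time_saved": 1.5},
    "d": {"error": 1e-3, "time_saved": 2.0},
}


def best_assignment(estimates, error_bound):
    """Exhaustively try all 2**n precision assignments.

    Returns the assignment with the largest total time saved whose summed
    error estimate does not exceed error_bound. Summing per-variable errors
    is a simplifying assumption of this sketch.
    """
    names = sorted(estimates)
    best, best_saved = None, -1.0
    for choice in product(("double", "single"), repeat=len(names)):
        demoted = [n for n, p in zip(names, choice) if p == "single"]
        error = sum(estimates[n]["error"] for n in demoted)
        saved = sum(estimates[n]["time_saved"] for n in demoted)
        if error <= error_bound and saved > best_saved:
            best, best_saved = dict(zip(names, choice)), saved
    return best, best_saved


tight, tight_saved = best_assignment(ESTIMATES, error_bound=1e-6)
loose, loose_saved = best_assignment(ESTIMATES, error_bound=1e-2)
print(tight, tight_saved)
print(loose, loose_saved)
```

With four variables the loop visits 16 assignments. Because the count doubles with every additional variable, exhaustive enumeration quickly becomes impractical for programs with many variables, which is why an efficient search strategy matters.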

Bagchi says research on approximate computation with reliability guarantees is needed and is still at a nascent stage. The project is now applying these techniques to additional realistic programs of relevance to the US DOE.

Videos of the authors of the work explaining the significance of the discovery can be found at:
