A Quantitative Evaluation of Contemporary GPU Simulation Methodology

Abstract

Contemporary Graphics Processing Units (GPUs) are used to accelerate highly parallel compute workloads. For the last decade, researchers in academia and industry have used cycle-level GPU architecture simulators to evaluate future designs. This paper performs an in-depth analysis of commonly accepted GPU simulation methodology, examining how both the workload and the choice of instruction set architecture affect the accuracy of a widely used simulation infrastructure, GPGPU-Sim. We analyze numerous aspects of the architecture, validating the simulation results against real hardware. Based on a characterized set of over 1700 GPU kernels, we demonstrate that while the relative accuracy of compute-intensive workloads is high, inaccuracies in modeling the memory system result in much higher error when memory performance is critical. We then perform a case study using a recently proposed GPU architecture modification, Cache-Conscious Wavefront Scheduling. The case study demonstrates that the cross-product of workload characteristics and instruction set architecture choice can affect the predicted efficacy of the technique.

Publication
In Proceedings of the ACM on Measurement and Analysis of Computing Systems, Volume 2, Issue 2 (SIGMETRICS)
Akshay Jain
Master's Thesis, 2017.
Mahmoud Khairy
PhD Graduate, 2022.
Tim Rogers
Associate Professor of ECE