Analyzing Machine Learning Workloads Using a Detailed GPU Simulator

Jonathan Lew, Deval Shah, Suchita Pati, Shaylin Cattell, Mengchi Zhang, Amruth Sandhupatla, Christopher Ng, Negar Goli, Matthew D. Sinclair, Tim Rogers, Tor M. Aamodt

March 2019

Abstract

Machine learning (ML) has recently emerged as an important application driving future architecture design. Traditionally, architecture research has used detailed simulators to model and measure the impact of proposed changes. However, current open-source, publicly available simulators lack support for running a full ML stack like PyTorch. High-confidence, cycle-accurate simulations are crucial for architecture research, and without them it is difficult to rapidly prototype new ideas. In this paper, we describe changes we made to GPGPU-Sim, a widely used GPU simulator, to run ML applications that use cuDNN and PyTorch, two widely used frameworks for running Deep Neural Networks (DNNs). This work has the potential to enable significant microarchitectural research into GPUs for DNNs. Our results show that the modified simulator, which has been made publicly available with this paper (source code available at https://github.com/gpgpu-sim/gpgpu-sim_distribution, dev branch), provides execution time results within 18% of real hardware. We further use it to study other ML workloads and demonstrate how the simulator identifies opportunities for architectural optimization that prior tools are unable to provide.

Type

Poster-Conference

Publication

In 2019 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)

Mengchi Zhang

PhD Graduate, 2022.

Tim Rogers

Associate Professor of ECE