POSTER: Pagoda: A Runtime System to Maximize GPU Utilization in Data Parallel Tasks with Limited Parallelism

Tsung Tai Yeh, Amit Sabne, Putt Sakdhnagool, Rudolf Eigenmann, Tim Rogers

September 2016

Abstract

Massively multithreaded GPUs achieve high throughput by running thousands of threads in parallel. To fully utilize the hardware, contemporary workloads spawn work to the GPU in bulk by launching large tasks, where each task is a kernel that contains thousands of threads that occupy the entire GPU.

Type

Conference paper

Publication

In 2016 International Conference on Parallel Architecture and Compilation Techniques (PACT)

Tsung Tai Yeh

PhD Graduate, 2020.

Tim Rogers

Associate Professor of ECE