Programming Parallel Machines
Learning Objectives: After taking this course, a student should be able to (1) demonstrate a basic understanding of computer architecture and how it affects parallel program performance, (2) write a parallel program using OpenMP, Pthreads, MPI and CUDA, (3) parallelize a program using the tools of (2) to run on a multicore machine, a distributed-memory multi-node machine and an accelerator, and (4) compute the efficiency and performance of a parallel program and measure its scalability.
This course prepares students to perform parallel computations in a variety of engineering fields and the data sciences by presenting methods and techniques for programming parallel computers, from multicore machines to high-end parallel architectures. A short introduction to computer architecture and general parallelization concepts will be presented. The parallel architectures considered are shared-memory and distributed-memory multiprocessor systems and accelerators such as Graphics Processing Units (GPUs). Programming paradigms for these machines will be compared, including directive-based (OpenMP), thread-based (POSIX threads), message-passing (MPI) and a language targeting accelerators (CUDA). Methodologies for analyzing and improving the performance of parallel programs will be discussed. There will be a class project in which each student develops and tunes a parallel application.
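As a rough illustration of the performance analysis mentioned above, the sketch below computes speedup and parallel efficiency from wall-clock timings, using the common definitions S(p) = T(1) / T(p) and E(p) = S(p) / p. The function names and the timing values are hypothetical and chosen only for illustration. They are not course material or measured results.

```python
def speedup(t_serial, t_parallel):
    """Speedup S(p) = T(1) / T(p): how much faster the parallel run is."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, workers):
    """Efficiency E(p) = S(p) / p: fraction of ideal linear speedup achieved."""
    return speedup(t_serial, t_parallel) / workers


# Hypothetical wall-clock times in seconds, keyed by number of workers.
# These values are illustrative assumptions, not measurements.
timings = {1: 120.0, 2: 64.0, 4: 36.0, 8: 24.0}

t1 = timings[1]
for p, tp in sorted(timings.items()):
    s = speedup(t1, tp)
    e = efficiency(t1, tp, p)
    print(f"p={p}: speedup={s:.2f}, efficiency={e:.2f}")
```

Efficiency that falls as the worker count grows, as in these made-up numbers, is a typical sign of limited scalability: overheads such as communication, synchronization or serial sections take up a growing share of the run time.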