Task 005/006: Neural Fabrics and Programming and Evaluation Framework

Event Date:	September 12, 2019
Time:	2:00pm ET/ 11:00am PT

Aayush Ankit, PhD Student, Purdue University
PANTHER: A Programmable Architecture for Neuromorphic Training Harnessing Energy-efficient ReRAM

Abstract:

The wide adoption of deep neural networks has been accompanied by ever-increasing energy and performance demands due to the expensive nature of training them. Numerous special-purpose architectures have been proposed to accelerate training, including both digital designs and hybrid digital-analog designs that use ReRAM crossbars. ReRAM-based accelerators have demonstrated the effectiveness of ReRAM crossbars at performing the matrix operations that are prevalent in training. However, they still suffer from inefficiency because they use serial reads and writes to perform the weight gradient and update step.

A few works have demonstrated the possibility of performing outer products in crossbars, which can be used to realize the weight gradient and update step without the use of serial reads and writes. However, these works have been limited to low-precision operations, which are not sufficient for typical training workloads. Moreover, they have been confined to a limited set of training algorithms for fully connected layers only.
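As a rough illustration of why the weight gradient and update step maps onto an outer product, the sketch below compares a cell-by-cell update with a single rank-1 update for one fully connected layer. It models only the arithmetic, not ReRAM device behavior or the PANTHER design itself; the function names, the learning rate `lr`, and the access counter are assumptions made for this example.

```python
# Illustrative only: models the arithmetic, not ReRAM device behavior.
# All names here (outer_product, serial_update, ...) are hypothetical.

def outer_product(delta, x):
    """Weight gradient of a fully connected layer: dW[i][j] = delta[i] * x[j]."""
    return [[d * xi for xi in x] for d in delta]

def serial_update(weights, delta, x, lr):
    """Update one cell at a time, counting one read and one write per weight."""
    accesses = 0
    for i in range(len(weights)):
        for j in range(len(weights[i])):
            w = weights[i][j]                            # serial read
            weights[i][j] = w - lr * (delta[i] * x[j])   # serial write
            accesses += 2
    return accesses

def outer_product_update(weights, delta, x, lr):
    """Apply the whole rank-1 update as one array-wide step."""
    grad = outer_product(delta, x)
    return [[w - lr * g for w, g in zip(w_row, g_row)]
            for w_row, g_row in zip(weights, grad)]

x = [1.0, 2.0, 3.0]        # layer input (activations)
delta = [0.5, -1.0]        # error signal at the layer output

w_serial = [[0.0] * 3 for _ in range(2)]
accesses = serial_update(w_serial, delta, x, lr=0.1)

w_outer = outer_product_update([[0.0] * 3 for _ in range(2)], delta, x, lr=0.1)
```

Both paths produce the same weights. The serial path touches every cell individually, so its access count grows with the number of weights, whereas the outer-product form expresses the same update as one operation over the two vectors.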

To address these limitations, we propose a technique for enhancing the precision of ReRAM-based outer products. We incorporate this technique into a crossbar architecture with three variants catered to different training algorithms. We build an ISA-programmable training accelerator with compiler support to evaluate our design on different layer types; however, our design can also be integrated into existing accelerators in the literature to enhance their efficiency. Our evaluation shows that our design achieves up to 8.02X, 54.21X, and 2,358X energy reductions as well as 7.16X, 4.02X, and 119X execution time reductions compared to digital accelerators, ReRAM-based accelerators (using PipeLayer's approach), and GPUs, respectively.

Bio:

Aayush Ankit received the B.Tech. degree from the Indian Institute of Technology (BHU), Varanasi in 2015. He is currently pursuing a PhD degree in Electrical and Computer Engineering at Purdue University and has been a research assistant to Prof. Kaushik Roy since fall 2015. His research interests lie in hardware-software codesign for efficient machine learning and in GPU architecture-compiler designs. During his PhD, he has completed internships as an ML Architect at HPE Labs, Palo Alto, CA in 2017; a CPU Designer at Intel Corporation, Hillsboro, OR in 2017; and a GPU Architect at Samsung ACL, San Jose, CA in 2019.