Task 004: A 16-bit Fixed-Point Convolutional Neural Network Learning Processor in 65nm CMOS

Event Date:	August 29, 2019
Time:	2:00pm ET / 11:00am PT
School or Program:	Electrical and Computer Engineering

Shihui Yin, Arizona State University

Abstract:

With the advent of artificial intelligence, various deep neural networks (DNNs) such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs) have emerged and achieved human-level performance in image and speech recognition tasks. DNNs are typically trained on GPUs in 32-bit floating-point (FP32) precision after all training data has been sent to a central server. To train DNNs on a user’s sensitive data, local training capability on mobile or edge devices is desirable because of privacy concerns. DNN inference accelerators have steadily improved in performance and energy efficiency, but inference-only accelerators cannot adapt to evolving user-centric application scenarios. Since DNN training is considerably more compute- and memory-intensive than inference, it is challenging to perform on resource-limited mobile devices. In particular, the back-propagation (BP) based training algorithm requires weight transposition for convolution and fully-connected (FC) layers. Maintaining two copies of the weights (original and transposed) incurs large area overhead, while employing transposable SRAM requires custom SRAM bitcell design. In this talk, we present a convolutional neural network (CNN) learning processor that accelerates training based on stochastic gradient descent (SGD) with momentum in 16-bit fixed-point precision. Using a novel cyclic weight storage and access scheme, we use the same off-the-shelf SRAMs for non-transpose and transpose operations during the feed-forward (FF) and feed-backward (FB) phases, respectively, of the CNN learning process. The 65nm CNN learning processor achieves a peak energy efficiency of 2.6 TOPS/W for 16-bit fixed-point operations, consuming 10.45 mW at 0.55V.
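
For readers unfamiliar with the operations named in the abstract, the following Python sketch may help. It shows why a fully-connected layer reads the same weights in non-transposed order during feed-forward and in transposed order during feed-backward, and what one SGD-with-momentum update looks like in 16-bit fixed-point arithmetic. It is an illustration under stated assumptions, not the processor's design: the Q8.8 number format, the saturation behavior, all function names, and the example values are hypothetical, and the cyclic weight storage scheme itself is not modeled.

```python
# Illustrative sketch only. The Q8.8 format, helper names, and values are
# assumptions for this example, not details of the processor in the talk.

FRAC_BITS = 8                       # assumed split: 8 integer bits, 8 fractional bits
SCALE = 1 << FRAC_BITS
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1


def saturate(v):
    """Clamp an integer to the signed 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, v))


def to_fixed(x):
    """Quantize a real number to the assumed 16-bit fixed-point format."""
    return saturate(int(round(x * SCALE)))


def to_float(q):
    """Convert a fixed-point integer back to a real number."""
    return q / SCALE


def fx_mul(a, b):
    """Multiply two fixed-point values, rescale, and saturate."""
    return saturate((a * b) >> FRAC_BITS)


def matvec(w, x, transpose=False):
    """Compute W x (feed-forward) or W^T x (feed-backward) from one weight copy."""
    rows, cols = len(w), len(w[0])
    if transpose:
        # Same stored weights, traversed column by column.
        return [saturate(sum(fx_mul(w[r][c], x[r]) for r in range(rows)))
                for c in range(cols)]
    return [saturate(sum(fx_mul(w[r][c], x[c]) for c in range(cols)))
            for r in range(rows)]


def sgd_momentum_step(w, v, grad, lr, mu):
    """One update: v <- mu*v - lr*grad, then w <- w + v, all in fixed point."""
    v_new = saturate(fx_mul(mu, v) - fx_mul(lr, grad))
    return saturate(w + v_new), v_new


if __name__ == "__main__":
    W = [[to_fixed(1.0), to_fixed(2.0)], [to_fixed(3.0), to_fixed(4.0)]]
    vec = [to_fixed(1.0), to_fixed(0.5)]
    print([to_float(q) for q in matvec(W, vec)])                  # feed-forward
    print([to_float(q) for q in matvec(W, vec, transpose=True)])  # feed-backward
```

In software the transposed access is just a different loop order. In hardware, where weights sit in fixed SRAM rows, that column-wise access is the costly part the abstract refers to.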

Bio:

Shihui Yin received the B.S. degree in microelectronics from Peking University, Beijing, China, in 2013, and the M.S. degree in electrical engineering from Carnegie Mellon University, Pittsburgh, PA, USA, in 2015. He is currently working towards the Ph.D. degree in the School of Electrical, Computer and Energy Engineering at Arizona State University, Tempe, AZ, USA. His research interests include low-power biomedical circuit and system design, and energy-efficient hardware design for machine learning and neuromorphic computing. Mr. Yin was a recipient of the University Graduate Fellowship from Arizona State University in 2015 and the IEEE Phoenix Section Student Scholarship for the year 2016.