Single-Net Continual Learning with Progressive Segmented Training (PST)

Event Date: June 13, 2019
Time: 2:00pm ET / 11:00am PT
Priority: No
College Calendar: Show
Task 2777.005/006: Single-Net Continual Learning with Progressive Segmented Training (PST)


There is an increasing need for continual learning in dynamic systems such as self-driving vehicles, surveillance drones, and robotic systems. Such a system must learn from a data stream, train the model to preserve previous information while adapting to a new task, and generate a single-headed output vector for future inference. Meanwhile, these dynamic systems usually operate under tight computation and memory budgets. To satisfy these requirements, we propose Single-Net Continual Learning with Progressive Segmented Training (PST). Unlike previous approaches that rely on dynamic network structures, this work focuses on a single network and uses model segmentation to prevent catastrophic forgetting. Leveraging the redundant capacity of a single network, the model parameters for each task are separated into two groups: an important group that is frozen to preserve current knowledge, and a secondary group that is saved (not pruned) for future learning. A fixed-size memory containing a small amount of previously seen data is further adopted to assist training. Without additional regularization, the simple yet effective PST approach successfully incorporates multiple tasks and achieves state-of-the-art accuracy in single-head evaluation on the CIFAR-10 and CIFAR-100 datasets. Moreover, segmented training significantly improves computation efficiency in continual learning.
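The core idea of segmented training can be illustrated with a minimal sketch: split a weight tensor into a frozen "important" group and a trainable secondary group, then mask gradient updates so only the secondary group changes on the next task. This is an illustrative assumption, not the paper's exact procedure; here importance is approximated by weight magnitude, and the functions (`segment_parameters`, `apply_segmented_update`) and the `keep_ratio` parameter are hypothetical names for this example.

```python
import numpy as np

def segment_parameters(weights, keep_ratio=0.25):
    """Split weights into an 'important' frozen group and a secondary group.
    Importance is approximated here by weight magnitude (an assumption;
    PST's actual importance criterion may differ)."""
    flat = np.abs(weights).ravel()
    k = max(1, int(keep_ratio * flat.size))
    threshold = np.partition(flat, -k)[-k]
    # True = frozen (important for the current task)
    return np.abs(weights) >= threshold

def apply_segmented_update(weights, grads, frozen_mask, lr=0.1):
    """Gradient step that updates only the secondary (unfrozen) parameters,
    preserving the frozen group that encodes previously learned tasks."""
    update = np.where(frozen_mask, 0.0, lr * grads)
    return weights - update

# Toy demonstration on a 4-parameter "layer"
w = np.array([0.9, -0.05, 0.8, 0.01])
g = np.ones_like(w)                               # pretend gradients from a new task
mask = segment_parameters(w, keep_ratio=0.5)      # freezes the two largest-|w| entries
w_new = apply_segmented_update(w, g, mask, lr=0.1)
# The frozen entries (0.9 and 0.8) are unchanged; only the secondary ones move.
```

In a full training loop, the replayed samples from the fixed-size memory would be mixed into each new task's mini-batches before computing `grads`, so the secondary group adapts without drifting away from the preserved knowledge.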


Xiaocong Du received her B.S. degree in control engineering from Shandong University, Jinan, China, in 2014, and her M.S. degree in electrical and computer engineering from the University of Pittsburgh, Pittsburgh, PA, USA, in 2016. She is currently pursuing her Ph.D. degree in electrical engineering at Arizona State University, Tempe, AZ, USA. Her research interests include efficient algorithm and hardware co-design for deep learning, neural architecture search, continual learning, and bio-inspired neural computing.