Task 2777.001 Neuro-inspired Algorithms for Efficient and Lifelong Learning
|Event Date:||December 6, 2018|
|Time:||2:00 pm EST/11 am PST
Title: Overcoming catastrophic forgetting through weight consolidation and long-term memory
Abstract: Sequential learning of multiple tasks in artificial neural networks using gradient descent leads to catastrophic forgetting, whereby previously learned knowledge is erased during learning of new, disjoint knowledge. Here, we propose a new approach to sequential learning which leverages the recent discovery of adversarial examples. We use adversarial subspaces from previous tasks to enable learning of new tasks with less interference. We apply our method to sequentially learning to classify digits 0, 1, 2 (task 1), 4, 5, 6, (task 2), and 7, 8, 9 (task 3) in MNIST (disjoint MNIST task). We compare and combine our Adversarial Direction (AD) method with the recently proposed Elastic Weight Consolidation (EWC) method for sequential learning. We train each task for 20 epochs, which yields good initial performance (99.24% correct task 1 performance). After training task 2, and then task 3, both plain gradient descent (PGD) and EWC largely forget task 1 (task 1 accuracy 32.95% for PGD and 41.02% for EWC), while our combined approach (AD+EWC) still achieves 94.53% correct on task 1. We obtain similar results with a much more difficult disjoint CIFAR10 task (70.10% initial task 1 performance, 67.73% after learning tasks 2 and 3 for AD+EWC, while PGD and EWC both fall to chance level). We confirm qualitatively similar results for EMNIST with 5 tasks and under 3 variants of our approach. Our results suggest that AD+EWC can provide better sequential learning performance than either PGD or EWC.
Bio: Shixian Wen is a second year Ph.D student from Dr Itti's lab at University of Southern California. He is interested in designing algorithms from neural inspiration and wants to understand how our brains work.