Opening the Black Box: Towards Theoretical Understanding of Deep Learning

Event Date: March 4, 2021
Time: 11:00 am
Location: via Zoom
Priority: No
School or Program: Electrical and Computer Engineering
College Calendar: Show
Wei Hu
PhD Candidate
Princeton University

Join us online!

https://purdue-edu.zoom.us/j/93549906633 

Abstract
Despite the phenomenal empirical successes of deep learning in many application domains, its underlying mathematical mechanisms remain poorly understood. Mysteriously, deep neural networks in practice can often fit training data perfectly and generalize remarkably well to unseen test data, despite highly non-convex optimization landscapes and significant over-parameterization. Moreover, deep neural networks show an extraordinary ability to perform representation learning: feature representations extracted from a trained neural network can be useful for other related tasks.
 
In this talk, I will present our recent progress on building the theoretical foundations of deep learning, by opening the black box of the interactions among data, model architecture, and training algorithm. First, I will show that gradient descent on deep linear neural networks induces an implicit regularization effect towards low rank, which explains the surprising generalization behavior of deep linear networks for the low-rank matrix completion problem. Next, turning to nonlinear deep neural networks, I will talk about a line of studies on wide neural networks, where, by drawing a connection to neural tangent kernels, we can answer various questions such as how the training loss is minimized, why the trained network can generalize, and why certain components of the network architecture are useful; we also use these theoretical insights to design a simple and effective new method for training on noisily labeled datasets. Finally, I will analyze the statistical aspects of representation learning, and identify key data conditions that enable efficient use of training data, bypassing a known hurdle in the i.i.d. tasks setting.
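The implicit low-rank bias mentioned above can be seen in a toy experiment. The sketch below (my own illustration, not code from the talk) runs gradient descent on a two-layer linear network P = A @ B, fitted only to a subset of the entries of a rank-1 matrix; the matrix sizes, initialization scale, and step size are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8

# Ground-truth rank-1 matrix, normalized to unit spectral norm;
# observe roughly 60% of its entries.
M = np.outer(rng.standard_normal(n), rng.standard_normal(n))
M /= np.linalg.norm(M, ord=2)
mask = rng.random((n, n)) < 0.6

# Small random initialization -- the source of the low-rank bias.
A = 0.05 * rng.standard_normal((n, n))
B = 0.05 * rng.standard_normal((n, n))

lr = 0.05
for _ in range(4000):
    # Gradient of 0.5 * ||mask * (A @ B - M)||_F^2 w.r.t. the product,
    # then chain rule through each factor (simultaneous update).
    G = mask * (A @ B - M)
    A, B = A - lr * G @ B.T, B - lr * A.T @ G

P = A @ B
s = np.linalg.svd(P, compute_uv=False)
print("singular values:", np.round(s[:3], 3))
# With small initialization, the spectrum is typically dominated by a
# single singular value, i.e. the recovered matrix is close to rank 1,
# even though nothing in the loss penalizes rank explicitly.
```

Although only the observed entries enter the loss, gradient descent from small initialization tends toward a near-rank-1 solution that also fills in the unobserved entries, which is the phenomenon the first part of the talk makes rigorous for deep linear networks.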
 
Bio
Wei Hu is a PhD candidate in the Department of Computer Science at Princeton University, advised by Sanjeev Arora. Previously, he obtained his B.E. in Computer Science from Tsinghua University. He has also spent time as a research intern at research labs of Google and Microsoft. His current research interest is broadly in the theoretical foundations of modern machine learning. In particular, his main focus is on obtaining solid theoretical understanding of deep learning, as well as using theoretical insights to design practical and principled machine learning methods. He is a recipient of the Siebel Scholarship Class of 2021.
 
Host
Prof. David Inouye, dinouye@purdue.edu, 765-496-0238
