ECE 59500 - Reinforcement Learning: Theory and Algorithms

Note:

Students are allowed to count this course OR ECE 49595 Introduction to Reinforcement Learning toward ECE credits. If both are taken, one will count as a Complementary Elective.

Course Details

Lecture Hours: 3
Credits: 3

Counts as:

  • EE Elective
  • CMPE Selective - Special Content

Normally Offered:

Each Fall

Campus/Online:

On-campus and online

Requisites:

MA 26500 or equivalent; ECE 30200 or equivalent; MA 26100 or equivalent

Requisites by Topic:

Undergraduate understanding of linear algebra, probability, calculus

Catalog Description:

This course introduces the foundations and the recent advances of reinforcement learning, an area of machine learning closely tied to optimal control that studies sequential decision-making under uncertainty. This course aims to create a deep understanding of the theoretical and algorithmic foundations of reinforcement learning while discussing the practical considerations and various extensions of reinforcement learning.

Required Text(s):

None.

Recommended Text(s):

  1. Bandit Algorithms, Lattimore, Tor; Szepesvari, Csaba, Cambridge University Press, 2020
  2. Dynamic Programming and Optimal Control, Bertsekas, Dimitri P., Athena Scientific, 2011
  3. Foundations of Deep Reinforcement Learning, Graesser, Laura; Keng, Wah Loon, Addison-Wesley Professional, 2019
  4. Markov Decision Processes: Discrete Stochastic Dynamic Programming, Puterman, Martin L., John Wiley & Sons, 2014
  5. Neuro-dynamic Programming, Bertsekas, Dimitri P.; Tsitsiklis, John N., Athena Scientific, 1996
  6. Reinforcement Learning: An Introduction, Sutton, Richard S.; Barto, Andrew G., MIT Press, 2018
  7. Reinforcement Learning: Theory and Algorithms, Agarwal, Alekh; Jiang, Nan; Kakade, Sham M.; Sun, Wen, 2019

Lecture Outline:

Lecture Topics
1 Introduction, motivation, overview of relevant background
2 Dynamic programming and policy evaluation
3 Policy iteration and value iteration
4 Monte Carlo and temporal difference methods
5 Computational complexity and statistical limits
6 Linear quadratic regulators (LQR) and optimal control
7 Optimal control for nonlinear systems (Iterative LQR)
8 Prediction, estimation, and Kalman filtering
9 Model-based and model-free reinforcement learning
10 Approximate policy iteration and deep Q-learning
11 Conservative policy iteration and trust region methods
12 Stochastic gradient descent and policy gradient
13 Exploration in reinforcement learning and multi-armed bandits
14 Partially observable Markov decision processes and risk-averse reinforcement learning
15 Inverse reinforcement learning, meta-learning, transfer learning, and multi-agent reinforcement learning

Assessment Method:

Homework, projects, exams (3/2023)