ECE 59500 - Reinforcement Learning: Theory and Algorithms
Note:
Students are allowed to count this course OR ECE 49595 Introduction to Reinforcement Learning toward ECE credits. If both are taken, one will count as a Complementary Elective.
Course Details
Lecture Hours: 3
Credits: 3
Counts as:
- EE Elective
- CMPE Selective - Special Content
Normally Offered:
Each Fall
Campus/Online:
On-campus and online
Requisites:
MA 26500 or equivalent; ECE 30200 or equivalent; MA 26100 or equivalent
Requisites by Topic:
Undergraduate understanding of linear algebra, probability, calculus
Catalog Description:
This course introduces the foundations and the recent advances of reinforcement learning, an area of machine learning closely tied to optimal control that studies sequential decision-making under uncertainty. This course aims to create a deep understanding of the theoretical and algorithmic foundations of reinforcement learning while discussing the practical considerations and various extensions of reinforcement learning.
Required Text(s):
None.
Recommended Text(s):
- Bandit Algorithms, Lattimore, Tor; Szepesvari, Csaba, Cambridge University Press, 2020
- Dynamic Programming and Optimal Control, Bertsekas, Dimitri P., Athena Scientific, 2011
- Foundations of Deep Reinforcement Learning, Graesser, Laura; Keng, Wah Loon, Addison-Wesley Professional, 2019
- Markov Decision Processes: Discrete Stochastic Dynamic Programming, Puterman, Martin L., John Wiley & Sons, 2014
- Neuro-dynamic Programming, Bertsekas, Dimitri P.; Tsitsiklis, John N., Athena Scientific, 1996
- Reinforcement Learning: An Introduction, Sutton, Richard S.; Barto, Andrew G., MIT Press, 2018
- Reinforcement Learning: Theory and Algorithms, Agarwal, Alekh; Jiang, Nan; Kakade, Sham M.; Sun, Wen, 2019
Lecture Outline:
Lecture | Topics |
---|---|
1 | Introduction, motivation, overview of relevant background |
2 | Dynamic programming and policy evaluation |
3 | Policy iteration and value iteration |
4 | Monte Carlo and temporal difference methods |
5 | Computational complexity and statistical limits |
6 | Linear quadratic regulators (LQR) and optimal control |
7 | Optimal control for nonlinear systems (Iterative LQR) |
8 | Prediction, estimation, and Kalman filtering |
9 | Model-based and model-free reinforcement learning |
10 | Approximate policy iteration and deep Q-learning |
11 | Conservative policy iteration and trust region methods |
12 | Stochastic gradient descent and policy gradient |
13 | Exploration in reinforcement learning and multi-armed bandits |
14 | Partially observable Markov decision processes and risk-averse reinforcement learning |
15 | Inverse reinforcement learning, meta-learning, transfer learning, and multi-agent reinforcement learning |
Assessment Method:
Homework, projects, exams (3/2023)