Machine Learning I

An introductory course to machine learning, with a focus on supervised learning using linear models. The course has four parts: (1) mathematical background on linear algebra, probability, and optimization; (2) classification methods, including Bayesian decision theory, linear regression, logistic regression, and support vector machines; (3) robustness of classifiers and adversarial examples; (4) learning theory, covering the feasibility of learning, VC dimension, complexity analysis, and bias-variance analysis. Suitable for senior undergraduates and graduate students with a background in probability, linear algebra, and programming.


Credit Hours:


Learning Objective:

  • Apply basic linear algebra, probability, and optimization tools to solve machine learning problems
  • Understand the principles of supervised learning methodologies, and be able to comment on their advantages and limitations
  • Explain the trade-offs among model complexity, sample complexity, bias, variance, and generalization error in learning theory
  • Implement, debug, and execute basic machine learning algorithms on computers 


Machine Learning is a dual-level (500-level) course customized for science and engineering students who are seeking to develop a solid mathematical foundation in the subject. The course focuses on the fundamental principles of machine learning instead of a scattered set of algorithmic tools. Students completing the course will be able to formulate practical machine learning problems in mathematical frameworks, implement algorithms to execute statistical inference tasks, and analyze the performance and limitations of those algorithms. The course emphasizes the co-development of theory and programming. Students will gain hands-on experience implementing machine learning algorithms in Python.

Fall 2023 Syllabus

Topics Covered:

  1. Linear regression, which covers regression models, outliers, ridge regularization, LASSO regularization, convex optimization, gradient descent algorithms, and stochastic algorithms
  2. Classification, which covers separability, Bayesian classifiers, ROC curves, precision-recall curves, logistic regression, and kernel methods
  3. Learning theory, which covers probability inequalities, the probably approximately correct (PAC) framework, generalization bounds, model complexity, sample complexity, VC dimension, bias, variance, overfitting, and validation
  4. Advanced, state-of-the-art topics, for example deep neural networks, generative models, and adversarial robustness 
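As a taste of the hands-on component, the gradient descent method named in the first topic can be sketched in a few lines of plain Python. This is an illustrative sketch only, not course material: the data, learning rate, and iteration count below are made up for the example.

```python
# Minimal sketch: fit y = w*x + b by gradient descent on the mean squared error.
# All numbers here (data, learning rate, step count) are illustrative.

def fit_linear(xs, ys, lr=0.01, steps=5000):
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Noiseless data generated from y = 2x + 1; the fit recovers the parameters.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2 * x + 1 for x in xs]
w, b = fit_linear(xs, ys)
print(round(w, 2), round(b, 2))  # close to 2.0 and 1.0
```

The course develops the theory behind sketches like this one: when the descent converges, how the step size should be chosen, and how stochastic variants scale to large datasets.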


The following courses are highly recommended to students planning to take this course:

  • Linear Algebra (as in the material covered by G. Strang's linear algebra textbook); a good course at Purdue is MA 511 Linear Algebra
  • Optimization (as in Chapters 1-4 of S. Boyd's Convex Optimization)
  • Probability (as in the material covered by D. Bertsekas's Introduction to Probability textbook); ECE 600 is recommended but not required

Web Address:


Homework is assigned approximately biweekly. Late homework will not be accepted. You are encouraged to work in small groups, but you must write or type your own solutions. The lowest homework score will be dropped. 


The project is worth 50% of the grade. 


There will be 6 quizzes throughout the semester, taken on Gradescope. Each quiz will be 30 minutes long. The lowest quiz score will be dropped. 


Required Textbook:

  • Introduction to Probability for Data Science, by Stanley Chan, Michigan Publishing, 2021
  • Learning from Data, by Abu-Mostafa, Magdon-Ismail, and Lin, AMLBook, 2012

Recommended textbook:

  • Pattern Classification, by Duda, Hart, and Stork, Wiley-Interscience, 2nd edition, 2000
  • Elements of Statistical Learning, by Hastie, Tibshirani, and Friedman, Springer, 2nd edition, 2009
  • Pattern Recognition and Machine Learning, by Bishop, Springer, 2006 

ProEd Minimum Requirements: