Announcements
11/3/19 The solutions for midterm 1 are available here.
8/8/19 Website is live!
Course Description
This is an introductory programming course taught in Python that also introduces topics in data science. Topics covered include:
- Basics of git
- Regular expressions and text processing
- Python basics
- Python data structures and libraries
- Basic object-oriented programming
- Basic data visualization
- Sampling, estimation, hypothesis testing
- Regression analyses
- Classification and clustering
- Basic neural networks
We will have Python programming assignments roughly every week (10 in all), plus a mini-project at the end of the semester.
Prerequisites: CS 15900 (undergraduate level) with a minimum grade of C-.
Course Details
The syllabus for the course explains the logistical details of the course. The course also uses a Piazza discussion board for course questions.
Lecture Notes
- Week 1:
- 8/20 Intro. Please also see the notes on git and GitHub.
- 8/22 Bash. You can find the files that we used for the examples in class here.
- Week 2:
- 8/27 Python basics. There are slides and code. You can also download the Jupyter Notebook associated with the code if you want to experiment with it.
- 8/30 Data structures. Code and notebook. Also see the slides from 8/27.
- Week 3:
- 9/3 Histograms.
- 9/5 Probability and Distribution.
- Week 4:
- 9/10 Probability and Distribution (continued). Higher Order Functions.
- 9/12 Higher Order Functions continued. See the code and notebook associated with this material.
- Week 5:
- Week 6:
- 9/24 More Hypothesis Testing (note that slides have been updated to include material on one-sided tests)
- 9/26 Midterm Review
- Week 7:
- 10/1 Regular Expressions.
- 10/3 Regular Expressions (continued)
- Week 8:
- 10/8 Fall Break. No class.
- 10/10 Regression
- Week 9:
- 10/15 Regression (continued). We also did a brief overview of linear algebra, and discussed NumPy (Associated notebook)
- 10/17 Regression (continued). Note that the regression notes are now updated with all of the material covered across the three regression lectures. You may also find this notebook walking through a regression computation useful.
- Week 10:
- 10/22 Basic Natural Language Processing. You may also find this notebook walking through building a document-word matrix helpful.
- 10/24 No class (Midterm 1 makeup)
- Week 11:
- 10/29 Classes and Objects and Clustering
- 10/31 Classes and Objects, and Clustering, continued.
- Week 12:
- 11/5 Midterm 2 review
- 11/7 Clustering continued, and Inheritance
- Week 13:
- 11/12 Classification: Naive Bayes and k-Nearest Neighbor.
- 11/14 Iterators and Generators, with associated notebook.
- Week 14:
- 11/19 Classification: Logistic Regression.
- 11/21 Perceptrons, plus some supplemental notes discussing convergence.
- Week 15:
- 11/26 No class (Midterm 2 makeup)
- 11/28 No class (Thanksgiving)
- Week 16:
- 12/3 Neural nets and back propagation. Note that these notes include updates to the notes from 11/21.
Assignments
- No longer available