Data Science for Smart Cities

CE 56401

Credit Hours:

3

Instructor:

Professor Satish V. Ukkusuri

Learning Objectives:

  • Classify and appreciate the complexity of data generated by smart cities.
  • Understand the foundations of data science methods.
  • Apply the basics of various data mining techniques.
  • Match the appropriate data mining tool to each smart city application.
  • Implement and apply data mining algorithms using Python.
  • Interpret the results from the data mining tools and connect them to policy making for smart city applications.

Description:

The availability of low-cost, ubiquitous sensors in city infrastructure provides highly granular data at unprecedented spatiotemporal scales. “Smart Cities” aim to use this data to provide a healthy, happy, and sustainable urban ecosystem by integrating information and communication technology (ICT), the Internet of Things (IoT), and citizen participation to effectively manage and use the city's infrastructure and services. “Data Science” is an interdisciplinary field that uses scientific methods, processes, algorithms, and systems to extract knowledge from data in various forms; it provides a fast and efficient understanding of the current dynamics of cities and of ways to improve different services. This course will introduce scientific techniques for analysis, inference, and prediction using large-scale data (e.g., GPS vehicular data, social media data, mobile phone data, and individual social network data) that are present in city networks. The basics of the data science methods used to analyze these datasets will be presented. The course will focus on both the methods and their application to smart-city problems. Python will be used to demonstrate the application of each method on datasets available to the instructor. Examples of problems that will be discussed include ridesharing platforms, smart and energy-efficient buildings, evacuation modeling, decision making during extreme events, and urban resilience.

Topics Covered:

Introduction to data mining, data mining tasks, advanced data mining techniques, and various applications in smart cities

Prerequisites:

Undergraduate calculus and basic knowledge of statistical analysis

Applied / Theory:

Statistical methods, Optimization, Data pre-processing, Association rule mining, Regression methods, Classification methods, Clustering algorithms, Anomaly detection techniques, Testing methods, Neural networks, Deep learning, and Hidden Markov models (HMMs) 

Web Address:

https://purdue.brightspace.com/d2l/login

Web Content:

Syllabus, grades, lecture notes, homework assignments, solutions, quizzes, exam, and project

Homework:

Problem sets will be given, and the analysis of these assignments will be the basis for some class discussion. Problem sets are due at the beginning of class on designated days. You may (and are encouraged to) discuss the problem sets with other students, but the final written solution should be your own work. The exam will be open class notes.

Projects:

Students are expected to work in groups of two on a problem related to a data science application or an algorithm implementation.

Exams:

There will be one in-class (virtual) examination in which you will be tested on readings and materials/discussions covered in class. 

Textbooks:

Required readings will be posted in a Box folder that will be shared with students.

Official textbook information is now listed in the Schedule of Classes. NOTE: Textbook information is subject to change at any time at the discretion of the faculty member. If you have questions or concerns, please contact the academic department.

Tentative Textbook Listing:

N/A

Computer Requirements:

Python (an interpreted, high-level, general-purpose programming language) and its libraries for data mining (NumPy, SciPy, Matplotlib, etc.). Python is available for free download at https://www.python.org/. Details will be provided in class.
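
As an optional convenience, the short sketch below checks whether the libraries named above can be imported in a given Python installation. It is not part of the official course materials; the function name check_libraries is hypothetical, and the list of packages is an assumption taken from the examples in this listing, so the set actually used in class may differ.

```python
import importlib.util

# Packages named in this listing; the exact set used in class is an assumption.
REQUIRED = ["numpy", "scipy", "matplotlib"]


def check_libraries(names):
    """Return a dict mapping each package name to True if it can be imported."""
    # find_spec returns None when a top-level package is not installed.
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_libraries(REQUIRED).items():
        print(f"{name}: {'installed' if found else 'missing'}")
```

Any package reported as missing can typically be installed with pip, for example: python -m pip install numpy scipy matplotlib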