Preparing IEs for data-driven decision making

Undergrad researcher Lingjun Chen (Aggarwal group) collects training and test data from face videos for predicting vital signs.
(Photo/Vaneet Aggarwal)
As part of Purdue IE's interdisciplinary curriculum, the School currently offers eight Data Science courses to prepare our students for 21st-century engineering.

"Data science is an interdisciplinary approach to the collection and analysis of data collected from various types of systems in order to improve decision-making," says Dr. Patrick Brunese, IE Director of Academic Programs. "Improvements in the collection and storage of data via small-scale sensors and wireless technology have increased the ability of enterprises to learn from their systems to create value and remove waste."

Next spring, "Statistical Learning" (IE 490) will be offered for the first time. It will be taught by Dr. Vaneet Aggarwal, who describes it as an introductory-level course in supervised learning with a focus on regression and classification methods. The syllabus includes linear and polynomial regression, logistic regression, and linear discriminant analysis; cross-validation and the bootstrap; model selection and regularization methods (ridge and lasso); nonlinear models, splines, and generalized additive models; tree-based methods, random forests, and boosting; and support vector machines. He will also cover some unsupervised learning methods, such as principal components and clustering (k-means and hierarchical).

Last year Aggarwal taught "Mathematics for Data Science" (IE 690). "This course covers the mathematical aspects of data science, including aspects of clustering, compression, classification, and data completion, along with big-data algorithms like alternating minimization and ADMM," he says. "It gives technical knowledge of many recent concepts in data science which would be useful in further fundamental research in the area."

In addition, Aggarwal's Cloud Computing, Machine Learning and Networking Research (CLAN) Lab involves undergraduate students in research on machine learning topics. "One of the recent projects our group is working with four undergrads on is to use supervised machine learning to predict health metrics using face video," says Aggarwal. "In this project, students learn the state of the art in supervised learning, and try to improve the performance of the algorithms for vital sign prediction including heart rate and stress. One undergrad is working on reinforcement learning in 360-degree games."

Purdue IE's other Data Science courses include "Optimization for Big Data" (IE 590) and "Multi-Agent Optimization" (IE 690), both taught by Dr. Aldo Scutari; "Predictive Modeling" (IE 590; formerly "Advanced Data Analytics" and "Risk and Decision Analysis") and "Probability and Statistics for Engineers II" (IE 330), both taught by Dr. Roshanak Nateghi; "Information Engineering" (IE 590), taught by Dr. Nagabhushana Prabhu; and "Stochastic Network Analysis" (IE 690), taught by Dr. Christopher Quinn. In addition, "Sensing Approaches for Human Factors Research" (IE 690), taught by Dr. Denny Yu, will be offered for the first time in the spring 2018 semester.

"IEs have always been at the forefront of data-driven decision making, so it is only natural that we incorporate advanced data analysis methods into the curriculum," states Brunese. "By incorporating these courses, we can ensure our students have the tools to collect, analyze, and direct decision making across the entire spectrum of data scales."