ECE 49595 - Data Mining Basic Concepts and Techniques
Note:
This course will run as ECE 40875 beginning with the Spring 2023 term.
Course Details
Lecture Hours: 3 Credits: 3
Counts as:
- EE Elective
- CMPE Selective
Experimental Course Offered:
Spring 2022
Requisites:
ECE 20875, ECE 30200 (May be taken concurrently)
Requisites by Topic:
Python, Linear Algebra, Probability
Catalog Description:
This course introduces fundamental techniques in data mining, i.e., techniques that extract useful knowledge from large amounts of data. Topics include data processing, exploratory data analysis, association rule mining, clustering, classification, anomaly detection, recommendation, and graph analysis. Applications of these techniques to real-world decision making in domains such as science, business, biology, health care, and transportation will be discussed.
Required Text(s):
- Introduction to Data Mining, 2nd Edition, Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, and Vipin Kumar, Pearson, 2019, ISBN No. 9780133128901
Recommended Text(s):
- Data Mining: Concepts and Techniques, 3rd Edition, Jiawei Han, Micheline Kamber, and Jian Pei, Morgan Kaufmann Publishers, 2011, ISBN No. 9780123814791
Learning Outcomes:
- an ability to describe and explain the process of data mining. [1,7]
- an ability to formulate problems in real-world applications into data mining tasks and solve the problems using data mining techniques. [1,2,4]
- an ability to implement software programs that conduct data mining and evaluate the output of data mining programs. [6]
- an ability to work in a team and present data mining solutions to people in scientific or other disciplines. [3,5]
Lecture Outline:
Weeks | Topic(s) |
---|---|
1.5 | Data: Types of data, data quality, data preprocessing, measures of similarity and dissimilarity, data exploration and visualization. |
2.5 | Association analysis: Frequent itemset generation, rule generation, compact representation of frequent itemsets, rule evaluation and association analysis on relational data. |
1 | Putting association analysis to work in applications: market basket analysis, profile analysis, web log analysis and bioinformatics |
1.5 | Clustering: K-means, hierarchical clustering, spectral clustering and density-based clustering |
1 | Putting clustering to work in applications: customer segmentation, document clustering and community detection |
2 | Classification: Decision tree, rule-based classifier, nearest-neighbor classifier, support vector machines, and ensemble methods |
1 | Putting classification to work in applications: churn prediction, character recognition, document classification and user categorization |
1 | Anomaly detection: Statistical, distance-based, density-based and clustering-based approaches |
0.5 | Putting anomaly detection to work in applications: fraud detection, intrusion detection and earth science applications |
1 | Recommendation: Collaborative filtering, matrix factorization and applications |
1 | Graph analysis: Node ranking, link prediction and social networks |
Engineering Design Content:
- Synthesis
- Analysis
- Evaluation
Engineering Design Consideration(s):
- Economic
- Environmental
- Health/Safety
- Social
- Societal
- Cultural
Assessment Method:
Exam(s), quiz(zes), project(s), presentation(s)