ECE 49595 - Data Mining Basic Concepts and Techniques

Note:

This course will run as ECE 40875 beginning with the Spring 2023 term.

Course Details

Lecture Hours: 3 Credits: 3

Counts as:

  • EE Elective
  • CMPE Selective

Experimental Course Offered:

Spring 2022

Requisites:

ECE 20875, ECE 30200 (May be taken concurrently)

Requisites by Topic:

Python, Linear Algebra, Probability

Catalog Description:

This course introduces fundamental techniques in data mining, i.e., techniques that extract useful knowledge from large amounts of data. Topics include data processing, exploratory data analysis, association rule mining, clustering, classification, anomaly detection, recommendation, and graph analysis. Applications of these techniques to real-world decision making in domains such as science, business, biology, health care, and transportation will be discussed.

Required Text(s):

  1. Introduction to Data Mining, 2nd Edition, Pang-Ning Tan, Michael Steinbach, Anuj Karpatne, and Vipin Kumar, Pearson, 2019, ISBN No. 9780133128901

Recommended Text(s):

  1. Data Mining: Concepts and Techniques, 3rd Edition, Jiawei Han, Micheline Kamber, and Jian Pei, Morgan Kaufmann Publishers, 2011, ISBN No. 9780123814791

Learning Outcomes:

A student who successfully fulfills the course requirements will have demonstrated:
  1. an ability to describe and explain the process of data mining. [1,7]
  2. an ability to formulate problems in real-world applications into data mining tasks and solve the problems using data mining techniques. [1,2,4]
  3. an ability to implement software programs that conduct data mining and evaluate the output of data mining programs. [6]
  4. an ability to work in a team and present data mining solutions to people in scientific or other disciplines. [3,5]

Lecture Outline:

Weeks Topic(s)
1.5 Data: Types of data, data quality, data preprocessing, measures of similarity and dissimilarity, data exploration and visualization.
2.5 Association analysis: Frequent itemset generation, rule generation, compact representation of frequent itemsets, rule evaluation and association analysis on relational data.
1 Putting association analysis to work in applications: market basket analysis, profile analysis, web log analysis and bioinformatics
1.5 Clustering: K-means, hierarchical clustering, spectral clustering and density-based clustering
1 Putting clustering to work in applications: customer segmentation, document clustering and community detection
2 Classification: Decision tree, rule-based classifier, nearest-neighbor classifier, support vector machines, and ensemble methods
1 Putting classification to work in applications: churn prediction, character recognition, document classification and user categorization
1 Anomaly detection: Statistical, distance-based, density-based and clustering-based approaches
0.5 Putting anomaly detection to work in applications: fraud detection, intrusion detection and earth science applications
1 Recommendation: Collaborative filtering, matrix factorization and applications
1 Graph analysis: Node ranking, link prediction and social networks

Engineering Design Content:

  • Synthesis
  • Analysis
  • Evaluation

Engineering Design Consideration(s):

  • Economic
  • Environmental
  • Health/Safety
  • Social
  • Societal
  • Cultural

Assessment Method:

Exam(s), quiz(zes), project(s), presentation(s)