Data Mining


Credit Hours:


Learning Objective:

Students who successfully complete the course will be able to: 1. Identify key elements of data mining systems and the knowledge discovery process; 2. Understand how algorithmic elements interact; 3. Recognize various types of data mining tasks; 4. Implement and apply basic algorithms and standard models; 5. Understand how to evaluate performance.


Data Mining has emerged at the confluence of artificial intelligence, statistics, and databases as a technique for automatically discovering summary knowledge in large datasets. This course introduces students to the process and main techniques in data mining, including classification, clustering, and pattern mining approaches. Data mining systems and applications will also be covered, along with selected topics in current research.

Topics Covered:

1. Introduction (1 week); 2. Background and basics (1 week); 3. Predictive Modeling (3 weeks); 4. Understanding and Extending Model Performance (1 week); 5. Descriptive Modeling (3 weeks); 6. Pattern Mining (3 weeks).


A bachelor's degree in computer science or an equivalent field. Students not in the Computer Science master's program should seek department permission to register.
An introductory statistics course and a course that covers basic programming skills, or permission of instructor.

Applied / Theory:

50 / 50

Web Address:

Web Content:

Brightspace will be used for grades.


Six assignments total, submitted online via Blackboard or Turnin.




One midterm and a final exam


Official textbook information is now listed in the Schedule of Classes. NOTE: Textbook information is subject to change at any time at the discretion of the faculty member. If you have questions or concerns, please contact the academic department.
Optional: Principles of Data Mining (Adaptive Computation and Machine Learning), 1st Edition, David Hand, ISBN: 9780262082907

Computer Requirements:

Programming assignments will use Python; statistical assignments will use R (open-source statistical software).

ProEd Minimum Requirements:


Tuition & Fees: