Monday: 4:15PM to 5:15PM
Wednesday: 11:00AM to 12:00PM
Tue & Thur. 12:00pm - 2:00pm
|Class Time and Location||MoWe 2:45PM - 4:05PM, LC 22|
A course on data mining (finding patterns in data) algorithms and their application to interesting data types and situations. We cover algorithms that address the five core data mining tasks: prediction, classification, regression, clustering, and associations. Course projects will involve advanced topics such as algorithm development for handling large data sets and sequential, spatial, and streaming data. Prerequisite(s): A Csi 310.
|Data Mining: The Textbook
Charu C. Aggarwal
|Introduction to Data Mining
Pang-Ning Tan, Michael Steinbach, Vipin Kumar
The schedule indicates the concepts and material to be covered in each week under the column labeled "Topics". Each topic marked with "*" will be presented by a six-member team.
|Week||Date||Lecture Topics||Presentation||Read||Due (To be announced)|
|1/27||Data Collection (Twitter, Craigslist, and Foursquare)||Ch2,Ch3|
|3||2/1||Data Collection (Twitter, Craigslist, and Foursquare)||Ch2,Ch3|
|2/3||Data Collection (Twitter, Craigslist, and Foursquare)||Ch2,Ch3|
|4||2/8||Data Collection (Twitter, Craigslist, and Foursquare)|| ||Ch2,Ch3|
|6||2/22||Classification - Introduction - Decision Tree||Ch4|
|2/24||Classification - Support Vector Machines||Ch4|
|7||2/29||Classification - Support Vector Machines - Continued||Ch4, Ch5|
|3/2||Classification - Support Vector Machines - Continued||Ch4, Ch5|
|8||3/7||Classification - Support Vector Machines - Continued||Ch4, Ch5|
|3/9||Classification - Support Vector Machines - Continued||Ch4, Ch5|
|10||3/21||Clustering - K-means||Ch8, Ch9|
|3/23||Clustering - Cluster Analysis||Ch8, Ch9|
|11||3/28||Midterm Exam (Concepts, Closed Book)|
|3/30||Clustering - Cluster Analysis - Continued; Course Project Discussion|
|4/4||Clustering - Cluster Analysis - Continued; Course Project Discussion|
|12||4/6||Association Rule Mining: Support, Confidence||Ch4, Ch5|
|4/11||Association Rule Mining - Continued||Ch4, Ch5|
|13||4/13||Recommendation System||Reference materials on blackboard|
|4/18||Recommendation System - Continued||Reference materials on blackboard|
|14||4/20||Sequential Data: Markov Model||Reference materials on blackboard|
|4/25||Sequential Data: Markov Model||Reference materials on blackboard|
|15||4/27||Graph Data: Probabilistic Soft Logic||Reference materials on blackboard|
|5/2||Graph Data: Probabilistic Soft Logic||Reference materials on blackboard|
|16||5/4||Graph Data: Probabilistic Soft Logic||Reference materials on blackboard|
|17||5/14||Project Presentation (3:30PM - 5:30PM at LC 22)|
Course Project Requirement
Course Project teams:
to be announced
References for Lecture Topics:
1. Decision Tree
 Decision Tree Lecture Slides: http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap4_basic_classification.ppt (http://www-users.cs.umn.edu/~kumar/dmbook/dmslides/chap4_basic_classification.pdf)
 Decision Tree 7-minute tutorial video: https://www.youtube.com/watch?v=a5yWr1hr6QY
2. Logistic Regression
 Machine Learning with Python - Logistic Regression: http://aimotion.blogspot.com/2011/11/machine-learning-with-python-logistic.html
 A Tutorial in Logistic Regression: http://www.statpt.com/logistic/demaris_1995.pdf
Examinations and Assignments:
There are around 12 homework assignments. Homework assignments are due at the start of class. If you have an excused absence from a class, turn in the homework assignment prior to the class session. All assignments must include your name, student ID, and course name/number.
Late Submission Policy:
Assignments must be submitted before class on the specified due date (Monday of the designated week). An assignment submitted less than 24 hours late loses 30% of its score. An assignment submitted 24 hours or more late loses 70% of its score. Assignments submitted more than 3 days late will not be assessed and will receive a zero (0). Weekend days count toward lateness. For assignments, you are encouraged to type your answers.
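As a rough illustration only, the late-penalty tiers above can be sketched in Python. This assumes one reading of the policy: the percentage is deducted from the score the assignment would otherwise earn, and "3 days" means 72 hours. The function name `late_adjusted_score` is hypothetical and not part of any course tooling.

```python
def late_adjusted_score(raw_score: float, hours_late: float) -> float:
    """Apply the late-penalty tiers to a raw assignment score.

    Assumed reading of the policy (not an official grading script):
      - on time:                  no penalty
      - less than 24 hours late:  30% deducted
      - 24 hours up to 3 days:    70% deducted
      - more than 3 days late:    score is zero
    """
    if hours_late <= 0:
        return raw_score
    if hours_late > 72:  # more than 3 days; weekend days are counted
        return 0.0
    penalty_percent = 30 if hours_late < 24 else 70
    return raw_score * (100 - penalty_percent) / 100


if __name__ == "__main__":
    for hours in (0, 5, 24, 72, 80):
        print(hours, late_adjusted_score(100, hours))
```

For borderline cases, the instructor's interpretation of the policy takes precedence over this sketch.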
Policy on Cheating:
Cheating in an exam will result in an E grade for the course. Further, the students involved will be referred to the Dean's office for disciplinary action.
Homework problems are meant to be individual exercises; you must do these by yourself. Any of the following actions will be considered as cheating.
Cheating in a homework exercise will result in the following penalty for all the students involved.
Students who cheat in two or more homework assignments will receive an E grade for the course. The names of such students will also be forwarded to the Dean's office for disciplinary action.
Class attendance is required and checked. Each unexcused absence will reduce your final numerical grade by 20%. If you miss a class, it is your responsibility to find out what material was covered. There will be absolutely no makeup classes. Absences are excused only in specific, unavoidable situations: 1) personal emergencies, including, but not limited to, illness of the student or of a dependent of the student, or death in the family [doctor's note required]; 2) religious observances that prevent the student from attending class; 3) participation in University-sponsored activities, approved by the appropriate University authority, such as intercollegiate athletic competitions, activities approved by academic units, including artistic performances, academic field trips, and special events connected with coursework; 4) government-required activities, such as military assignments, jury duty, or court appearances; and 5) any other absence that the professor approves.
Homework Assignments : 35% | Exam: 30% | Presentation: 5% | Final Project (3-member team): 25% | Class Discussion and Participation: 5%
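The weights above sum to 100%. As a minimal sketch, a weighted final score could be computed as follows, assuming each component is scored on a 0-100 scale. The names `WEIGHTS` and `final_grade` are hypothetical, and the attendance deduction is deliberately left out because the policy text does not say exactly how it is applied.

```python
# Component weights in percent, taken from the grading breakdown above.
WEIGHTS = {
    "homework": 35,
    "exam": 30,
    "presentation": 5,
    "project": 25,
    "participation": 5,
}


def final_grade(scores: dict) -> float:
    """Return the weighted final score, given 0-100 scores per component."""
    if sum(WEIGHTS.values()) != 100:
        raise ValueError("weights must sum to 100")
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise KeyError(f"missing component scores: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    example = {
        "homework": 90,
        "exam": 80,
        "presentation": 100,
        "project": 85,
        "participation": 100,
    }
    print(final_grade(example))  # 86.75
```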