CSI 661 - Data Mining

MWF 11:15am - 12:05pm @ HU112

Credits: 1-3

Office Hours: Monday and Wednesday, 4pm - 5pm

Optional Text: Data Mining: Introductory and Advanced Topics by Margaret H. Dunham

You can buy it through Amazon or Barnes and Noble.

Textbook, Topic List, and Reading Material List

Assignments

Assignment #1

Vote Data Set - Predict the "class" attribute. This data set contains the voting patterns of US representatives on various issues. The class variable is whether they are Democrat or Republican.

Pima Indians Data Set - Predict the "class" attribute. This data set contains information on Pima Indians and can be used to predict whether they have diabetes.

Assignment #2 Part A)

Assignment #2 Part B)

Assignment #3 Part A)

Assignment #3 Part B)

or

Assignment #4

Concentration Area

Software

WEKA: Supports clustering, classification, ensembles, association rules, and regression

(Requires Java. If you have Windows, be sure to download the version that automatically installs Java.)

Decision Tree Induction (At the Bottom of the Page): C4.5

Lectures

Lecture 1): Class Overview and Logistics

Lecture 2,3): Association Rules - Introduction

Lecture 4): Sequential Association Rules

Lecture 5): Association Rules and Biological Contact Maps. We will discuss this paper.

Lecture 6,7,8): Introduction to Classification

Lecture 9,10): More Classification

Lecture 11): Naive Bayes Classifier

Lecture 12-13): Introduction to Clustering

Lecture 14-15): More on Clustering

Lecture 16-18): Clustering Large Databases

Lecture 19): Clustering Non-Vector Data Using

Lecture 20-21): Ensembles of Classifiers/Clustering

Lecture 22): Classifying Streaming Data Using Ensembles