CS 558

Data Mining

Semester: Spring, 2001
Schedule: Tuesday 13:40 - 14:30 (T6); Thursday 12:40 - 14:30 (R6R7)
Office Hours: Wednesday 15:40 - 17:40 (W8W9)
Classroom: EA-502
Instructor: H. Altay Güvenir

Textbook:
Ian H. Witten and Eibe Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann (2000).

Recommended Journals:
Data Mining and Knowledge Discovery, Intelligent Data Analysis, Machine Learning, Artificial Intelligence, Knowledge-Based Systems, Applied Intelligence, Journal of Artificial Intelligence Research, Knowledge and Information Systems, IEEE Transactions on Pattern Analysis and Machine Intelligence.

Weekly Program

WEEK  DAYS        TOPICS
   1  Feb  6,  8  Introduction, Clustering
   2  Feb 13, 15  Concept learning, Classification, Categorization
   3  Feb 20,     Association Rules
              22  Seminar
   4  Feb 27,  1  Seminar
   5  Mar  6,  8  Holiday
   6  Mar 13, 15  Seminar
   7  Mar 20, 22  Seminar
   8  Mar 27, 29  Seminar
   9  Apr  3,  5  Seminar
  10  Apr 10,     Seminar
              12  Workshop
  11  Apr 17, 19  Workshop
  12  Apr 24, 26  Workshop
  13  May  1,  3  Workshop
  14  May  8, 10  Workshop
  15  May 15, 17  Workshop

Work Load

Seminar: Each student will present a journal paper or a small set of conference papers, preferably on the same topic. Students are free to select the paper(s) they would like to present, as long as they fall within the scope of the course. Students will choose their paper(s) and give a copy to the instructor by February 20. A link to an on-line version of the paper is preferred. If a paper is not available on-line, the presenter is responsible for providing a copy to the rest of the class. Presenters will be graded on several criteria, e.g., organization, clarity, examples, slides, timing, and responses to questions and comments. The complete schedule of the presentations will be announced on February 21.

In addition to giving his/her own presentation, each student is expected to read the papers to be presented by others before each presentation. After each presentation, we will have a discussion session. Each student will be graded on his/her participation in the seminar.

Students are referred to the paper "How to Present a Paper in Theoretical Computer Science: A Speaker's Guide for Students" for guidance on giving a successful presentation.

Workshop: We will organize a workshop during the remaining portion of the semester. Each student will conduct an experiment testing new ideas, preferably in the area of his/her presentation topic. The student will then prepare a short paper (about 8-10 pages) reporting the experiment(s), along with an interpretation of the results and pointers for further research. The paper should be at least of the quality of a national symposium paper. Three copies of each paper will be submitted to the instructor by April 5. The papers will be distributed for review on April 6. Each paper will be reviewed by two randomly selected classmates. Each reviewer will write his/her comments and suggestions on the paper and return it to the instructor by April 10. The papers with peer reviews will then be returned to the authors on April 12. The instructor's comments will be returned by May 17. The authors will revise their papers (if necessary) in light of the reviews and submit the final copies to the instructor by May 24.

Students are referred to the paper "How to give a good research talk" for guidance on presenting their own work successfully.

Presentations during the seminars and the workshop will be made using the datashow equipment connected to a PC that will be made available in the classroom. Students are therefore asked to prepare their presentations using the PowerPoint® program.

Grading Policy:

  Seminar:  50% (Presentation: 40% + Participation: 10%)
  Workshop: 50% (Paper: 40% + Review: 10%)