Information Retrieval and Extraction
Spring, 2003
Tentative Topic List and Schedule
2/21 |
Course Overview & Introduction |
|
2/28 |
Break |
|
3/7 |
Retrieval Models (I) - Classic Retrieval Models: Boolean, Vector Space and Probabilistic Models |
|
3/14 |
Retrieval Evaluation (I) - Measures
Retrieval Evaluation (II) - Reference Collections
HW#1: Evaluation Measures (Due 3/28)
HW#2: Classic Retrieval Models (Due 4/11) |
|
3/21 |
Retrieval Models (II) - Structural Retrieval Models and Browsing Models
Retrieval Models (III) - Fuzzy Set, Extended Boolean, Generalized Vector Space Models |
|
3/28 |
Query Operations (Query Expansion and Term Re-weighting)
HW#3: Relevance Feedback or Local Analysis (Due 4/25) |
|
4/4 |
Break |
|
4/11 |
Query Operations (Query Expansion and Term Re-weighting) |
|
4/18 |
Retrieval Models (IV) - HMM/N-gram-based, LSI, PLSA
HW#4: HMM/N-gram-based and PLSI Retrieval Models (Due 5/16) |
|
4/25 |
Midterm |
|
5/2 |
Retrieval Models (IV) - HMM/N-gram-based, LSI, PLSA
Query Languages |
|
5/9 |
Text Languages and Text Statistics |
|
5/16 |
Text Preprocessing, Text Compression
Text Clustering Techniques
HW#5: A Web-based IR System (Due 6/20). Features included: overlapping character bigrams as indexing terms, inverted file structure, query expansion, client-server networking architecture |
|
5/23 |
Indexing and Searching (Preliminary Version) |
|
5/30 |
Paper Presentation (I):
黃立德: Boosting for Document Routing, ACM CIKM 2000
鄭德義: Cross-Document Summarization by Concept Classification, SIGIR 2002
江漢昇: Improving realism of topic tracking evaluation, SIGIR 2002
黃士傑: Set-based model: a new approach for information retrieval, SIGIR 2002 |
|
6/6 |
Talk Title: "Technologies behind Internet Search Engine"
Invited Speaker: Ming-Jer Lee, CTO, VisionNEXT Co. |
|
6/13 |
Paper Presentation (II):
郭人瑋: Generic Summarization and Keyphrase Extraction Using Mutual Reinforcement Principle and Sentence Clustering, SIGIR 2002
黃耀民: Expressive Retrieval from XML documents, SIGIR 2001
劉耀才: Document Clustering with Committees, SIGIR 2002 |
|
6/20 |
Text Categorization Techniques |
|
6/27 |
Final Exam |
|
|
Information Extraction Techniques |
|
Textbook:
1. R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Addison Wesley Longman, 1999.
References:
Books:
Papers:
Grading:
1. Final: 20%
2. Presentations: 20%
3. Homework: 20%
4. Project: 25%
5. Attendance/Other: 15%