Information Retrieval and Extraction
Fall 2005
Tuesdays, 2:10-5:00 PM
Instructor: Berlin Chen (陳柏琳, Assistant Professor)
Tentative Topic List and Schedule:
9/13 | Course Overview & Introduction | |
9/20 | Retrieval Models (I) - Classic Retrieval Models (Boolean, Vector Space and Probabilistic Models) | |
9/27 | Retrieval Performance Evaluation (I) - Measures | HW-01: IR Performance Evaluation, due 10/11 |
10/4 | Retrieval Performance Evaluation (II) - Reference Collections | |
10/11 | Retrieval Models (II) - Improved Approaches (Fuzzy Set, Extended Boolean, Generalized Vector Space Models) | |
10/18 | Query Operations (Query Expansion and Term Re-weighting) | HW-02: IR Models and Query Reformulations, due 11/8 |
10/25 | Retrieval Models (III) - Statistical Modeling Approaches (HMM/N-Gram: Language Model Approach) | |
11/1 | Retrieval Models (III) - Statistical Modeling Approaches (TMM: Topical Mixture Model) | |
11/8 | Retrieval Models (III) - Statistical Modeling Approaches (LSA, PLSA) & LSA Toolkit, Relevance Models (Preliminary) | HW-03: LSI Retrieval Model, due 11/29 |
11/15 | Midterm | |
11/22 | Text Clustering | |
11/29 | Retrieval Models (IV) - Structural Retrieval Models and Browsing Models | |
12/6 | Query Languages, Text Statistics | |
12/13 | Text Operations | |
12/20 | Invited Talk: Mr. 陳俊良 (General Manager, 新視科技), Information Retrieval & Digital Archive Management | |
12/27 | Paper Survey (I): 陳鴻彬: Simplified Similarity Scoring Using Term Ranks (SIGIR 2005); 許庭瑋: When Will Information Retrieval Be “Good Enough”? (SIGIR 2005); 李家豪: Dependence Language Model for Information Retrieval (SIGIR 2004) | |
1/3 | Paper Survey (II): 朱芳輝: Gravitation-Based Model for Information Retrieval (SIGIR 2005); 白聖秋: Indexing and Ranking in Geo-IR Systems; 張日青: The Maximum Entropy Method for Analyzing Retrieval Measures; 徐志文: Exploiting the Hierarchical Structure for Link Analysis (SIGIR 2005); 游斯涵: Multi-Label Informed Latent Semantic Indexing (SIGIR 2005); 林士翔: Relevance Information: A Loss of Entropy but a Gain for IDF? (SIGIR 2005) | Starting from 1:00 PM |
1/10 | Indexing and Searching | |
1/17 | Final | |
Chinese Spoken Document Recognition, Organization and Retrieval |
Textbooks:
1. | R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, Addison Wesley Longman, 1999. | |
2. | W.B. Croft and J. Lafferty (eds.), Language Modeling for Information Retrieval, Kluwer International Series on Information Retrieval, Volume 13, Kluwer Academic Publishers, 2002. | |
References:
Books:
Papers:
1. | D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, 3:993-1022, January 2003. | |
2. | V. Lavrenko and W.B. Croft, "Relevance-Based Language Models," ACM SIGIR 2001. | |
3. | C. H. Papadimitriou, P. Raghavan, H. Tamaki, and S. Vempala, "Latent semantic indexing: A probabilistic analysis." Analyzes an information retrieval technique related to principal component analysis. | |
4. | X. Liu and W.B. Croft, "Statistical Language Modeling for Information Retrieval," Annual Review of Information Science and Technology, vol. 39, 2005. | |
5. | L. Huang, "A Survey on Web Information Retrieval Technologies," 2000. | |