Speech Recognition

Summer 2004 (at TSFST, Hsinchu)
Tuesdays, 18:30-21:30

Instructor: Berlin Chen


Topic List and Schedule:

Date   Topic
7/6    Course Overview & Introduction, Spoken Language Structure
7/13   Hidden Markov Models                                                            HW-1
7/20   Break
7/27   Hidden Markov Models (cont.)                                                    HW-2
8/3    Statistical Language Modeling
8/10   Statistical Language Modeling (cont.) & Isolated Word Recognition: An Example   HW-3:IsolatedWordRecognition
8/17   Acoustic Modeling                                                               HW-4:HTK_Tutorial_Materials
8/24   Break
8/31   Hidden Markov Toolkit                                                           Digital Signal Processing
       Explanation of the Isolated Word Recognition Program
9/7    Speech Signal Representations
       Principal Component Analysis (PCA) & Linear Discriminant Analysis (LDA)
9/14   Search Algorithms for Word Recognition, Keyword Spotting
       Large Vocabulary Continuous Speech Recognition
9/21   Speech Enhancement & Robustness

Textbooks:
     1.  X. Huang, A. Acero, H. Hon, “Spoken Language Processing,” Prentice Hall, 2001 (distributed by Chuan Hwa Book Co.)
     2.  C. Manning and H. Schütze, “Foundations of Statistical Natural Language Processing,” MIT Press, 1999

References:

 
Books:
     1.  T. F. Quatieri, “Discrete-Time Speech Signal Processing: Principles and Practice,” Prentice Hall, 2002
     2.  J. R. Deller, J. H. L. Hansen, J. G. Proakis, “Discrete-Time Processing of Speech Signals,” IEEE Press, 2000
     3.  F. Jelinek, “Statistical Methods for Speech Recognition,” The MIT Press, 1999
     4.  S. Young et al., “The HTK Book,” Version 3.2, 2002. http://htk.eng.cam.ac.uk
     5.  L. Rabiner, B. H. Juang, “Fundamentals of Speech Recognition,” Prentice Hall, 1993

Papers:
     1.  L. R. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech
         Recognition,” Proceedings of the IEEE, vol. 77, no. 2, February 1989
     2.  A. Dempster, N. Laird, and D. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,”
         J. Royal Stat. Soc., Series B, vol. 39, pp. 1-38, 1977
     3.  J. A. Bilmes, “A Gentle Tutorial of the EM Algorithm and its Application to Parameter
         Estimation for Gaussian Mixture and Hidden Markov Models,” U.C. Berkeley TR-97-021
     4.  J. W. Picone, “Signal Modeling Techniques in Speech Recognition,” Proceedings of the
         IEEE, September 1993, pp. 1215-1247
     5.  R. Rosenfeld, “Two Decades of Statistical Language Modeling: Where Do We Go from
         Here?,” Proceedings of the IEEE, August 2000
     6.  H. Ney, “Progress in Dynamic Programming Search for LVCSR,” Proceedings of the IEEE, vol. 88, no. 8, August 2000
     7.  H. Hermansky, “Should Recognizers Have Ears?,” Speech Communication, vol. 25, no. 1-3, 1998