Speech Signal Processing


Fall 2003
Tuesdays, 9:10 AM ~ 12:00 noon

Instructor: Berlin Chen

Topic List and Schedule:

Date	Topic	Homework / Project
9/9	Course Overview & Introduction	
9/16	Spoken Language Structure	Homework-1: Plot a spectrogram of a speech utterance in which you pronounce your own name, and observe the formants and the harmonics of the fundamental frequency (Due: 9/30)
9/23	Hidden Markov Models (I)	Homework-2: Solve Problems 1* and 2** for HMMs (Due: 10/15). *Problem 1 should be solved with the Forward Algorithm and with the Backward Algorithm, respectively. **Problem 2 should be solved with the Viterbi Algorithm in both the forward and backward directions.
9/30	Hidden Markov Models (II)	Homework-3: Solve Problem 3 for HMMs (Baum-Welch Training) (Due: 10/28)
10/7	Hidden Markov Models (III): Expectation Maximization (EM) Algorithm, Review of Estimation Theory	
10/14	Review of Digital Signal Processing	
10/21	Review of Digital Signal Processing; Speech Signal Representations	Project-1: Small-Vocabulary, Isolated Word Recognition (Due: 11/10)
10/28	Midterm	
11/4	Speech Signal Representations; Linear Prediction Coding of Speech Signals	Project-2: Linear Prediction Coding (Due: 11/28)
11/11	Linear Prediction Coding of Speech Signals; Language Modeling (I)	
11/18	Language Modeling (I); Acoustic Modeling (I)	
11/25	Acoustic Modeling (II): Cambridge Hidden Markov Model Toolkit (HTK)	Homework-4: Exercises on the HTK Toolkit (Due: 12/2)
12/2	Acoustic Modeling (I): Triphone Modeling, CART, etc.; Search Algorithms	Homework-5: Derive the equations for the likelihood gains used in data splitting, on pp. 179-180 of the textbook (Due: 12/9)
12/9	Invited Speaker: Roger Kuo (郭人瑋); Acoustic Modeling (III): Adaptation Techniques for Acoustic Models	
12/16	Invited Speaker: Louis Tasi (蔡文鴻); Language Modeling (II): SRI Language Modeling Libraries and Tools; Language Modeling (III): Adaptation Techniques for Language Models	
12/23	Search Algorithms; Large Vocabulary Continuous Speech Recognition (LVCSR)	
12/30	Robustness Techniques for Feature Extraction	
1/6	Final Exam	
	Discriminant Feature Extraction and Dimension Reduction; Spoken Dialogue Techniques	

Textbook:
     1. X. Huang, A. Acero, H. Hon, “Spoken Language Processing,” Prentice Hall, 2001 (local distributor: 全華)

References:

Books:
     1. T. F. Quatieri, “Discrete-Time Speech Signal Processing: Principles and Practice,” Prentice Hall, 2002
     2. J. R. Deller, J. H. L. Hansen, J. G. Proakis, “Discrete-Time Processing of Speech Signals,” IEEE Press, 2000
     3. F. Jelinek, “Statistical Methods for Speech Recognition,” The MIT Press, 1999
     4. S. Young et al., “The HTK Book,” Version 3.2, 2002. http://htk.eng.cam.ac.uk
     5. L. Rabiner, B. H. Juang, “Fundamentals of Speech Recognition,” Prentice Hall, 1993

Papers:
     1. L. Rabiner, “A Tutorial on Hidden Markov Models and Selected Applications in Speech Recognition,” Proceedings of the IEEE, vol. 77, no. 2, February 1989
     2. A. Dempster, N. Laird, and D. Rubin, “Maximum Likelihood from Incomplete Data via the EM Algorithm,” J. Royal Stat. Soc., Series B, vol. 39, pp. 1-38, 1977
     3. J. A. Bilmes, “A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models,” U.C. Berkeley TR-97-021
     4. J. W. Picone, “Signal Modeling Techniques in Speech Recognition,” Proceedings of the IEEE, September 1993, pp. 1215-1247
     5. R. Rosenfeld, “Two Decades of Statistical Language Modeling: Where Do We Go from Here?” Proceedings of the IEEE, August 2000
     6. H. Ney, “Progress in Dynamic Programming Search for LVCSR,” Proceedings of the IEEE, vol. 88, no. 8, August 2000
     7. H. Hermansky, “Should Recognizers Have Ears?” Speech Communication, vol. 25, no. 1-3, 1998