Discriminative Learning for Speech Recognition: Theory and Practice (Synthesis Lectures on Speech and Audio Processing #4) (Paperback)

$60.40
Not currently in store. Available to ship from distributor's warehouse.

Description


In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model (HMM). A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-function form. This common form enables the use of the growth transformation (or extended Baum-Welch) optimization framework in discriminative learning of model parameters. In addition to the necessary background and tutorial material on the subject, we include technical details on the derivation of the parameter optimization formulas for exponential-family distributions, discrete HMMs, and continuous-density HMMs in discriminative learning. Selected experimental results obtained firsthand by the authors are presented to show that discriminative learning can lead to superior speech recognition performance over conventional parameter learning. Details on major algorithmic implementation issues of practical significance are provided to enable practitioners to translate the theory in the earlier part of the book directly into engineering practice.


Product Details
ISBN: 9781598293081
ISBN-10: 1598293087
Publisher: Morgan & Claypool
Publication Date:
Pages: 124
Language: English
Series: Synthesis Lectures on Speech and Audio Processing