2012

Application of Pretrained Deep Neural Networks to Large Vocabulary Speech Recognition

Navdeep Jaitly, P. Nguyen, A. Senior, Vincent Vanhoucke

AI summary

This paper reports results for a DBN-pretrained, context-dependent ANN/HMM speech recognition system trained on 5870 hours of Voice Search data and 1400 hours of YouTube data. The system outperforms the GMM/HMM baselines by 3.7% and 4.7% absolute WER, respectively, with further gains from MMI fine-tuning and SCARF model combination.

Main Contributions

  • Demonstrates that ANN/HMM hybrids pretrained with DBNs can outperform GMM/HMM systems in ASR.
  • Uses two large datasets (5870 hours of Voice Search and 1400 hours of YouTube data) to train and evaluate the models.
  • Achieves a 3.7% absolute WER improvement over the GMM/HMM baseline on the Voice Search dataset.
  • Achieves a 4.7% absolute WER improvement over the GMM/HMM baseline on the YouTube dataset.
  • Shows additional gains from MMI fine-tuning and model combination using SCARF.

Abstract

The use of Deep Belief Networks (DBN) to pretrain Neural Networks has recently led to a resurgence in the use of Artificial Neural Network Hidden Markov Model (ANN/HMM) hybrid systems for Automatic Speech Recognition (ASR). In this paper we report results of a DBN-pretrained context-dependent ANN/HMM system trained on two datasets that are much larger than any reported previously with DBN-pretrained ANN/HMM systems: 5870 hours of Voice Search and 1400 hours of YouTube data. On the first dataset, the pretrained ANN/HMM system outperforms the best Gaussian Mixture Model Hidden Markov Model (GMM/HMM) baseline, which was built with a much larger dataset, by 3.7% absolute Word Error Rate (WER), while on the second dataset it outperforms the GMM/HMM baseline by 4.7% absolute. Maximum Mutual Information (MMI) fine-tuning and model combination using Segmental Conditional Random Fields (SCARF) give additional gains of 0.1% and 0.4% on the first dataset and 0.5% and 0.9% absolute on the second dataset.
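The abstract reports its gains as absolute WER differences, which are easy to confuse with relative reductions. The sketch below is not taken from the paper: it computes WER with a standard word-level edit distance and contrasts the two ways of stating a gain. The function names and the baseline figure of 20.0% are hypothetical, chosen only to illustrate the arithmetic.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def absolute_gain(baseline_wer: float, new_wer: float) -> float:
    """Difference in WER, in percentage points when inputs are percentages."""
    return baseline_wer - new_wer


def relative_gain(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline error that was removed."""
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical numbers, not results from the paper: a 20.0% baseline
# improved to 16.3% is a 3.7-point absolute gain but an 18.5% relative gain.
baseline, improved = 20.0, 16.3
print(round(absolute_gain(baseline, improved), 1))
print(round(relative_gain(baseline, improved) * 100, 1))
```

Because an absolute gain is independent of the baseline's starting point, the same 3.7-point figure represents a larger relative reduction on a low-error task than on a high-error one.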
