2012

Japanese and Korean Voice Search

M. Schuster, Kaisuke Nakajima

AI summary

This paper introduces a voice search system for Japanese and Korean. It addresses the challenges of an infinite vocabulary and multiple-script languages by modeling the language model and dictionary entirely in the written domain, which simplifies and speeds up the process of supporting new languages in general.

Main Contributions

  • Developed a voice search system for Japanese and Korean at Google.
  • Addressed challenges of infinite vocabulary and multiple script languages.
  • Modeled completely in the written domain for language model and dictionary to avoid system complexity.
  • Introduced a purely data-driven segmenter applicable to any language without modification.
  • Simplified the process of building voice search systems for new languages.

Abstract

This paper describes challenges and solutions for building a successful voice search system as applied to Japanese and Korean at Google. We describe the techniques used to deal with an infinite vocabulary, how modeling completely in the written domain for language model and dictionary can avoid some system complexity, and how we built dictionaries, language and acoustic models in this framework. We show how to deal with the difficulty of scoring results for multiple-script languages because of ambiguities. The development of voice search for these languages led to a significant simplification of the original process to build a system for any new language, which in parts became our default process for internationalization of voice search.
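
The data-driven segmenter listed among the contributions can be illustrated with a small sketch. This is not the paper's algorithm: the page gives no details, so the code below assumes a simple frequency-based criterion (merge the most frequent adjacent pair of units) for learning the inventory, followed by greedy longest-match segmentation. The function names `learn_units`, `merge_pair`, and `segment`, and the toy corpus in the usage example, are hypothetical.

```python
from collections import Counter


def merge_pair(seq, left, right):
    """Replace every adjacent (left, right) pair in seq with the joined unit."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == left and seq[i + 1] == right:
            out.append(left + right)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_units(corpus, num_merges):
    """Learn a subword inventory from raw text, with no language-specific rules.

    Assumption: raw pair frequency is the merge criterion. This is a
    simplification chosen for illustration, not a criterion stated in the source.
    """
    # Start from single characters; spaces become "_" so text can be restored.
    sequences = [list(line.replace(" ", "_")) for line in corpus]
    units = {ch for seq in sequences for ch in seq}
    for _ in range(num_merges):
        pairs = Counter()
        for seq in sequences:
            pairs.update(zip(seq, seq[1:]))
        if not pairs:
            break
        # Ties are broken deterministically by comparing the pair itself.
        (left, right), count = max(pairs.items(), key=lambda kv: (kv[1], kv[0]))
        if count < 2:
            break  # nothing left that occurs more than once
        units.add(left + right)
        sequences = [merge_pair(seq, left, right) for seq in sequences]
    return units


def segment(text, units):
    """Split text into known units using greedy longest-match."""
    text = text.replace(" ", "_")
    max_len = max((len(u) for u in units), default=1)
    out, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # Single characters always pass through, so unseen text never fails.
            if size == 1 or piece in units:
                out.append(piece)
                i += size
                break
    return out


if __name__ == "__main__":
    toy_corpus = ["東京 天気", "東京 駅", "東京 天気 予報"]  # hypothetical queries
    inventory = learn_units(toy_corpus, num_merges=5)
    print(segment("東京 天気", inventory))
```

Because unknown characters fall back to single-character units, any input can be segmented with a finite inventory, which is one way to read the "infinite vocabulary" point in the abstract. Joining the output and replacing `_` with a space restores the original text.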
