Conference article

Garnishing a phonetic dictionary for ASR intake

Iben Nyholm Debess
Grunnurin Føroysk Teldutala, Denmark

Sandra Saxov Lamhauge
Danish Language Council, Denmark

Peter Juel Henrichsen
Danish Language Council, Denmark


Published in: Proceedings of the 22nd Nordic Conference on Computational Linguistics (NoDaLiDa), September 30 - October 2, Turku, Finland

Linköping Electronic Conference Proceedings 167:47, p. 395--399

NEALT Proceedings Series 42:47, p. 395--399


Published: 2019-10-02

ISBN: 978-91-7929-995-8

ISSN: 1650-3686 (print), 1650-3740 (online)

Abstract

We present a new method for preparing a lexical-phonetic database as a resource for acoustic model training. The research is an offshoot of the ongoing Project Ravnur (Speech Recognition for Faroese), but the method is language-independent. At NODALIDA 2019 we demonstrate the method (called SHARP) online, showing how a traditional lexical-phonetic dictionary (with a very rich phone inventory) is transformed into an ASR-friendly database (with a reduced phone inventory, which prevents data sparseness). The mapping procedure is informed by a corpus of speech transcripts. We conclude with a discussion of the benefits of a well-thought-out BLARK design (Basic Language Resource Kit), which makes tools like SHARP possible.
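
The abstract does not spell out the SHARP procedure, so the following is only a minimal sketch of the general idea of a corpus-informed phone reduction, not the authors' method. It assumes a hypothetical rule: a phone that occurs fewer than `min_count` times in the transcript corpus is replaced by a broader fallback phone. The function name `reduce_inventory`, the `fallback` table, the threshold, and the toy phone symbols are all illustrative assumptions.

```python
from collections import Counter


def reduce_inventory(lexicon, corpus_words, fallback, min_count):
    """Return a copy of `lexicon` with rare phones merged into fallbacks.

    lexicon:      dict mapping a word to its list of phone symbols
    corpus_words: iterable of word tokens from speech transcripts
    fallback:     dict mapping a fine-grained phone to a broader one
                  (hypothetical table, supplied by the user)
    min_count:    phones seen fewer times than this are merged
    """
    # Count how often each phone occurs in the transcribed corpus.
    counts = Counter()
    for word in corpus_words:
        counts.update(lexicon.get(word, []))

    def map_phone(phone):
        # Follow the fallback chain while the phone is too rare.
        # `seen` guards against cycles in the fallback table.
        seen = set()
        while counts[phone] < min_count and phone in fallback and phone not in seen:
            seen.add(phone)
            phone = fallback[phone]
        return phone

    return {word: [map_phone(p) for p in phones] for word, phones in lexicon.items()}


# Toy data: the symbols are placeholders, not a real Faroese inventory.
lexicon = {
    "tak": ["t_h", "a", "k"],
    "at": ["a", "t"],
    "kat": ["k_h", "a", "t"],
}
corpus_words = ["at", "at", "at", "tak", "kat", "at"]
fallback = {"t_h": "t", "k_h": "k"}

reduced = reduce_inventory(lexicon, corpus_words, fallback, min_count=2)
```

In this toy run the rare phones `t_h` and `k_h` each occur once in the corpus and are merged into `t` and `k`. A phone with no fallback entry is left unchanged even when it is rare.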

