Language and Dialogue Modeling for Acoustic Speech Understanding

Authors: R. De Mori, P. Boucher, A. Corazza (IRST, Trento, Italy), M. Galler, H. Hermansdottir, C. Pateras, C. Snow, A. Takahashi

Category: perception

Subcategory: language and program understanding

Phoneme modeling and lexical representation have been investigated further along the lines established in 1992. Phoneme models have been trained with Gaussian distributions, taking into account context clusters determined by simulated annealing. Tested with beam search techniques, these models performed well in both decoding time and accuracy.
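
As an illustration of beam search over Gaussian acoustic models, the following is a minimal sketch and not the system described here. It assumes a toy model with one-dimensional Gaussian emissions per state and a fixed log-score beam; the names gaussian_logpdf, beam_viterbi, prune, and all numerical values are hypothetical.

```python
import math


def gaussian_logpdf(x, mean, var):
    """Log density of a one-dimensional Gaussian."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def prune(hyps, beam_width):
    """Keep only hypotheses whose score is within beam_width of the best one."""
    best = max(score for score, _ in hyps.values())
    return {s: h for s, h in hyps.items() if h[0] >= best - beam_width}


def beam_viterbi(frames, states, log_init, log_trans, beam_width):
    """Beam-pruned Viterbi decoding (illustrative only).

    frames:     list of scalar observations
    states:     dict state -> (mean, variance) of its Gaussian emission
    log_init:   dict state -> initial log probability
    log_trans:  dict (src, dst) -> transition log probability
    beam_width: hypotheses scoring more than this below the best are dropped
    Returns (best log score, best state path).
    Assumes at least one hypothesis survives at every frame.
    """
    # Active hypotheses: state -> (score, path ending in that state).
    active = {}
    for s, lp in log_init.items():
        mean, var = states[s]
        active[s] = (lp + gaussian_logpdf(frames[0], mean, var), [s])
    active = prune(active, beam_width)

    for x in frames[1:]:
        nxt = {}
        for (src, dst), lt in log_trans.items():
            if src not in active:  # source state was pruned or unreachable
                continue
            score, path = active[src]
            mean, var = states[dst]
            cand = score + lt + gaussian_logpdf(x, mean, var)
            if dst not in nxt or cand > nxt[dst][0]:
                nxt[dst] = (cand, path + [dst])
        active = prune(nxt, beam_width)

    best_state = max(active, key=lambda s: active[s][0])
    return active[best_state]


# Toy two-state model (all values are assumptions made for this example).
STATES = {"a": (0.0, 1.0), "b": (5.0, 1.0)}
LOG_INIT = {"a": math.log(0.5), "b": math.log(0.5)}
LOG_TRANS = {
    ("a", "a"): math.log(0.9), ("a", "b"): math.log(0.1),
    ("b", "b"): math.log(0.9), ("b", "a"): math.log(0.1),
}
FRAMES = [0.1, -0.2, 4.8, 5.1]

score, path = beam_viterbi(FRAMES, STATES, LOG_INIT, LOG_TRANS, beam_width=5.0)
print(path, score)
```

A narrow beam reduces the number of hypotheses extended at each frame at the risk of pruning the path that would eventually score best; on this toy input the narrow and the effectively unpruned beams agree.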

Effort has been spent on interpretation strategies using stochastic context-free grammars.
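
To make the notion of a stochastic context-free grammar concrete, the sketch below computes the probability that a small grammar in Chomsky normal form assigns to a sentence, using the standard inside (probabilistic CYK) recursion. It is an illustration only: the grammar, the vocabulary, and the function name inside_probability are hypothetical and are not taken from the work described here.

```python
def inside_probability(words, lexical, binary, start="S"):
    """Probability that `start` derives `words` under a CNF stochastic grammar.

    lexical: dict (A, word) -> P(A -> word)
    binary:  dict (A, B, C) -> P(A -> B C)
    """
    n = len(words)
    chart = {}  # (i, j, A) -> probability that A derives words[i:j]

    # Spans of length 1: lexical rules.
    for i, w in enumerate(words):
        for (lhs, word), p in lexical.items():
            if word == w:
                chart[(i, i + 1, lhs)] = chart.get((i, i + 1, lhs), 0.0) + p

    # Longer spans: sum over binary rules and split points.
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, left, right), p in binary.items():
                    pl = chart.get((i, k, left), 0.0)
                    pr = chart.get((k, j, right), 0.0)
                    if pl > 0.0 and pr > 0.0:
                        chart[(i, j, lhs)] = chart.get((i, j, lhs), 0.0) + p * pl * pr

    return chart.get((0, n, start), 0.0)


# Toy grammar (hypothetical rules and probabilities).
LEXICAL = {
    ("NP", "robots"): 0.5,
    ("NP", "commands"): 0.5,
    ("V", "follow"): 1.0,
}
BINARY = {
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 1.0,
}

print(inside_probability(["robots", "follow", "commands"], LEXICAL, BINARY))
```

The chart entry for a sub-span is the kind of quantity a partial parse provides: it scores a fragment of the sentence without committing to an analysis of the rest.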

Optimal upper bounds have been derived for partially parsed sentences, such as the patterns that are used for interpretation and are inferred automatically by classification and regression trees. A parser has been implemented for this purpose.

A multimedia, rule-based dialogue architecture, whose scheme is shown in Figure 17, has been implemented. It will be used for various applications, including human-robot interaction.