SciELO - Scientific Electronic Library Online

 

Journal of applied research and technology

On-line version ISSN 2448-6736; Print version ISSN 1665-6423

Abstract

HERNANDEZ-MENA, Carlos Daniel; MEZA-RUIZ, Ivan V. and HERRERA-CAMACHO, José Abel. Automatic speech recognizers for Mexican Spanish and its open resources. J. appl. res. technol [online]. 2017, vol.15, n.3, pp.259-270. ISSN 2448-6736. https://doi.org/10.1016/j.jart.2017.02.001.

The development of automatic speech recognition systems relies on the availability of distinct language resources such as speech recordings, pronunciation dictionaries, and language models. These resources are scarce for the Mexican Spanish dialect. In this work, we present a revision of the CIEMPIESS corpus, a resource for spontaneous speech recognition in the Mexican Spanish of Central Mexico. It consists of 17 h of segmented and transcribed recordings, a phonetic dictionary composed of 53,169 unique words, and a language model composed of 1,505,491 words extracted from 2,489 university newsletters. We also evaluate the CIEMPIESS corpus using three well-known, state-of-the-art speech recognition engines, obtaining satisfactory results. These resources are open for research and development in the field. Additionally, we present the methodology and the tools used to facilitate the creation of these resources, which can be easily adapted to other variants of Spanish, or even to other languages.

Keywords: Automatic speech recognition; Mexican Spanish; Language resources; Language model; Acoustic model.
