A New Phono-Articulatory Feature Representation for Language Identification in a Discriminative Framework

Núñez Cuadra, Oneisys; Calvo de Lara, José Ramón

Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol. 15 no. 1 Ciudad de México Jul./Sep. 2011

Articles

Nueva representación de características fono–articulatorias para identificación del idioma en un marco discriminativo

Oneisys Núñez Cuadra and José Ramón Calvo de Lara

Centro de Aplicaciones de Tecnologías de Avanzada, Cuba. E–mail: oneysita@yahoo.com, jcalvo@cenatav.co.cu

Article received on March 18, 2011.
Accepted on June 30, 2011.

Abstract

State-of-the-art language identification methods are based on acoustic or phonetic features. Recently, phono–articulatory features have been included as a new speech characteristic that conveys language information. The authors propose a new phono–articulatory representation of speech in a discriminative framework to identify languages. This simple representation gives good results in discriminating between English and Spanish, using a reduced training set of phono–articulatory trigram vectors.

Keywords: Phonetic features, articulatory features, language recognition, support vector machines.
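
The abstract does not say how the trigram vectors are built, so the following is only a minimal sketch of one common way to do it, not the authors' method. It assumes each speech segment has already been labeled with an articulatory class, and it maps the label sequence to relative trigram frequencies over a fixed vocabulary. The class labels (`stop`, `vowel`, `nasal`) and the function names are hypothetical.

```python
from collections import Counter
from itertools import islice


def trigrams(symbols):
    """Return every run of three consecutive symbols as a tuple."""
    return list(zip(symbols, islice(symbols, 1, None), islice(symbols, 2, None)))


def trigram_vector(symbols, vocabulary):
    """Map a symbol sequence to relative trigram frequencies over a fixed vocabulary."""
    counts = Counter(trigrams(symbols))
    total = sum(counts.values())
    if total == 0:
        # Sequences shorter than three symbols contain no trigrams.
        return [0.0] * len(vocabulary)
    return [counts[t] / total for t in vocabulary]


# Hypothetical articulatory class labels, one per speech segment.
utterance = ["stop", "vowel", "nasal", "vowel", "stop", "vowel", "nasal"]

# Assumption: the vocabulary is the sorted set of trigrams seen in the data.
vocabulary = sorted(set(trigrams(utterance)))
vector = trigram_vector(utterance, vocabulary)
```

In a discriminative setup such as the one the keywords suggest, vectors of this kind, one per utterance, would be the inputs to a classifier such as a support vector machine.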

Abstract (translated from Spanish)

State-of-the-art language identification systems are based on acoustic or phonetic features. Recently, phono–articulatory features have been included as a new characterization of speech that carries information about the language. The authors propose a new phono–articulatory representation of speech using a discriminative framework to identify languages. This simple representation gives good results in discriminating between English and Spanish, using a small training set based on phono–articulatory trigram vectors.

Keywords: Phonetic features, articulatory features, language recognition, support vector machines.
