SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

PINTO, David et al. Evaluating n-gram Models for a Bilingual Word Sense Disambiguation Task. Comp. y Sist. [online]. 2011, vol.15, n.2, pp.209-220. ISSN 2007-9737.

Word Sense Disambiguation (WSD) is the task of selecting the correct sense of an ambiguous word in a given context. The task is already difficult in a single language, and its bilingual version is considerably more complex: it is not enough to find a valid translation, because the translation must also reflect the contextual senses of the original sentence (in the source language) in order to yield the correct sense (in the target language) of the source word. In this paper we present a probabilistic model for bilingual WSD based on n-grams (2-grams, 3-grams, 5-grams and k-grams, for a sentence S of length k). The aim is to analyze how the system behaves under different representations of a sentence containing an ambiguous word. We use a Naïve Bayes classifier to determine the probability of the target sense (in the target language) given a sentence that contains an ambiguous word (in the source language). For this purpose, we use a bilingual statistical dictionary computed with Giza++ from the EUROPARL parallel corpus. On average, the representation model based on 5-grams with mutual information performed best.

Keywords: Bilingual word sense disambiguation; machine translation; parallel corpus; Naïve Bayes classifier.
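
The idea in the abstract can be illustrated with a minimal sketch: take an n-gram window around the ambiguous source word, then rank candidate translations with a Naïve Bayes style score built from a bilingual probability table. This is not the authors' exact formulation, which the abstract does not give. The function names, the smoothing floor, and the toy probabilities are all assumptions made for illustration; in the paper the table is estimated with Giza++ from EUROPARL, and the mutual information weighting is not modeled here.

```python
import math

def context_window(tokens, target_index, n):
    """Return an n-token window around the ambiguous word.

    n=None stands for the k-gram case: the whole sentence of length k.
    """
    if n is None or n >= len(tokens):
        return list(tokens)
    half = n // 2
    # Clamp the window so it stays inside the sentence.
    start = max(0, min(target_index - half, len(tokens) - n))
    return tokens[start:start + n]

def rank_translations(tokens, target_index, n, candidates, trans_prob, floor=1e-6):
    """Rank candidate translations by sum of log P(w | t) over the window.

    trans_prob maps (source_word, target_word) to a probability
    (hypothetical layout). Unseen pairs get a small floor value,
    an assumed smoothing choice. A uniform prior over candidates is assumed.
    """
    window = context_window(tokens, target_index, n)
    scores = {}
    for t in candidates:
        scores[t] = sum(math.log(trans_prob.get((w, t), floor)) for w in window)
    return sorted(scores, key=scores.get, reverse=True)

# Toy data, invented for illustration only.
tokens = ["the", "bank", "of", "the", "river"]
candidates = ["banco", "orilla"]
trans_prob = {
    ("bank", "banco"): 0.5,
    ("bank", "orilla"): 0.3,
    ("river", "orilla"): 0.4,
    ("river", "banco"): 0.01,
}

best_3gram = rank_translations(tokens, 1, 3, candidates, trans_prob)[0]
best_kgram = rank_translations(tokens, 1, None, candidates, trans_prob)[0]
```

With these toy numbers, the 3-gram window around "bank" does not reach "river" and the ranking favors "banco", while the full-sentence k-gram includes "river" and favors "orilla". This is the kind of sensitivity to sentence representation that the paper sets out to evaluate.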


All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.