Computación y Sistemas
On-line version ISSN 2007-9737; print version ISSN 1405-5546
Comp. y Sist. vol. 11 no. 3, Mexico City, Jan./Mar. 2008
Doctoral thesis abstract
Recuperación de Información con Resolución de Ambigüedad de Sentidos de Palabras para el Español
Information Retrieval with Word Sense Disambiguation for Spanish
Graduate: Yoel Ledo Mezquita
Centro de Investigación en Computación (CIC-IPN)
Av. Juan de Dios Bátiz s/n esq. Miguel Othón de Mendizábal
C. P. 07738 México D. F.
email: yledo@yahoo.com
Advisor: Grigori Sidorov
Centro de Investigación en Computación (CIC-IPN)
Av. Juan de Dios Bátiz s/n esq. Miguel Othón de Mendizábal
C. P. 07738 México D. F.
www.cic.ipn.mx/~sidorov
Advisor: Alexander Gelbukh
Centro de Investigación en Computación (CIC-IPN)
Av. Juan de Dios Bátiz s/n esq. Miguel Othón de Mendizábal
C. P. 07738 México D. F.
www.cic.ipn.mx/~sidorov
Graduated on June 23, 2006
Summary
One of the problems with Internet information retrieval portals (the dynamic portals of Altavista, Google, Yahoo, etc.) and with digital libraries (the U.S. Library of Congress, etc.) is that they return many answers of very low relevance. For example, a car mechanic searches for "¿dónde comprar un gato?" ("where to buy a jack?"; in Spanish, "gato" means both "cat" and "car jack") and gets answers about "gatos monteses" (wildcats), "gatos siameses" (Siamese cats), and others. A fruit seller searches for "producción de lima" ("lime production") and gets answers about the "ciudad de Lima" (city of Lima), "jugo de lima" (lime juice), "lima de uñas" (nail file), and others. These inaccuracies stem from the different senses that words have. The task of resolving them is known as Word Sense Disambiguation (WSD): a linguistic mechanism that determines the correct sense of a word, among its possible senses, based on the context in which it is used. The contribution of this article is the development of a new word sense disambiguation method that uses large lexical resources (explanatory dictionaries, synonym dictionaries, WordNet).
Keywords: word senses, context, dictionaries, Lesk algorithm.
Abstract
One of the problems of information retrieval on the Internet and in digital libraries is low precision: a high number of the retrieved documents have low relevance. For example, a person looks for information about jaguars (the animal) and the retrieved documents are about the car model. This problem arises from the ambiguity between the different senses of words. The task of determining the correct interpretation of a word in its context is known as the Word Sense Disambiguation (WSD) task. It employs a linguistic mechanism that selects, among the possible senses of a word, the one most suitable for the context in which the word is used. In this paper, a new method for word sense disambiguation is proposed, based on additional linguistic information about the words in the context, obtained from large lexical resources such as an explanatory dictionary, a synonym dictionary, and WordNet.
Keywords: word senses, context, dictionaries, Lesk algorithm.
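To make the idea of dictionary-based disambiguation concrete, the following is a minimal sketch of the simplified Lesk approach named in the keywords. It is not the method proposed in the thesis. The sense inventory `SENSES`, its glosses, the stopword list, and the function names are all hypothetical and written only for this illustration. The sketch picks the sense whose dictionary gloss shares the most words with the context.

```python
import re

# Hypothetical toy sense inventory; the glosses are invented for this example
# and do not come from any real dictionary or from the thesis.
SENSES = {
    "jaguar": {
        "animal": "large wild cat that lives in the forest and hunts prey",
        "car": "brand of luxury car with a powerful engine",
    }
}

# Small hypothetical stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "in", "and", "that", "with", "is", "to", "for", "on"}


def tokens(text):
    """Return the set of lowercase content words in a text."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(word, context, senses=SENSES):
    """Choose the sense of `word` whose gloss overlaps most with `context`.

    Returns a (sense_label, overlap_count) pair. On a tie, the sense listed
    first in the inventory wins.
    """
    context_words = tokens(context) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in senses[word].items():
        overlap = len(context_words & tokens(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


print(simplified_lesk("jaguar", "The jaguar hunts its prey in the forest at night"))
print(simplified_lesk("jaguar", "He bought a jaguar with a powerful engine"))
```

With this toy inventory, the first query resolves to the animal sense and the second to the car sense. A retrieval system could use such a label to filter or re-rank documents, although the quality of the result depends entirely on how rich the glosses are.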