SciELO - Scientific Electronic Library Online

 
vol.17 número2Inferencia y reconciliación en la red léxica y semántica basada en la técnica crowdsourcedOptimización de selección de soluciones de evaluación para completar los resultados de recuperación de información índice de autoresíndice de materiabúsqueda de artículos
Home Pagelista alfabética de revistas  

Servicios Personalizados

Revista

Articulo

Indicadores

Links relacionados

  • No hay artículos similaresSimilares en SciELO

Compartir


Computación y Sistemas

versión On-line ISSN 2007-9737versión impresa ISSN 1405-5546

Resumen

DUBEY, Ajay  y  VARMA, Vasudeva. Generation of Bilingual Dictionaries using Structural Properties. Comp. y Sist. [online]. 2013, vol.17, n.2, pp.161-168. ISSN 2007-9737.

Building bilingual dictionaries from Wikipedia has been extensively studied in the area of computation linguistics. These dictionaries play a crucial role in Natural Language Processing(NLP) applications like Cross-Lingual Information Retrieval, Machine Translation and Named Entity Recognition. To build these dictionaries, most of the existing approaches use information present in Wikipedia titles, info-boxes and categories. Interestingly, not many use the structural properties of a document like sections, subsections, etc. In this work we exploit the structural properties of documents to build a bilingual English-Hindi dictionary. The main intuition behind this approach is that documents in different languages discussing the same topic are likely to have similar structural elements. Though we present our experiments only for Hindi, our approach is language independent and can be easily extended to other languages. The major contribution of our work is that the dictionary contains translation and transliteration of words which include Named Entities to a large extent. We evaluate our dictionary using manually computed precision. We generated a massive list of 72k tokens using our approach with 0.75 precision.

Palabras llave : Bilingual dictionary; comparable corpora; structural elements.

        · resumen en Español     · texto en Inglés

 

Creative Commons License Todo el contenido de esta revista, excepto dónde está identificado, está bajo una Licencia Creative Commons