SciELO - Scientific Electronic Library Online

 

Polibits

On-line version ISSN 1870-9044

Abstract

LIM, Lian Tze; RANAIVO-MALANCON, Bali and TANG, Enya Kong. Low Cost Construction of a Multilingual Lexicon from Bilingual Lists. Polibits [online]. 2011, n.43, pp.45-51. ISSN 1870-9044.

Manually constructing multilingual translation lexicons can be very costly, both in terms of time and human effort. Although there have been many efforts at (semi-)automatically merging bilingual machine readable dictionaries to produce a multilingual lexicon, most of these approaches place quite specific requirements on the input bilingual resources. Unfortunately, not all bilingual dictionaries fulfil these criteria, especially in the case of under-resourced language pairs. We describe a low cost method for constructing a multilingual lexicon using only simple lists of bilingual translation mappings. The method is especially suitable for under-resourced language pairs, as such bilingual resources are often freely available and easily obtainable from the Internet, or digitised from simple, conventional paper-based dictionaries. The precision of random samples of the resultant multilingual lexicon is around 0.70-0.82, while coverage for each language, precision and recall can be controlled by varying threshold values. Given the very simple input resources, our results are encouraging, especially in incorporating under-resourced languages into multilingual lexical resources.

Keywords: Lexical resources; multilingual lexicon; under-resourced languages.


 

All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.