Computación y Sistemas
On-line version ISSN 2007-9737
Print version ISSN 1405-5546
Abstract
HUETLE FIGUEROA, Juan; PEREZ TELLEZ, Fernando and PINTO, David. On Detecting Keywords for Concept Mapping in Plain Text. Comp. y Sist. [online]. 2020, vol.24, n.2, pp.651-668. Epub Oct 04, 2021. ISSN 2007-9737. https://doi.org/10.13053/cys-24-2-3400.
Key terminology is very important in scientific work, especially in the field of Natural Language Processing. However, there is no optimal way to extract all key terminology reliably, so it is important to develop automatic methods for extracting key terms. This paper presents a way to obtain key terminology based on labels assigned manually by an expert in the area. We then obtained POS (part-of-speech) tags for each label and derived from them patterns of key terminology, which were later used as filters. Experiment 1 compared the manually obtained labels with the labels obtained by the proposed approach, using 60% of the corpus for training and 40% for testing. The patterns were evaluated with three measures: precision, recall, and F-measure. Experiment 2 used three measures for ranking N-grams (sequences of terms): pointwise mutual information, likelihood ratio, and chi-square. To obtain the best N-grams, experiment 3 took intersections between the rankings produced by these measures and filtered the N-grams by POS patterns. The results were also compared with the manually labeled set; the evaluation measures showed good recall together with acceptable precision and F-measure. In experiment 4, the POS patterns were tested on a much larger corpus from a different domain, obtaining slightly higher results.
Keywords: Collocations; n-grams; POS; keyword extraction.
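The pipeline summarized in the abstract (rank N-grams with an association measure, filter them by POS patterns, then score the result against a manually labeled set) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, its hand-assigned tags, the two POS patterns, and the gold set are all hypothetical, only bigrams are considered, and pointwise mutual information stands in for the three ranking measures.

```python
import math
from collections import Counter

# Hypothetical toy corpus of (token, POS tag) pairs, tagged by hand for illustration.
TAGGED = [
    ("natural", "ADJ"), ("language", "NOUN"), ("processing", "NOUN"),
    ("uses", "VERB"), ("keyword", "NOUN"), ("extraction", "NOUN"),
    ("and", "CONJ"), ("natural", "ADJ"), ("language", "NOUN"),
    ("models", "NOUN"), ("for", "ADP"), ("keyword", "NOUN"),
    ("extraction", "NOUN"),
]

# Assumed POS patterns; the paper derives its patterns from expert labels.
PATTERNS = {("ADJ", "NOUN"), ("NOUN", "NOUN")}

# Hypothetical manually labeled key terms.
GOLD = {("natural", "language"), ("keyword", "extraction"), ("language", "models")}


def pmi_bigrams(tagged):
    """Score every bigram with pointwise mutual information (log base 2)."""
    tokens = [word for word, _ in tagged]
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni, n_bi = len(tokens), len(tokens) - 1
    scores = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        scores[(a, b)] = math.log2(p_ab / (p_a * p_b))
    return scores


def pos_filtered(tagged, patterns):
    """Keep bigrams whose tag sequence matches one of the POS patterns."""
    return {
        (w1, w2)
        for (w1, t1), (w2, t2) in zip(tagged, tagged[1:])
        if (t1, t2) in patterns
    }


def precision_recall_f1(predicted, gold):
    """Set-based precision, recall, and F-measure."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


scores = pmi_bigrams(TAGGED)
candidates = pos_filtered(TAGGED, PATTERNS)
# Rank only the POS-filtered candidates by their PMI score.
ranked = sorted(candidates, key=lambda bg: scores[bg], reverse=True)
precision, recall, f1 = precision_recall_f1(candidates, GOLD)

for bigram in ranked:
    print(" ".join(bigram), round(scores[bigram], 3))
print(precision, recall, round(f1, 3))
```

On this toy data the non-terminological pair "processing uses" receives a higher PMI than "natural language", because PMI rewards rare co-occurrences. That is one reason a POS filter on top of the ranking can help.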