SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

On-line version ISSN 2007-9737
Print version ISSN 1405-5546

Abstract

NUNSANGA, Morrel V. L.; PAKRAY, Partha; LALLAWMSANGA, C. and SINGH, L. Lolit Kumar. Part-of-Speech Tagging for Mizo Language Using Conditional Random Field. Comp. y Sist. [online]. 2021, vol.25, n.4, pp.803-812. Epub Feb 28, 2022. ISSN 2007-9737. https://doi.org/10.13053/cys-25-4-4044.

Part-of-speech (POS) tagging assigns a class or tag to each token in a sentence. The tag assigned to a word is usually its part of speech, though it may be any other class of interest. Several Natural Language Processing (NLP) applications require POS tagging as a prerequisite. This study presents the development of a part-of-speech tagger for the under-resourced Mizo language using a stochastic model known as the Conditional Random Field (CRF). The CRF is a discriminative probabilistic classifier that considers both the context of a given word and the tag transition probabilities in the training dataset. A corpus of approximately 30,000 words was collected and manually annotated with the proposed tagset for system evaluation. Evaluated on training and test sets of various sizes, the tagger achieved 89.46% accuracy, 89.3% F1-score, 89.42% precision, and 89.48% recall.
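
To illustrate what "the context of a given word" can mean for a CRF tagger, the following minimal Python sketch builds one feature dictionary per token from the word itself and its immediate neighbours. The feature names, the three-character suffix, and the English placeholder sentence are assumptions made for illustration only; the abstract does not state which features the authors used. Tag transition weights are not computed here, since a CRF learns them from the annotated tag sequences during training.

```python
def word_features(tokens, i):
    """Build a feature dict for the token at position i.

    The feature set is hypothetical; it is not taken from the paper.
    """
    word = tokens[i]
    features = {
        "word.lower": word.lower(),      # the word form itself
        "word.istitle": word.istitle(),  # capitalisation cue
        "word.isdigit": word.isdigit(),  # numeric token cue
        "suffix3": word[-3:],            # last three characters
    }
    # Left context: previous word, or a beginning-of-sentence marker.
    if i > 0:
        features["prev.lower"] = tokens[i - 1].lower()
    else:
        features["BOS"] = True
    # Right context: next word, or an end-of-sentence marker.
    if i < len(tokens) - 1:
        features["next.lower"] = tokens[i + 1].lower()
    else:
        features["EOS"] = True
    return features


def sentence_features(tokens):
    """Return one feature dict per token in the sentence."""
    return [word_features(tokens, i) for i in range(len(tokens))]


# Placeholder sentence (not Mizo data from the study).
example = sentence_features(["The", "cat", "sleeps"])
```

Each sentence becomes a list of such dictionaries, paired with its list of gold tags; this is the general input shape that CRF toolkits for sequence labelling typically expect.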

Keywords: Mizo POS tagging; conditional random field; Mizo part of speech tagger; computational linguistics.
