SciELO - Scientific Electronic Library Online

 



Computación y Sistemas

On-line version ISSN 2007-9737; Print version ISSN 1405-5546

Abstract

PATHAK, Amarnath; DAS, Ranjita; PAKRAY, Partha and GELBUKH, Alexander. Extracting Context of Math Formulae Contained inside Scientific Documents. Comp. y Sist. [online]. 2019, vol.23, n.3, pp.803-818. Epub 09-Aug-2021. ISSN 2007-9737. https://doi.org/10.13053/cys-23-3-3246.

A math formula in a scientific document is often preceded by a textual description, commonly referred to as the context of the formula. Annotating a formula with its context enriches its semantics and consequently affects the retrieval of mathematical content from scientific documents. With considerable confidence, the context can be assumed to be one of the noun phrases (NPs) of the sentence in which the formula occurs. However, such a sentence often contains several misleading NPs, so the NP that describes the formula more precisely than the rest must be extracted. Although a fair number of methods have been developed for precise context extraction, it is worthwhile to explore other techniques that could improve on their performance. To this end, this paper discusses the implementation of an automated context extraction system, which follows certain heuristics in assigning weights to the candidate NPs and tunes those weights using a development set of annotated formulae. The implemented system significantly outperforms nearest-noun and sentence-pattern-based methods in terms of F-score.
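The weighting idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the abstract does not list the actual heuristics, so the three features (`distance`, `precedes`, `is_definite`), the grid search, and the toy development set are hypothetical, and exact-match accuracy stands in for the F-score used in the paper.

```python
from itertools import product

# Hypothetical features for a candidate noun phrase (NP). The abstract does
# not specify the paper's heuristics; these are illustrative stand-ins.
def features(cand):
    return (
        1.0 / (1 + cand["distance"]),          # closer to the formula scores higher
        1.0 if cand["precedes"] else 0.0,      # NP occurs before the formula
        1.0 if cand["is_definite"] else 0.0,   # NP starts with "the"
    )

def score(cand, weights):
    # Weighted sum of the heuristic features.
    return sum(w * f for w, f in zip(weights, features(cand)))

def pick_context(cands, weights):
    # Return the text of the highest-scoring candidate NP.
    return max(cands, key=lambda c: score(c, weights))["text"]

def tune(dev_set, grid=(0.0, 0.5, 1.0)):
    # Exhaustive grid search over weight triples, keeping the first triple
    # with the best exact-match accuracy (a simplification of F-score).
    best_w, best_acc = None, -1.0
    for w in product(grid, repeat=3):
        correct = sum(pick_context(cands, w) == gold for cands, gold in dev_set)
        acc = correct / len(dev_set)
        if acc > best_acc:
            best_w, best_acc = w, acc
    return best_w, best_acc

# Toy development set: (candidate NPs, gold context) pairs, invented here.
dev_set = [
    ([{"text": "a particle", "distance": 2, "precedes": True, "is_definite": False},
      {"text": "the kinetic energy", "distance": 6, "precedes": True, "is_definite": True}],
     "the kinetic energy"),
    ([{"text": "the radius", "distance": 1, "precedes": True, "is_definite": True},
      {"text": "a circle", "distance": 3, "precedes": False, "is_definite": False}],
     "the radius"),
]

best_w, best_acc = tune(dev_set)
print(best_w, best_acc)
```

In a real system the candidate NPs would come from a parser and the development set from annotated formulae, as the abstract indicates; the hand-written dictionaries above only make the sketch self-contained.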

Keywords: Context extraction; math information retrieval; NTCIR; parser; noun phrase.
