Computación y Sistemas
On-line version ISSN 2007-9737
Print version ISSN 1405-5546
Comp. y Sist. vol.18 n.3 Ciudad de México Jul./Sep. 2014
https://doi.org/10.13053/CyS-18-3-2034
Regular articles
Entity Extraction in Biochemical Text using Multiobjective Optimization
Utpal Kumar Sikdar, Asif Ekbal, and Sriparna Saha
Department of Computer Science and Engineering, Indian Institute of Technology, Patna, India. utpal.sikdar@iitp.ac.in, asif@iitp.ac.in, sriparna@iitp.ac.in.
Article received on 18/01/2014.
Accepted on 01/02/2014.
Abstract
In this paper we propose a feature selection and classifier ensemble approach for biochemical entity extraction, based on multiobjective modified differential evolution. The algorithm operates in two layers. The first layer determines an appropriate set of features for the task within the framework of a supervised statistical classifier, namely Conditional Random Field (CRF). This produces a set of solutions, a subset of which is used to construct an ensemble in the second layer. The proposed approach is evaluated on entity extraction in chemical texts, which involves identifying IUPAC and IUPAC-like names and classifying them into predefined categories. Experiments carried out on a benchmark dataset yield recall, precision and F-measure values of 86.15%, 91.29% and 88.64%, respectively.
Keywords: Multiobjective modified differential evolution (MODE), feature selection, ensemble learning, conditional random field (CRF), named entity (NE).
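As an illustration of why a multiobjective search returns a set of solutions rather than a single best one, the following minimal Python sketch filters candidate solutions down to the non-dominated (Pareto-optimal) ones. It is not the authors' implementation. The choice of recall and precision as the two objectives, the candidate labels, and all scores are assumptions made up for this example, and the differential evolution search itself is not shown.

```python
from typing import List, Tuple

# Hypothetical representation: (label, recall, precision).
# In the setting of the abstract, each candidate would correspond to a
# CRF classifier trained on one selected feature subset.
Solution = Tuple[str, float, float]


def dominates(a: Solution, b: Solution) -> bool:
    """Return True if a is at least as good as b on both objectives
    and strictly better on at least one (both objectives are maximized)."""
    no_worse = a[1] >= b[1] and a[2] >= b[2]
    strictly_better = a[1] > b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [
        s for s in solutions
        if not any(dominates(other, s) for other in solutions)
    ]


# Made-up scores for illustration only; these are not results from the paper.
candidates: List[Solution] = [
    ("A", 0.80, 0.92),
    ("B", 0.86, 0.90),
    ("C", 0.78, 0.91),  # dominated by A
    ("D", 0.88, 0.85),
    ("E", 0.85, 0.88),  # dominated by B
]

front = pareto_front(candidates)
print([label for label, _, _ in front])  # prints ['A', 'B', 'D']
```

Each surviving candidate represents a different trade-off between the two objectives, which is what makes such a set a natural pool from which to pick ensemble members in a second stage.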