Computación y Sistemas

Online version ISSN 2007-9737; print version ISSN 1405-5546

Comp. y Sist. vol. 18 no. 3, Mexico City, Jul./Sep. 2014

https://doi.org/10.13053/CyS-18-3-2041 

Regular articles

 

Semantic Hyper-graph Based Representation of Nouns in the Kazakh Language

 

Banu Yergesh, Assel Mukanova, Altynbek Sharipbay, Gulmira Bekmanova, and Bibigul Razakhova

 

L.N. Gumilyov Eurasian National University, Astana, Kazakhstan. b.yergesh@gmail.com, asel_ms@bk.ru, sharalt@mail.ru, gulmira-r@yandex.ru, utalina@mail.ru.

 

Article received on 07/07/2014.
Accepted 10/08/2014.

 

Abstract

We explain how semantic hyper-graphs are used to describe ontological models of the morphological rules of agglutinative languages, with the Kazakh language as a case study. The vertices of these graphs represent morphological features, and the edges represent relationships between these features. Such modeling allows a nearly one-to-one translation of the morphology of the language into an object-oriented data model. In addition, such a model makes it easy to generate new word forms. The constructed model and the dictionary generated with it are freely available for research purposes.

Keywords: Ontology, morphology, semantic hyper-graph, word form generation.
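
The idea summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' model: the `HyperGraph` class, the feature names, the `AFFIX` table, and the `generate` function are all hypothetical, chosen only to show how vertices (morphological features) grouped by hyperedges (relationships between features) map onto an object-oriented structure and how word forms might be enumerated from it. The affix table is a toy that covers a single back-vowel stem and deliberately ignores vowel harmony and consonant assimilation.

```python
from dataclasses import dataclass, field
from itertools import product


@dataclass
class HyperGraph:
    """Vertices are feature labels; a hyperedge links any number of vertices."""
    vertices: set = field(default_factory=set)
    edges: dict = field(default_factory=dict)  # edge name -> frozenset of vertices

    def add_edge(self, name, members):
        members = frozenset(members)
        self.vertices |= members
        self.edges[name] = members

    def edges_containing(self, vertex):
        """Names of all hyperedges that include the given vertex, sorted."""
        return sorted(n for n, m in self.edges.items() if vertex in m)


# Hypothetical feature inventory (an assumption, not taken from the paper).
g = HyperGraph()
g.add_edge("number", ["singular", "plural"])
g.add_edge("case", ["nominative", "dative"])
g.add_edge("noun_form", ["noun", "singular", "plural", "nominative", "dative"])

# Toy affix table, valid only for the example stem used below.
AFFIX = {"singular": "", "plural": "лар", "nominative": "", "dative": "ға"}


def generate(stem, graph):
    """Enumerate one word form per (number, case) combination in the graph."""
    forms = {}
    numbers = sorted(graph.edges["number"])
    cases = sorted(graph.edges["case"])
    for number, case in product(numbers, cases):
        forms[(number, case)] = stem + AFFIX[number] + AFFIX[case]
    return forms


if __name__ == "__main__":
    for features, form in generate("бала", g).items():
        print(features, form)
```

A hyperedge, unlike an ordinary graph edge, can connect more than two vertices at once, which is why a whole category such as "case" fits into a single edge here. A realistic generator would select among affix allomorphs depending on the stem, which this sketch does not attempt.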

 


 


Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.