SciELO - Scientific Electronic Library Online

 

Computación y Sistemas

On-line version ISSN 2007-9737; print version ISSN 1405-5546

Abstract

CASTILLO VELASQUEZ, Francisco Antonio et al. Author Gender Identification for Short Texts. Comp. y Sist. [online]. 2021, vol.25, n.3, pp.659-665. Epub 13-Dec-2021. ISSN 2007-9737. https://doi.org/10.13053/cys-25-3-3999.

Today there are many ways to communicate or express oneself through electronic media: most users of computers and mobile devices use email, social networks, chats, and other tools. This form of communication has also given rise to abuses such as plagiarism, false identities, and intimidating notes. Authorship attribution of texts (AAT) addresses the question of who wrote a text, given some previous examples by that author (the training set). A useful task within AAT is identifying the author's gender or sex (male, female), which has been studied by several authors, but mainly for English. This work proposes a computational model based on lexical features (n-grams) for gender identification in short texts in Spanish. Tests were run on a corpus of text messages from social networks and blogs, with promising results.

Keywords: Gender identification; machine learning; n-grams; classification; authorship.
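
To make the idea of n-gram-based gender identification concrete, the following is a minimal sketch and not the authors' model. The abstract does not state which n-grams or which classifier were used, so character trigrams and a multinomial Naive Bayes classifier with add-one smoothing are assumptions made here for illustration. The function `char_ngrams`, the class `NgramNaiveBayes`, and the four training sentences are hypothetical; the sentences are invented toy data, not drawn from the paper's corpus.

```python
import math
from collections import Counter, defaultdict


def char_ngrams(text, n=3):
    """Return the overlapping character n-grams of a lowercased, space-padded text."""
    padded = f" {text.lower().strip()} "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


class NgramNaiveBayes:
    """Multinomial Naive Bayes over character n-gram counts (illustrative assumption)."""

    def __init__(self, n=3):
        self.n = n
        self.ngram_counts = defaultdict(Counter)  # label -> n-gram -> count
        self.doc_counts = Counter()               # label -> number of training texts
        self.vocab = set()                        # all n-grams seen in training

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            grams = char_ngrams(text, self.n)
            self.ngram_counts[label].update(grams)
            self.doc_counts[label] += 1
            self.vocab.update(grams)
        return self

    def predict(self, text):
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            counts = self.ngram_counts[label]
            total = sum(counts.values())
            # Log prior plus add-one smoothed log likelihood of each n-gram.
            score = math.log(self.doc_counts[label] / total_docs)
            for gram in char_ngrams(text, self.n):
                score += math.log((counts[gram] + 1) / (total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy data: Spanish adjective agreement gives a lexical cue.
train_texts = [
    "estoy muy cansada hoy",
    "me siento contenta con el resultado",
    "estoy muy cansado hoy",
    "me siento contento con el resultado",
]
train_labels = ["female", "female", "male", "male"]

model = NgramNaiveBayes(n=3).fit(train_texts, train_labels)
print(model.predict("qué cansada estoy"))  # female
print(model.predict("qué cansado estoy"))  # male
```

Character n-grams are chosen in this sketch because short texts contain few words, and sub-word fragments such as the endings "-ada" and "-ado" still carry signal. A real evaluation would need a labeled corpus and held-out test data.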
