SciELO - Scientific Electronic Library Online

 


Polibits

Online version ISSN 1870-9044

Abstract

POULARD, Fabien; HERNANDEZ, Nicolás and DAILLE, Béatrice. Detecting Derivatives using Specific and Invariant Descriptors. Polibits [online]. 2011, n.43, pp.7-13. ISSN 1870-9044.

This paper explores the detection of derivation links between texts (otherwise called plagiarism, near-duplication, revision, etc.) at the document level. We evaluate the use of textual elements that implement the ideas of specificity and invariance, as well as their combination, to characterize derivatives. We built a French press corpus based on Wikinews revisions to run this evaluation. We obtain performance similar to that of the state-of-the-art method (n-gram overlap) while reducing the signature size and thus the processing costs. To ensure the verifiability and reproducibility of our results, we make our code and our corpus available to the community.
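
For readers unfamiliar with the n-gram overlap baseline mentioned above, a minimal sketch follows. It is an illustration only, not the authors' implementation: the abstract does not say which overlap measure, tokenization, or value of n the paper uses, so the containment measure, the whitespace tokenizer, the default n = 3, and the function names `word_ngrams` and `containment` are all assumptions made here.

```python
def word_ngrams(text, n=3):
    """Return the set of word n-grams of a text (assumed whitespace tokenization)."""
    tokens = text.lower().split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}


def containment(candidate, source, n=3):
    """Share of the candidate's n-grams that also occur in the source.

    Assumption: containment is used as the overlap score; the paper may
    use a different measure.
    """
    cand = word_ngrams(candidate, n)
    if not cand:
        return 0.0  # candidate too short to produce any n-gram
    return len(cand & word_ngrams(source, n)) / len(cand)


source = "the cat sat on the mat today"
candidate = "the cat sat on the rug"
score = containment(candidate, source)
```

In this sketch the signature of a document is its whole n-gram set, which grows with document length; the abstract's point is that smaller signatures built from specific and invariant descriptors can reach similar performance at lower cost.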

Keywords: Textual derivatives; detection of derivations; near-duplicates; revisions; linguistic descriptors; French corpus.


 

Creative Commons License: All the content of this journal, except where otherwise noted, is licensed under a Creative Commons License.