SciELO - Scientific Electronic Library Online

 

Polibits

On-line version ISSN 1870-9044

Abstract

GU, Yanhui; YANG, Zhenglu; NAKANO, Miyuki and KITSUREGAWA, Masaru. Exploration on Effectiveness and Efficiency of Similar Sentence Matching. Polibits [online]. 2013, n.47, pp.23-29. ISSN 1870-9044.

Similar sentence matching is an essential issue for many applications, such as text summarization, image extraction, social media retrieval, and question-answer models. A number of studies have investigated this issue in recent years. Most such techniques focus on effectiveness, and only a few address efficiency. In this paper, we address both effectiveness and efficiency in sentence similarity matching. For a given sentence collection, we determine how to effectively and efficiently identify the top-k sentences that are semantically most similar to a query. To achieve this goal, we first study several representative sentence similarity measurement strategies, from which we select the best-performing ones through cross-validation and dynamic weight tuning. The experimental evaluation demonstrates the effectiveness of our strategy. Moreover, on the efficiency side, we introduce several optimization techniques to improve the performance of the similarity computation. The trade-off between effectiveness and efficiency is further explored through extensive experiments.
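As a rough illustration of the top-k task described in the abstract, the sketch below scores every sentence in a collection against a query with a weighted combination of two similarity measures and keeps the k best. The abstract does not name the measures, weights, or optimizations used in the paper, so everything here is an assumption made for illustration: word-level Jaccard overlap and character-bigram Dice stand in for the paper's measures, the weights are fixed rather than tuned by cross-validation, and the function names are hypothetical.

```python
import heapq


def jaccard(a, b):
    """Word-set Jaccard overlap (illustrative stand-in measure)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def bigram_dice(a, b):
    """Dice coefficient over character bigrams (illustrative stand-in measure)."""
    def grams(s):
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    ga, gb = grams(a), grams(b)
    if not ga and not gb:
        return 1.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def combined_similarity(a, b, weights=(0.5, 0.5)):
    """Weighted sum of the two measures; the weights are assumed, not tuned."""
    w_word, w_char = weights
    return w_word * jaccard(a, b) + w_char * bigram_dice(a, b)


def top_k_similar(query, sentences, k=2, weights=(0.5, 0.5)):
    """Return the k (score, sentence) pairs with the highest combined score."""
    scored = ((combined_similarity(query, s, weights), s) for s in sentences)
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    collection = [
        "the cat sat on the mat",
        "a cat sat on a mat",
        "stock prices fell sharply",
    ]
    for score, sentence in top_k_similar("the cat sat on the mat", collection):
        print(f"{score:.3f}  {sentence}")
```

This baseline scans the whole collection for every query, which is the cost that efficiency-oriented optimizations aim to reduce. Replacing the fixed weights with values chosen on held-out data would be one way to approximate the weight tuning the abstract mentions.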

Keywords: String matching; information retrieval; natural language processing.


Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.