SciELO - Scientific Electronic Library Online

 


Polibits

On-line version ISSN 1870-9044

Abstract

GU, Yanhui; YANG, Zhenglu; NAKANO, Miyuki and KITSUREGAWA, Masaru. Exploration on Effectiveness and Efficiency of Similar Sentence Matching. Polibits [online]. 2013, n.47, pp.23-29. ISSN 1870-9044.

Similar sentence matching is essential to many applications, such as text summarization, image extraction, social media retrieval, and question-answer models. A number of studies have investigated this problem in recent years, but most of these techniques focus on effectiveness and only a few address efficiency. In this paper, we address both effectiveness and efficiency in sentence similarity matching. Given a sentence collection, we determine how to effectively and efficiently identify the top-k sentences that are semantically most similar to a query. To achieve this goal, we first study several representative sentence similarity measurement strategies, from which we select the best-performing ones through cross-validation and dynamic weight tuning. The experimental evaluation demonstrates the effectiveness of our strategy. To improve efficiency, we introduce several optimization techniques that speed up the similarity computation. We further explore the trade-off between effectiveness and efficiency through extensive experiments.
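
The top-k matching task described in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the abstract does not name the similarity measures the paper uses, so word-level Jaccard and character-bigram Dice stand in for them, the weights are fixed by hand rather than tuned through cross-validation, and the function names (`jaccard`, `bigram_dice`, `combined_similarity`, `top_k_similar`) are hypothetical. The sketch scores every sentence exhaustively and does not implement the paper's optimization techniques.

```python
import heapq


def jaccard(a, b):
    """Word-overlap similarity: |A & B| / |A | B| over lowercase tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0
    return len(sa & sb) / len(sa | sb)


def bigram_dice(a, b):
    """Dice coefficient over the sets of character bigrams of each string."""
    def bigrams(s):
        s = s.lower()
        return {s[i:i + 2] for i in range(len(s) - 1)}

    ba, bb = bigrams(a), bigrams(b)
    if not ba and not bb:
        return 1.0
    return 2 * len(ba & bb) / (len(ba) + len(bb))


def combined_similarity(a, b, weights=(0.5, 0.5)):
    """Weighted sum of the two stand-in measures (weights are assumed)."""
    w_word, w_char = weights
    return w_word * jaccard(a, b) + w_char * bigram_dice(a, b)


def top_k_similar(query, sentences, k=2, weights=(0.5, 0.5)):
    """Return the k highest-scoring (score, sentence) pairs, best first."""
    scored = ((combined_similarity(query, s, weights), s) for s in sentences)
    return heapq.nlargest(k, scored)


if __name__ == "__main__":
    collection = [
        "the cat sat on the mat",
        "a cat sat on a mat",
        "stock prices fell sharply",
    ]
    for score, sentence in top_k_similar("the cat sat on the mat", collection):
        print(f"{score:.3f}  {sentence}")
```

Using `heapq.nlargest` keeps only k candidates in memory while scanning the collection, but each query still costs one similarity computation per sentence; reducing that cost is the efficiency side of the trade-off the abstract refers to.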

Keywords: String matching; information retrieval; natural language processing.


 

Creative Commons License: All content of this journal, except where otherwise noted, is licensed under a Creative Commons License.