Computación y Sistemas
Online version ISSN 2007-9737; print version ISSN 1405-5546
Comp. y Sist. vol. 18 no. 3, Mexico City, Jul./Sep. 2014
Regular articles
Multi-document Summarization using Tensor Decomposition
Marina Litvak and Natalia Vanetik
Shamoon College of Engineering, Beer Sheva, Israel.
Article received on 31/12/2013.
Accepted on 12/02/2014.
The problem of extractive text summarization for a collection of documents is defined as selecting a small subset of sentences such that the content and meaning of the original document set are preserved as fully as possible. In this paper we present a new model for extractive summarization that aims to produce a summary preserving as much of the information coverage of the original document set as possible. We construct a new tensor-based representation that describes the given document set in terms of its topics. We then rank the topics via tensor decomposition and compile a summary from the sentences of the highest-ranked topics.
Keywords: Tensor decomposition, multilingual multi-document summarization.
The authors are grateful to Igor Vinokur for the plugin implementation and technical support.
