Exploiting Category-Specific Information for Multi-Document Summarization
A PHP Error was encountered
Message: Undefined index: id
Line Number: 218
We show that by making use of information common to document sets belonging to a common category, we can improve the quality of automatically extracted content in multi-document summaries. This simple property is widely applicable in multi-document summarization tasks, and can be encapsulated by the concept of category-specific importance (CSI). Our experiments show that CSI is a valuable metric to aid sentence selection in extractive summarization tasks. We operationalize the computation CSI of sentences through the introduction of two new features that can be computed without needing any external knowledge. We also generalize this approach, showing that when manually-curated document-to-category mappings are unavailable, performing automatic categorization of document sets also improves summarization performance. We have incorporated these features into a simple, freely available, open-source extractive summarization system, called SWING. In the recent TAC-2011 guided summarization task, SWING outperformed all other participant summarization systems as measured by automated ROUGE measures.