Journal of Information Science

Design and development of a concept-based multi-document summarization system for research abstracts

Shiyan Ou

Division of Information Studies, School of Communication and Information, Nanyang Technological University, Singapore, Shiyan.Ou{at}wlv.ac.uk

Christopher Soo-Guan Khoo

Division of Information Studies, School of Communication and Information, Nanyang Technological University, Singapore

Dion H. Goh

Division of Information Studies, School of Communication and Information, Nanyang Technological University, Singapore

This paper describes a new concept-based multi-document summarization system that employs discourse parsing, information extraction and information integration. Dissertation abstracts in the field of sociology were selected as sample documents for this study. The summarization process includes four major steps: (1) parsing dissertation abstracts into five standard sections; (2) extracting research concepts (often operationalized as research variables) and their relationships, the research methods used and the contextual relations from specific sections of the text; (3) integrating similar concepts and relationships across different abstracts; and (4) combining and organizing the different kinds of information using a variable-based framework, and presenting them in an interactive web-based interface. The accuracy of each summarization step was evaluated by comparing the system-generated output against human coding. The user evaluation carried out in the study indicated that the majority of subjects (70%) preferred the concept-based summaries generated using the system to the sentence-based summaries generated using traditional sentence extraction techniques.
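
As a rough illustration only, the extraction, integration and organization steps named in the abstract can be sketched in a few lines of Python. This is not the authors' system: the cue-phrase pattern `RELATION`, the helper names and the sample sentences are all hypothetical, the section-parsing step is omitted, and "integration" is reduced to matching concepts after simple normalization.

```python
import re
from collections import defaultdict

# Hypothetical cue pattern for one kind of relationship statement
# ("effect/influence/impact of X on Y"); not taken from the paper.
RELATION = re.compile(
    r"(?:effect|influence|impact) of (?P<x>[\w\s]+?) on (?P<y>[\w\s]+?)(?:[.,;]|$)",
    re.IGNORECASE,
)


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def normalize(concept):
    """Lowercase a concept and drop leading articles so variants can be merged."""
    words = concept.lower().split()
    while words and words[0] in {"the", "a", "an"}:
        words.pop(0)
    return " ".join(words)


def extract_relations(text):
    """Extraction step (toy version): return (concept, concept) pairs from one abstract."""
    pairs = []
    for sentence in split_sentences(text):
        for match in RELATION.finditer(sentence):
            pairs.append((normalize(match.group("x")), normalize(match.group("y"))))
    return pairs


def summarize(abstracts):
    """Integration and organization steps (toy version).

    Groups identical normalized relationships across abstracts and returns
    a nested dict: concept -> related concept -> sorted list of document ids.
    """
    framework = defaultdict(lambda: defaultdict(set))
    for doc_id, text in abstracts.items():
        for x, y in extract_relations(text):
            framework[x][y].add(doc_id)
    return {x: {y: sorted(ids) for y, ids in ys.items()} for x, ys in framework.items()}
```

Given two made-up abstracts, `summarize({"d1": "This study examines the effect of Parental Income on school achievement.", "d2": "We test the influence of parental income on school achievement. Results are mixed."})` groups both documents under the single relationship between "parental income" and "school achievement", which is the kind of concept-level grouping the abstract contrasts with sentence extraction.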

Key Words: discourse parsing • information extraction • information integration • multi-document summarization

This version was published on June 1, 2008

Journal of Information Science, Vol. 34, No. 3, 308-326 (2008)
DOI: 10.1177/0165551507084630

