Minimum redundancy and maximum relevance for single and multi-document Arabic text summarization

Oufaida, Houda; Nouali, Omar; Blache, Philippe

Date

2014-12

Authors

Oufaida, Houda

Nouali, Omar

Blache, Philippe

Publisher

Elsevier

Abstract

Automatic text summarization aims to produce summaries of one or more texts using machine techniques. In this paper, we propose a novel statistical summarization system for Arabic texts. First, the system uses a clustering algorithm and an adapted discriminant analysis method, mRMR (minimum redundancy and maximum relevance), to score terms: through mRMR analysis, terms are ranked according to their discriminant and coverage power. Second, we propose a novel sentence extraction algorithm that selects sentences containing top-ranked terms while maximizing diversity. The system requires only minimal language-dependent processing: sentence splitting, tokenization and root extraction. Experimental results on the EASC and TAC 2011 MultiLingual datasets show that the proposed approach is competitive with state-of-the-art systems.
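
As a rough illustration of the term-ranking idea in the abstract, the sketch below implements the generic mRMR criterion in its difference form (relevance minus mean redundancy, both measured with mutual information). It is not the adapted method of the paper, whose details are not given in this record. The names `mutual_information`, `mrmr_rank`, `term_presence` and `labels` are hypothetical, and the sketch assumes each term is described by a 0/1 presence flag per sentence and each sentence carries a cluster id.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px, py = Counter(xs), Counter(ys)
    # p(x,y) / (p(x) p(y)) simplifies to count * n / (count_x * count_y)
    return sum((c / n) * log(c * n / (px[x] * py[y]))
               for (x, y), c in joint.items())


def mrmr_rank(term_presence, labels, k):
    """Greedily pick k terms by relevance minus mean redundancy.

    term_presence: dict mapping term -> list of 0/1 flags, one per sentence
                   (assumed representation, not taken from the paper).
    labels:        cluster id of each sentence (assumed to come from clustering).
    """
    relevance = {t: mutual_information(v, labels)
                 for t, v in term_presence.items()}
    selected = []
    candidates = set(term_presence)

    def score(term):
        if not selected:
            return relevance[term]
        redundancy = sum(mutual_information(term_presence[term], term_presence[s])
                         for s in selected) / len(selected)
        return relevance[term] - redundancy

    while candidates and len(selected) < k:
        # sorted() makes tie-breaking deterministic (alphabetical order)
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy data: 'b' duplicates 'a', while 'c' is independent of both 'a' and the labels.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
term_presence = {
    "a": [1, 1, 0, 0, 0, 0, 0, 0],
    "b": [1, 1, 0, 0, 0, 0, 0, 0],
    "c": [1, 0, 1, 0, 1, 1, 0, 0],
}
ranking = mrmr_rank(term_presence, labels, 2)  # ['a', 'c']
```

In this toy case the duplicate term 'b' is skipped in favour of 'c', even though 'b' is more relevant on its own, because its redundancy with the already selected 'a' outweighs its relevance. That is the trade-off the minimum-redundancy part of the criterion is meant to capture.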

Keywords

Arabic text summarization, Sentence extraction, mRMR, Minimum redundancy, Maximum relevance

URI

https://dl.cerist.dz/handle/CERIST/982

Collections

International Journal Papers
