ArA*summarizer: An Arabic text summarization system based on subtopic segmentation and using an A* algorithm for reduction

Automatic text summarization lies at the intersection of natural language processing and information retrieval. Its main objective is to automatically produce a condensed, representative form of a document. This paper presents ArA*summarizer, an automatic system for Arabic single-document summarization. The system is based on an unsupervised hybrid approach that combines statistical, cluster-based, and graph-based techniques. The main idea is to divide the text into subtopics and then select the most relevant sentences from the most relevant subtopics. Selection is performed by an A* algorithm run on a graph that represents the lexical–semantic relationships between sentences. Experiments are conducted on the Essex Arabic Summaries Corpus (EASC) using four evaluation metrics: Recall-Oriented Understudy for Gisting Evaluation (ROUGE), automatic summarization engineering (AutoSummENG), merged model graphs (MeMoG), and n-gram graph powered evaluation via regression (NPowER). The evaluation results show that the system performs well compared with existing work.
Automatic system for Arabic single-document summarization, Natural language processing, Data-driven, Graph theory, Information extraction, Text mining, Topic identification
Expert Systems, Vol. 37, No. 2, April 2020