Afaan Oromo Automatic News Text Summarizer Based on Sentence Selection Function

Tesema, Fiseha Berhanu (2013) Afaan Oromo Automatic News Text Summarizer Based on Sentence Selection Function. Masters thesis, Addis Ababa University.


Abstract

The growth of the World Wide Web and advances in digital devices have caused an information explosion. Readers are overloaded with lengthy texts where a shorter version would suffice, and this abundance of information calls for efficient tools to handle it. An automatic text summarizer is one such tool: it shortens lengthy documents and so alleviates this problem. This work focuses on developing an efficient extractive automatic news text summarizer for Afaan Oromo through the systematic integration of several features: sentence position, keyword frequency, cue phrases, a sentence length handler, and the occurrence of numbers and of event terms such as time, date, and month in sentences. The data supporting system development (abbreviations, synonyms, stop words, suffixes, numbers, and names of times, dates, and months) were collected from both primary and secondary sources. In addition, 350 English cue phrases were collected and translated into 729 Afaan Oromo cue phrases. For validation and testing, 33 different newspaper topics were collected; 20 were used for validation and the remaining 13 for testing. A total of 110 respondents participated in preparing the validation and testing data corpus. The open-source C# version of Open Text Summarizer was selected as the tool for developing the system. The system was evaluated in seven experimental scenarios, both subjectively and objectively. The subjective evaluation focuses on the structure of the summary, such as referential integrity and non-redundancy, coherence, and informativeness. The objective evaluation uses precision, recall, and F-measure. The subjective evaluation yielded 88% informativeness, 75% referential integrity and non-redundancy, and 68% coherence. Owing to the added features and the different techniques and experiments applied in this work, the system achieved an F-measure of 87.47%, outperforming the previous work by 26.95%.
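The feature-based sentence selection and the objective evaluation described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation (which builds on the C# Open Text Summarizer): the weights, the English stop words and cue phrases, the minimum sentence length, and all function names are hypothetical stand-ins for the Afaan Oromo resources the author collected.

```python
import re

# Hypothetical weights and resources, chosen only for illustration.
# The thesis uses Afaan Oromo stop words and 729 Afaan Oromo cue phrases.
WEIGHTS = {"position": 1.0, "keyword": 1.0, "cue": 1.0, "number": 0.5}
STOP_WORDS = {"the", "a", "of", "in", "and", "is", "to"}
CUE_PHRASES = {"in conclusion", "in summary"}
MIN_WORDS = 3  # sentence length handler: very short sentences score zero


def tokenize(sentence):
    """Lowercase the sentence and return its non-stop-word tokens."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return [w for w in words if w not in STOP_WORDS]


def score_sentences(sentences):
    """Return one score per sentence as a weighted sum of simple features."""
    freq = {}
    for s in sentences:
        for w in tokenize(s):
            freq[w] = freq.get(w, 0) + 1
    max_freq = max(freq.values(), default=1)

    n = len(sentences)
    scores = []
    for i, s in enumerate(sentences):
        words = tokenize(s)
        if len(words) < MIN_WORDS:
            scores.append(0.0)
            continue
        position = (n - i) / n  # earlier sentences score higher
        keyword = sum(freq[w] for w in words) / (len(words) * max_freq)
        cue = 1.0 if any(p in s.lower() for p in CUE_PHRASES) else 0.0
        number = 1.0 if re.search(r"\d", s) else 0.0
        scores.append(
            WEIGHTS["position"] * position
            + WEIGHTS["keyword"] * keyword
            + WEIGHTS["cue"] * cue
            + WEIGHTS["number"] * number
        )
    return scores


def summarize(sentences, k):
    """Return the indices of the k best sentences, in document order."""
    scores = score_sentences(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def precision_recall_f(selected, reference):
    """Compare selected sentence indices with a reference extract."""
    selected, reference = set(selected), set(reference)
    overlap = len(selected & reference)
    precision = overlap / len(selected) if selected else 0.0
    recall = overlap / len(reference) if reference else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure
```

Calling `summarize(sentences, k)` on a list of sentences gives the indices of the extract, and `precision_recall_f(extract, reference)` compares that extract with a human-prepared reference selection. Treating the features as an unweighted sum is a simplification; the abstract does not state how the thesis combines them.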

Item Type: Thesis (Masters)
Uncontrolled Keywords: Afaan Oromo, Automatic news text summarizer, Cue Phrase, Sentence Selection Function
Subjects: P Language and Literature > PL Languages and literatures of Eastern Asia, Africa, Oceania
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Divisions: Africana
Depositing User: Selom Ghislain
Date Deposited: 01 Nov 2018 09:16
Last Modified: 01 Nov 2018 09:16
URI: http://thesisbank.jhia.ac.ke/id/eprint/7175
