Automatic Ontology Learning from Unstructured Amharic Text

Berhanu, Mengiste (2013) Automatic Ontology Learning from Unstructured Amharic Text. Masters thesis, Addis Ababa University.


Abstract

This research proposes a method, the Amharic ontology learner, that automatically learns (extracts) an ontology from unstructured Amharic text. The Amharic ontology learner handles the ontology learning process through three distinct layers: concept extraction, taxonomy building, and non-taxonomic relation mining. Once all potential concepts are extracted, a concept hierarchy (taxonomy) is formed, which is then supplemented with non-taxonomic relations to evolve the taxonomy into a full ontology. Different methods are used to implement the layers. The Amharic ontology learner works with both single-word and multi-word concepts, since these allow the ontology to be represented by more solid and distinctive concepts. A hierarchical agglomerative clustering method is used to build the domain taxonomy. To identify non-taxonomic relations, a linguistic method is used that treats verbal expressions as relation indicators; on top of the non-taxonomic relation mining module, a further method tries to find the most appropriate level of generalization for each relation. To test the performance of these methods in practice, the modules of the Amharic ontology learner are implemented. The method can also represent the extracted ontology in OWL using the Jena Semantic Web Framework. The Amharic ontology learner is applied to an already tagged news corpus from WALTA News Agency. The results show that the Amharic ontology learner can be used as a starting point for future research on ontologies and ontology learning from Amharic text.
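
As an illustration of the taxonomy-building layer only, the following is a minimal sketch of hierarchical agglomerative clustering over concept context vectors. It is not the thesis implementation: the cosine similarity measure, the average-link merge rule, the function names, and the toy English concepts with their co-occurrence counts are all assumptions made for this example.

```python
from itertools import combinations
from math import sqrt


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    nu = sqrt(sum(w * w for w in u.values()))
    nv = sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0


def average_link(a, b, vectors):
    """Average pairwise similarity between the members of clusters a and b."""
    total = sum(cosine(vectors[x], vectors[y]) for x in a for y in b)
    return total / (len(a) * len(b))


def agglomerate(vectors):
    """Repeatedly merge the two most similar clusters until one remains.

    Returns the merge history as nested tuples, which can be read as a
    binary concept hierarchy.
    """
    # Each cluster is a pair: (member names, subtree built so far).
    clusters = [((name,), name) for name in sorted(vectors)]
    while len(clusters) > 1:
        i, j = max(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_link(clusters[p[0]][0], clusters[p[1]][0], vectors),
        )
        (members_a, tree_a), (members_b, tree_b) = clusters[i], clusters[j]
        merged = (members_a + members_b, (tree_a, tree_b))
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters[0][1]


# Hypothetical concepts with made-up context-word counts (assumed data).
toy_vectors = {
    "bank": {"money": 3, "loan": 2},
    "credit": {"money": 2, "loan": 3},
    "river": {"water": 4, "flow": 1},
    "lake": {"water": 3, "flow": 2},
}

tree = agglomerate(toy_vectors)
print(tree)
```

In this toy run the finance-related and water-related concepts end up in separate subtrees before the final merge. A real system would derive the vectors from the tagged corpus and would still need to label the internal nodes of the hierarchy.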

Item Type: Thesis (Masters)
Uncontrolled Keywords: Ontology, Ontology learning, Concept, Taxonomy, Concept relationship
Subjects: P Language and Literature > PL Languages and literatures of Eastern Asia, Africa, Oceania
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
T Technology > T Technology (General)
Divisions: Africana
Depositing User: Selom Ghislain
Date Deposited: 20 Sep 2018 13:45
Last Modified: 20 Sep 2018 13:45
URI: http://thesisbank.jhia.ac.ke/id/eprint/5308
