Knowledge Discovery for Effective Customer Segmentation: The Case of Ethiopian Revenue and Customs Authority

Belete, Biazen (2011) Knowledge Discovery for Effective Customer Segmentation: The Case of Ethiopian Revenue and Customs Authority. Masters thesis, Addis Ababa University.

PDF (Knowledge Discovery for Effective Customer Segmentation: The Case of Ethiopian Revenue and Customs Authority)
Belete, Biazen.pdf - Accepted Version
Restricted to Repository staff only

Abstract

Customer relationship management (CRM) is a process by which an organization maximizes customer satisfaction in order to increase loyalty and retain customers’ business over their lifetimes. Customer segmentation, the grouping of customers according to their common attributes, is a central part of CRM. Analyzing CRM data requires exploring it from different angles, which in turn calls for applying different types of data mining techniques to extract the knowledge hidden in corporate data warehouses. The aim of this study is to test the applicability of clustering and classification data mining techniques to support CRM activities at the Ethiopian Revenue and Customs Authority (ERCA), following the Cios et al. (2000) KDD process model. Customer data covering different characteristics were collected from the customs ASYCUDA database. After the necessary data preparation steps, a dataset of 46,748 records was obtained. Customers were segmented using the K-means clustering algorithm. Several clustering experiments were conducted with different numbers of clusters (K = 3, 4, 5, 6) and different seed values, and the best-performing model, obtained at K = 5, was selected; its output was used for the subsequent classification modeling. Classification models were built using the J48 decision tree and multilayer perceptron ANN algorithms, with 10-fold cross-validation and with a percentage split (70% training, 30% testing). Among these, the J48 decision tree model with the default 10-fold cross-validation performed best, with an overall accuracy of 99.95%, and was therefore selected. The results are encouraging, as very high classification accuracy was obtained.
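The clustering experiments described in the abstract (several values of K, several seed values, then picking one model) can be illustrated with a minimal sketch. It is not the thesis's implementation: the data are synthetic two-dimensional points rather than ASYCUDA records, the seed values (10, 100, 1000) are hypothetical, and the within-cluster sum of squared errors (SSE) is an assumed comparison measure, since the abstract does not say how "better performance" was judged. The sketch uses only the Python standard library.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, seed, max_iter=100):
    """Plain Lloyd-style K-means. Returns (centroids, labels, sse)."""
    rng = random.Random(seed)
    # The seed only controls which points become the initial centroids.
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: sq_dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return centroids, labels, sse


def make_toy_data(seed=0, per_group=40):
    """Synthetic stand-in data: five Gaussian blobs (assumption, not ERCA data)."""
    rng = random.Random(seed)
    centers = [(0, 0), (10, 0), (0, 10), (10, 10), (5, 5)]
    return [
        (rng.gauss(cx, 0.5), rng.gauss(cy, 0.5))
        for cx, cy in centers
        for _ in range(per_group)
    ]


points = make_toy_data()
results = {}
for k in (3, 4, 5, 6):  # the K values named in the abstract
    # For each K, keep the (sse, seed) pair with the lowest SSE.
    results[k] = min((kmeans(points, k, s)[2], s) for s in (10, 100, 1000))

for k, (sse, seed) in results.items():
    print(f"K={k}: lowest SSE={sse:.1f} (seed={seed})")
```

SSE generally decreases as K grows, so it is most useful for comparing seeds at a fixed K; choosing among different K values usually needs an additional judgment, such as how interpretable the resulting segments are.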

Item Type: Thesis (Masters)
Subjects: H Social Sciences > H Social Sciences (General)
Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science
Z Bibliography. Library Science. Information Resources > ZA Information resources
Z Bibliography. Library Science. Information Resources > ZA Information resources > ZA4050 Electronic information resources
Divisions: Africana
Depositing User: Selom Ghislain
Date Deposited: 11 Sep 2018 12:57
Last Modified: 11 Sep 2018 12:57
URI: http://thesisbank.jhia.ac.ke/id/eprint/5217
