Effectiveness of Content-Based Image Clustering Algorithms

Ambaye, Mesfin Sileshi (2007) Effectiveness of Content-Based Image Clustering Algorithms. Masters thesis, Addis Ababa University.


Retrieving a set of similar image documents requires clustering the images by their shared features. Clustered images are used by Content-Based Image Retrieval (CBIR) and querying systems, which require effective query matching in large image databases. Content-based image clustering provides a more efficient way to manage and retrieve large numbers of image documents, and it lets users browse only a particular subset of related image documents. This study focuses on validating two commonly used image clustering algorithms: hierarchical and k-means. The validation is based on a set of selected MPEG-7 image feature descriptors. The similarity input to these clustering algorithms considers both quantitative and predicate-based similarity measures. We computed two similarity measures: a total color-based similarity matrix, formed as a weighted sum of the MPEG-7 color descriptors, and a total similarity matrix, formed as a weighted sum of color, texture and shape features. The proposed metric measures the effectiveness of clustering subsets of COREL color photo images with respect to their semantic meaning. Shannon’s information theory is used to measure the cohesiveness of the image clusters. The clusters are said to be well separated when each distinct cluster is associated with a specific image semantic. Separation among clusters improves as the semantic association of images to a cluster becomes more predictable. Shannon’s entropy measure, used to assess cluster separation, also captures intra-cluster cohesiveness. When the same total color similarity matrix is input to all clustering algorithms, the best-quality clusters are formed by the hierarchical method that uses average linkage. Experimental results show that the quality of the clusters formed by k-means is not better than that of any of the three hierarchical methods.
The hierarchical method with average linkage produced clusters of three times better quality than k-means. Even when weighted texture and shape similarity measures were used in addition to total color, the average-linkage hierarchical agglomerative clustering method (HACM) remained the best method compared to k-means at forming clusters that are both semantically consistent and cohesive. A further finding is that adding texture and shape features degrades cluster quality for all hierarchical methods.
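
The abstract does not give the exact formulas, so the following is only a minimal Python sketch of the two ideas it names: combining per-descriptor similarity matrices as a weighted sum, and scoring clusters with Shannon entropy over their semantic labels. The function names (`weighted_similarity`, `cluster_entropy`, `weighted_mean_entropy`), the size-weighted averaging, and the sample weights are illustrative assumptions, not the thesis's actual implementation.

```python
import math
from collections import Counter


def weighted_similarity(matrices, weights):
    """Combine per-descriptor similarity matrices into one weighted-sum matrix.

    `matrices` is a list of square n x n lists (one per descriptor);
    `weights` holds one weight per matrix. Both are hypothetical inputs.
    """
    if len(matrices) != len(weights):
        raise ValueError("need exactly one weight per similarity matrix")
    n = len(matrices[0])
    total = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(matrices, weights):
        for i in range(n):
            for j in range(n):
                total[i][j] += weight * matrix[i][j]
    return total


def cluster_entropy(labels):
    """Shannon entropy (in bits) of the semantic labels inside one cluster.

    An entropy of zero means every image in the cluster shares one semantic
    label; higher values mean the cluster mixes several semantics.
    """
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def weighted_mean_entropy(clusters):
    """Size-weighted average entropy over all clusters (assumed aggregation).

    Lower values indicate clusters whose semantics are more predictable.
    """
    total_images = sum(len(cluster) for cluster in clusters)
    return sum(
        (len(cluster) / total_images) * cluster_entropy(cluster)
        for cluster in clusters
    )


if __name__ == "__main__":
    # Two toy 2 x 2 similarity matrices with assumed equal weights.
    color = [[1.0, 0.5], [0.5, 1.0]]
    texture = [[1.0, 0.0], [0.0, 1.0]]
    print(weighted_similarity([color, texture], [0.5, 0.5]))

    # One pure cluster and one evenly mixed cluster (made-up labels).
    print(weighted_mean_entropy([["beach", "beach"], ["beach", "horse"]]))
```

Under this sketch, a clustering whose clusters each map to a single semantic label scores zero, which corresponds to the abstract's notion of well-separated clusters with a predictable semantic association.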

Item Type: Thesis (Masters)
Subjects: Q Science > Q Science (General)
Q Science > QA Mathematics
Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Divisions: Africana
Depositing User: Selom Ghislain
Date Deposited: 27 Jun 2018 13:56
Last Modified: 27 Jun 2018 13:56
URI: http://thesisbank.jhia.ac.ke/id/eprint/6050
