Logistic Regression and Linear Discriminant Analysis in the Evaluation of Factors Associated with Stunting in Children: Divergence and Similarity of the Statistical Methods

Rutunga, L. (2014) Logistic Regression and Linear Discriminant Analysis in the Evaluation of Factors Associated with Stunting in Children: Divergence and Similarity of the Statistical Methods. Masters thesis, University of Zimbabwe.


Abstract

Background: Stunting is a well-established child health indicator of chronic malnutrition that is associated with biological, environmental and socioeconomic factors. Logistic regression and linear discriminant analysis are two statistical methods that can be used to classify subjects as either stunted or not stunted based on all or a subset of measured predictor variables. The predictive accuracy of the two methods was compared with respect to several attributes of each method.

Methods: Data for the study were extracted from the Zvitambo trial data set. Multivariable logistic regression and linear discriminant models were fitted using 20 bootstrap samples for cross validation of the coefficients. The two models were compared with respect to the variables selected, the sign and magnitude of the coefficients, sensitivity, specificity, overall classification rate and area under the ROC curve. The two methods were also applied in combination to check whether predictive accuracy would improve.

Results: Logistic regression and linear discriminant analysis had essentially the same predictive accuracy, with classification rates of 78.76% and 78.86% respectively. Both methods identified two common factors, sex and birth weight. The coefficients of these two factors had the same negative sign in both models, but their magnitudes differed significantly. Both methods had low sensitivity (13.19% and 8.68%) and high specificity (97.44% and 98.24%). Combining the two methods did not improve predictive accuracy (71.5% before and 70.24% after).

Conclusion: The two multivariable techniques tend to converge in classification accuracy, mainly when the sample size is large (>50). When a choice must be made between the two, it is recommended to use the method whose assumptions are fulfilled.
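The comparison measures named in the abstract (sensitivity, specificity and overall classification rate) can all be read off a two-by-two table of actual against predicted status. The following is a minimal sketch of that calculation, not the thesis's own code: the function name classification_metrics and the label coding (1 = stunted, 0 = not stunted) are assumptions made for illustration, and the ten labels are made up rather than taken from the Zvitambo data.

```python
def classification_metrics(actual, predicted):
    """Return sensitivity, specificity and overall classification rate.

    Assumed coding (hypothetical): 1 = stunted, 0 = not stunted.
    """
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")

    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)  # stunted, classified stunted
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)  # stunted, missed
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)  # not stunted, classified so
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)  # not stunted, flagged

    return {
        # Share of truly stunted children that the model identifies.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of non-stunted children that the model identifies.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
        # Share of all children classified correctly.
        "classification_rate": (tp + tn) / len(pairs) if pairs else float("nan"),
    }


# Made-up labels for illustration only; not data from the thesis.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]

metrics = classification_metrics(actual, predicted)
print(metrics)
```

Because the overall classification rate is a weighted average of sensitivity and specificity, a model can classify most children correctly while still missing most stunted children when the non-stunted group is the larger one.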

Item Type: Thesis (Masters)
Subjects: H Social Sciences > HA Statistics
R Medicine > RJ Pediatrics > RJ101 Child Health. Child health services
R Medicine > RZ Other systems of medicine
Divisions: Africana
Depositing User: Tim Khabala
Date Deposited: 24 Apr 2018 12:53
Last Modified: 24 Apr 2018 12:53
URI: http://thesisbank.jhia.ac.ke/id/eprint/3860
