TY - JOUR
T1 - A predictive model, and predictors of under-five child malaria prevalence in Ghana
T2 - How do LASSO, Ridge and Elastic net regression approaches compare?
AU - Aheto, Justice Moses K.
AU - Duah, Henry Ofori
AU - Agbadi, Pascal
AU - Nakua, Emmanuel Kweku
N1 - Publisher Copyright: © 2021 The Author(s)
PY - 2021/9
Y1 - 2021/9
AB - Malaria is among the leading causes of mortality and morbidity among children in Ghana. Identifying the predictors of malaria prevalence in children under five is therefore among the priorities of the global health agenda. In Ghana, few studies have moved from traditional statistical methods to machine learning techniques for identifying predictors of malaria prevalence. The present study therefore used machine learning techniques to select variables and build the best-fitting predictive model of malaria prevalence in Ghana. We analysed data on 2867 under-five children with malaria rapid diagnostic test (RDT) results from the 2019 Ghana Malaria Indicator Survey. LASSO, Ridge, and Elastic Net regression methods were used to select variables for the predictive models. Analyses were run in R version 4.0.2. One in four children tested positive for malaria (25.04%). The logit models built on the features selected by LASSO, Ridge, and Elastic Net contained eleven, fifteen, and thirteen features, respectively. The LASSO regression model is preferred because it has the fewest predictors and the smallest prediction error. The significant predictors of malaria among children were being older than 24 months, residing in the poorest households, being severely anaemic, residing in a household without electricity, and residing in a rural area. The predictors identified in our study deserve policy attention and interventions to strengthen malaria control efforts in Ghana. The machine learning techniques employed in our study, especially LASSO regression, could be beneficial for identifying predictors of malaria prevalence in this group of children.
KW - Elastic net
KW - Ghana
KW - LASSO
KW - Malaria
KW - RIDGE
UR - http://www.scopus.com/inward/record.url?scp=85108877703&partnerID=8YFLogxK
U2 - 10.1016/j.pmedr.2021.101475
DO - 10.1016/j.pmedr.2021.101475
M3 - Article
AN - SCOPUS:85108877703
SN - 2211-3355
VL - 23
JO - Preventive Medicine Reports
JF - Preventive Medicine Reports
M1 - 101475
ER -