TY - JOUR
T1 - Advanced analysis of soil pollution in southwestern Ghana using Variational Autoencoders (VAE) and positive matrix factorization (PMF)
AU - Kazapoe, Raymond Webrah
AU - Kwayisi, Daniel
AU - Alidu, Seidu
AU - Sagoe, Samuel Dzidefo
AU - Umaru, Aliyu Ohiani
AU - Amuah, Ebenezer Ebo Yahans
AU - Addai, Millicent Obeng
AU - Fynn, Obed Fiifi
N1 - Publisher Copyright: © 2025
PY - 2025/6
Y1 - 2025/6
N2 - The study combined the Positive Matrix Factorization (PMF) receptor model with the Variational Autoencoder (VAE) machine learning technique and ecological risk indices to examine the spatial distribution, sources and patterns of soil pollution in southwestern Ghana. A total of 719 soil samples were analysed for concentrations of selected Potentially Toxic Elements (PTEs). As (9.68 mg/L) and Pb (7.43 mg/L) showed elevated levels across the area, linked to mining activities. PTE concentrations decreased in the order Ba > Cr > V > Zn > Cu > Ni > As > Pb > Co. The Pearson correlation matrix outlines two main groups of PTEs: (1) moderately correlated (Ba, Cr, Cu, Ni and V) and (2) weakly correlated (As, Pb and Zn). These relationships are corroborated by the VAE, which showed a low contribution by As and a high contribution by V to all the latent dimensions. The PMF revealed three factors: Factor 1 (geogenic): Ba (77.5%), Cu (54.4%), Ni (66.4%), V (54.0%) and Cr (46.8%); Factor 2 (mixed): Co (61.6%), Pb (64.8%) and Zn (71.0%); Factor 3 (anthropogenic): As (86.7%). The degree of contamination analysis indicates that 69.03% of the samples are moderately polluted, while 15.14% and 0.28% show considerable and very high pollution, respectively. The pollution load index indicates pollution in 20% of the samples. The Potential Ecological Risk Index (RI) values indicate low pollution for most samples (97.08%) and moderate pollution for 2.92%. Integrating chemometric and machine learning techniques provides a dynamic system that can detect pollution shifts early and aid remediation efforts in highly affected areas.
AB - The study combined the Positive Matrix Factorization (PMF) receptor model with the Variational Autoencoder (VAE) machine learning technique and ecological risk indices to examine the spatial distribution, sources and patterns of soil pollution in southwestern Ghana. A total of 719 soil samples were analysed for concentrations of selected Potentially Toxic Elements (PTEs). As (9.68 mg/L) and Pb (7.43 mg/L) showed elevated levels across the area, linked to mining activities. PTE concentrations decreased in the order Ba > Cr > V > Zn > Cu > Ni > As > Pb > Co. The Pearson correlation matrix outlines two main groups of PTEs: (1) moderately correlated (Ba, Cr, Cu, Ni and V) and (2) weakly correlated (As, Pb and Zn). These relationships are corroborated by the VAE, which showed a low contribution by As and a high contribution by V to all the latent dimensions. The PMF revealed three factors: Factor 1 (geogenic): Ba (77.5%), Cu (54.4%), Ni (66.4%), V (54.0%) and Cr (46.8%); Factor 2 (mixed): Co (61.6%), Pb (64.8%) and Zn (71.0%); Factor 3 (anthropogenic): As (86.7%). The degree of contamination analysis indicates that 69.03% of the samples are moderately polluted, while 15.14% and 0.28% show considerable and very high pollution, respectively. The pollution load index indicates pollution in 20% of the samples. The Potential Ecological Risk Index (RI) values indicate low pollution for most samples (97.08%) and moderate pollution for 2.92%. Integrating chemometric and machine learning techniques provides a dynamic system that can detect pollution shifts early and aid remediation efforts in highly affected areas.
KW - Data reduction
KW - Environmental degradation
KW - Galamsey
KW - Gold mining
KW - Toxicity
UR - http://www.scopus.com/inward/record.url?scp=85217485382&partnerID=8YFLogxK
U2 - 10.1016/j.indic.2025.100627
DO - 10.1016/j.indic.2025.100627
M3 - Article
AN - SCOPUS:85217485382
SN - 2665-9727
VL - 26
JO - Environmental and Sustainability Indicators
JF - Environmental and Sustainability Indicators
M1 - 100627
ER -