
Engineering an interpretable chemometric pipeline for sugarcane wine fermentation using synthetic spectra and explainable ensembles

  • Ebenezer Aquisman Asare
  • Dickson Abdul-Wahab
  • Elsie Effah Kaufmann
  • Rafeah Wahi
  • Zainab Ngaini
  • Archibold Buah-Kwofie
  • Abdul Rashid Dickson
  • Northeastern University
  • The Council for Scientific and Industrial Research
  • University of Ghana
  • Universiti Malaysia Sarawak
  • Ghana Atomic Energy Commission

Research output: Contribution to journal › Article › peer-review

Abstract

Background: Real-time spectroscopic monitoring of sugarcane wine is becoming increasingly important for quality and safety control. However, sugarcane wine fermentation remains less studied than established fermentations such as beer or grape wine, especially from a chemometric perspective. Objective: To develop a mechanistically grounded synthetic spectral generation framework that enables interpretable machine learning for fermentation monitoring without extensive experimental calibration datasets. Methods: We integrated kinetic fermentation modelling based on extended Monod-inhibition equations with realistic spectral simulation incorporating multi-scale noise artefacts. Nine machine learning architectures (PLS, Random Forest, Gradient Boosting, DNN, CNN, LSTM, ResNet, Transformer, and Stacked Ensemble) were evaluated using SHAP-based explainability analysis and Bayesian bootstrap uncertainty quantification. Results: Tree-based models achieved exceptional performance on purely synthetic validation data (R² up to 0.997, RMSE ≈ 1.1 g/L), but all architectures collapsed when evaluated under simulated real-world conditions that introduced unmodelled matrix variability and instrument artefacts (R² between −0.01 and −1.88). The simplest model, PLS, degraded the least but still failed to reach acceptable predictive accuracy, indicating a fundamental gap between the synthetic training distribution and realistic deployment scenarios. Significance: These results show that even carefully constructed, mechanistic, noise-aware synthetic spectra cannot by themselves guarantee successful domain transfer. Synthetic data remain valuable for model prototyping, architecture screening, and interpretability analysis, but must be complemented by targeted experimental calibration and synthetic-to-real adaptation strategies. Recognising this limitation is critical for chemometric practice, as it reframes synthetic pipelines from “replacements” to “augmentation tools” for real fermentation monitoring.
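
The negative R² values reported in the Results can look surprising, so a minimal sketch of how they arise may help. It assumes the standard definition R² = 1 − SS_res/SS_tot, under which a model scores below zero whenever its squared error exceeds that of simply predicting the mean of the reference values. The concentrations and the helper names `r2_score` and `rmse` below are illustrative assumptions, not data or code from the study.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root-mean-square error, in the units of y (here g/L)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical reference concentrations in g/L (not taken from the paper).
y_true = [10.0, 20.0, 30.0, 40.0, 50.0]

# Case 1: predictions off by about 1 g/L, as on in-distribution data.
y_close = [11.0, 19.0, 31.0, 39.0, 51.0]

# Case 2: predictions carrying a constant +20 g/L bias, a crude stand-in
# for an unmodelled matrix or instrument shift at deployment.
y_shifted = [30.0, 40.0, 50.0, 60.0, 70.0]

r2_close = r2_score(y_true, y_close)      # close to 1
r2_shifted = r2_score(y_true, y_shifted)  # below 0: worse than the mean

print(f"close:   R2 = {r2_close:.3f}, RMSE = {rmse(y_true, y_close):.1f} g/L")
print(f"shifted: R2 = {r2_shifted:.3f}, RMSE = {rmse(y_true, y_shifted):.1f} g/L")
```

In the second case the predictions still track the trend perfectly, yet R² is negative because the bias alone makes the residual sum of squares larger than the total sum of squares. A strongly negative R² therefore signals a systematic mismatch between training and deployment distributions, not just additional random noise.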

Original language: English
Article number: 105742
Journal: Chemometrics and Intelligent Laboratory Systems
Volume: 274
Publication status: Published - 15 Jul 2026
