Abstract
This study evaluates the performance of several commonly used chemometric and machine learning techniques, namely principal component analysis (PCA), artificial neural network (ANN), k-nearest neighbors (KNN), logistic regression discriminant analysis (LRDA), partial least squares discriminant analysis (PLSDA), support vector machine (SVM), and gradient boosted decision tree (GBDT), on HATR-FTIR data for detecting Sudan dye adulteration in palm oil. We used Icoshift for data alignment and Savitzky-Golay smoothing to enhance data quality. Cluster resolution feature selection (CRFS) selected 2.39% of the 3351 features. Using only the 80 selected features, PCA models showed a clear separation between adulterated and pure palm oil samples and an improvement in explained variance, which had not previously been observed. LRDA, PLSDA and SVM showed improved training true positive rate (TPR), accuracy (ACC) and Matthews correlation coefficient (MCC) after feature selection. KNN showed improvement in all model quality parameters after feature selection.
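For readers unfamiliar with the model quality parameters named in the abstract, the following is a minimal sketch of how the true positive rate (TPR), accuracy (ACC) and Matthews correlation coefficient (MCC) are conventionally computed from binary confusion-matrix counts. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the study; only the feature totals (80 of 3351) come from the abstract.

```python
import math


def binary_metrics(tp, fp, tn, fn):
    """Return TPR, ACC and MCC for binary confusion-matrix counts.

    Hypothetical helper for illustration; not the authors' code.
    """
    total = tp + fp + tn + fn
    # True positive rate (sensitivity): share of adulterated samples detected.
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    # Accuracy: share of all samples classified correctly.
    acc = (tp + tn) / total if total else 0.0
    # Matthews correlation coefficient; defined as 0 when the denominator is 0.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"TPR": tpr, "ACC": acc, "MCC": mcc}


# Illustrative counts only (assumed, not results from the study).
metrics = binary_metrics(tp=45, fp=5, tn=40, fn=10)

# Feature counts reported in the abstract: 80 of 3351 features retained.
N_TOTAL = 3351
N_SELECTED = 80
selected_percent = round(100 * N_SELECTED / N_TOTAL, 2)

if __name__ == "__main__":
    print(metrics)
    print(f"Selected features: {selected_percent}% of {N_TOTAL}")
```

Unlike accuracy, MCC uses all four confusion-matrix cells, so it stays informative when the adulterated and pure classes are imbalanced.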
Original language | English |
---|---|
Article number | 112433 |
Journal | Microchemical Journal |
Volume | 208 |
Publication status | Published - Jan 2025 |
Keywords
- Adulteration
- Chemometrics
- Feature Selection
- FTIR
- Machine Learning
- Palm Oil
- Sudan Dyes