Evaluation of data imbalance algorithms on the prediction of credit card fraud

Godlove Otoo, Justice Kwame Appati, Winfred Yaokumah, Michael Agbo Tettey Soli, Stephane Jnr Nwolley, Julius Yaw Ludu

Research output: Contribution to journal › Article › peer-review

Abstract

Credit card fraud has been rising for some years since the introduction of card payment systems. Computational methods have been proposed to curb this menace. Unfortunately, the data available for such studies is highly skewed, resulting in the data imbalance problem. In this study, the authors investigate the performance of selected data imbalance algorithms employed in the prediction of credit card fraud. A dataset from Kaggle containing 284,315 genuine transactions and 492 fraudulent transactions was used for the evaluation. The machine learning algorithms deployed for the study are logistic regression, naïve Bayes, and k-nearest neighbour, with F1 score and precision-recall area under the curve (PR AUC) as the metrics. Numerical assessment of the adopted algorithm gave scores of 82.5% and 81%, respectively, using the neighbourhood cleaning rule for undersampling.
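To illustrate why the abstract reports F1 score and PR AUC rather than plain accuracy, the following is a minimal sketch built only on the class counts quoted above. It is not the authors' code: the function names `f1_score` and `average_precision` are hypothetical helpers, and average precision is assumed here as one common step-wise estimate of PR AUC, which may differ from the estimator used in the study.

```python
# Class counts quoted in the abstract.
GENUINE = 284_315
FRAUD = 492


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when there are no true positives."""
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_precision(labels, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    labels: 1 for fraud, 0 for genuine. scores: higher means more suspicious.
    """
    positives = sum(labels)
    if positives == 0:
        return 0.0
    # Rank transactions from most to least suspicious.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            total += hits / rank  # precision at this recall step
    return total / positives


total_transactions = GENUINE + FRAUD
fraud_rate = FRAUD / total_transactions

# A trivial model that labels every transaction as genuine.
baseline_accuracy = GENUINE / total_transactions
baseline_f1 = f1_score(tp=0, fp=0, fn=FRAUD)

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline F1:       {baseline_f1:.1f}")
```

Under these counts, a model that never flags fraud is right on almost every transaction yet has an F1 score of zero, which is the motivation for metrics that focus on the minority class.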

Original language: English
Journal: International Journal of Intelligent Information Technologies
Volume: 17
Issue number: 4
Publication status: Published - 1 Oct 2021

Keywords

  • Credit Card
  • Fraud Data
  • Logistic Regression
  • Machine Learning
  • Resampling
