Abstract
Credit card fraud has been rising for years since the introduction of card payment systems. Computational methods have been proposed to curb it. Unfortunately, the data available for such studies are highly skewed, which gives rise to the data imbalance problem. In this study, the authors investigate the performance of selected data imbalance algorithms in the prediction of credit card fraud. A dataset from Kaggle containing 284,315 genuine transactions and 492 fraudulent transactions was used for the evaluation. The machine learning algorithms deployed for the study are logistic regression, naïve Bayes, and k-nearest neighbour, with F1 score and precision-recall area under the curve (PR AUC) as the metrics. Numerical assessment of the adopted algorithms gave scores of 82.5% and 81%, respectively, using the neighbourhood cleaning rule for undersampling.
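To make the imbalance concrete, the short sketch below uses only the two class counts quoted in the abstract. It is an illustration, not the authors' pipeline: the "label everything genuine" baseline and the helper `f1_score` are assumptions introduced here to show why a metric such as F1 is more informative than accuracy on data this skewed.

```python
# Class counts quoted in the abstract.
GENUINE = 284_315
FRAUD = 492


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall; 0.0 when either is undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


total = GENUINE + FRAUD
fraud_rate = FRAUD / total

# Hypothetical baseline (not from the paper): predict "genuine" for every row.
# It never flags fraud, so it has no true or false positives.
baseline_accuracy = GENUINE / total
baseline_f1 = f1_score(tp=0, fp=0, fn=FRAUD)

print(f"fraud rate:        {fraud_rate:.4%}")
print(f"baseline accuracy: {baseline_accuracy:.4%}")
print(f"baseline F1:       {baseline_f1:.3f}")
```

Under these assumptions the trivial baseline scores very high accuracy while its F1 on the fraud class is zero, which is consistent with the study's choice of F1 and PR AUC rather than accuracy.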
| Original language | English |
|---|---|
| Journal | International Journal of Intelligent Information Technologies |
| Volume | 17 |
| Issue number | 4 |
| Publication status | Published - 1 Oct 2021 |
Keywords
- Credit Card
- Fraud Data
- Logistic Regression
- Machine Learning
- Resampling