the datasets and to maximize the performance of the models, including SMOTE for oversampling (Rahmadani et
al., 2025; Asnawi & Zacky, 2025) and GridSearchCV for hyperparameter optimization (Rahmadani et al., 2025).
These findings are further contextualized by other works: Purwar et al. (2023) focused on managing class
imbalance in the datasets, Kabane (2024) examined the impact of sampling methods and data leakage, and
Niu et al. (2019) highlighted the superiority of XGBoost over other models. Overall, the analysis indicates
that the proposed model is competitive, as it closely matches or even surpasses the performance of
existing approaches, and that it is robust and reliable in identifying credit card fraud.
CONCLUSION
This paper presented a credit card fraud detection system based on machine learning with XGBoost.
Transactional data were taken from the publicly available Kaggle Credit Card Fraud Detection dataset,
which contains 284,807 transactions, of which only 492 are fraudulent. The data were preprocessed to improve
model learning through normalization of numerical variables, imputation of missing values, and
PCA on the anonymized variables (V1–V28) to reduce dimensionality while preserving important patterns of
transactions. Class imbalance was addressed with methods such as scale_pos_weight, so
that the model could learn effectively from both legitimate and fraudulent transactions.
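As an illustration of the class weighting mentioned above, the following minimal sketch computes a candidate scale_pos_weight from the class counts reported for the dataset. It assumes the common heuristic of dividing the number of negative (legitimate) samples by the number of positive (fraudulent) samples; the value actually used in this study is not stated, and the helper name `negative_to_positive_ratio` is hypothetical.

```python
# Class counts reported for the Kaggle Credit Card Fraud Detection dataset.
TOTAL_TRANSACTIONS = 284_807
FRAUD_TRANSACTIONS = 492


def negative_to_positive_ratio(n_total: int, n_positive: int) -> float:
    """Return (number of negatives) / (number of positives).

    Hypothetical helper: this ratio is a commonly used starting value
    for XGBoost's scale_pos_weight, not necessarily the tuned value.
    """
    if n_positive <= 0 or n_positive > n_total:
        raise ValueError("n_positive must be between 1 and n_total")
    n_negative = n_total - n_positive
    return n_negative / n_positive


weight = negative_to_positive_ratio(TOTAL_TRANSACTIONS, FRAUD_TRANSACTIONS)
fraud_rate = FRAUD_TRANSACTIONS / TOTAL_TRANSACTIONS

print(f"fraud rate: {fraud_rate:.4%}")
print(f"candidate scale_pos_weight: {weight:.2f}")
```

With these counts, fraud makes up roughly 0.17% of all transactions and the ratio is about 578. In practice the ratio would be computed on the training split only and passed to the classifier as its scale_pos_weight parameter, then treated as one more hyperparameter to tune.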
The XGBoost model was hyperparameter-tuned, then trained and tested on stratified train-test splits. Accuracy,
precision, recall, F1-score, and ROC-AUC were used as performance metrics and indicated strong predictive
performance. After 20 boosting rounds, the model achieved a validation accuracy of 94.9%, precision of 92.8%,
recall of 90.5%, and ROC-AUC of 94.7%. These findings demonstrate that the model has a high ability to identify
fraudulent transactions with minimal EFT, and the learning curves indicate robust convergence and
generalizability of the model to unseen data. The comparative analysis with previous works also showed that the
presented method is competitive and benefits from strong preprocessing and PCA-based feature reduction.
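For reference, the relationship between the metrics listed above can be sketched as follows. The F1-score is the harmonic mean of precision and recall, so the value implied by the reported precision (92.8%) and recall (90.5%) can be derived directly. The function names and the confusion-matrix counts in the second part are hypothetical and purely illustrative; they are not results from this study.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def metrics_from_counts(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f1": f1_from_precision_recall(precision, recall),
    }


# F1 implied by the precision and recall reported in the text.
implied_f1 = f1_from_precision_recall(0.928, 0.905)
print(f"implied F1-score: {implied_f1:.3f}")

# Toy counts (hypothetical, NOT from the paper) to show the definitions.
toy = metrics_from_counts(tp=90, fp=10, fn=10, tn=890)
print({name: round(value, 3) for name, value in toy.items()})
```

The implied F1-score is approximately 0.916. Because fraud is so rare in this dataset, precision, recall, and F1 on the fraud class are generally more informative than accuracy alone.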
To sum up, the paper has shown that an XGBoost-based fraud detection system supported by careful
preprocessing and dimensionality reduction is a credible and viable approach to real-world financial fraud
detection. The system handles imbalanced data well, provides high prediction accuracy, and can be
incorporated into real-time transaction pipelines to detect fraudulent activity immediately. These
results underline the importance of integrating advanced machine learning with effective data engineering
to improve financial security and operational efficiency.
REFERENCES
1. Al Ali, A., Alazab, M., & Khan, S. (2023). A hybrid deep learning model for financial fraud detection
using blockchain and ensemble methods. Computers, 12(3), 78.
https://doi.org/10.3390/computers12030078
2. Alazab, M., Tang, M., & Alazab, M. (2021). Deep learning for cybersecurity and fraud detection in
financial transactions. Electronics, 10(5), 593. https://doi.org/10.3390/electronics10050593
3. Asnawi, M. F., & Zacky, M. (2025). The application of XGBoost classification for credit card fraud
detection using SMOTE. Journal of Computer Science and Engineering Technology, 15(2), 92–104.
https://journal.nacreva.com/index.php/cest/article/view/131
4. Deng, Y., Zhang, H., & Li, X. (2025). Ensemble learning for fraud detection in imbalanced financial
datasets. Journal of Intelligent & Fuzzy Systems, 39(1), 115–126. https://doi.org/10.3233/JIFS-230456
5. Kabane, S. (2024). Impact of sampling techniques and data leakage on XGBoost performance in credit
card fraud detection. arXiv Preprint, arXiv:2412.07437. https://arxiv.org/abs/2412.07437
6. Kumar, A., Sharma, R., & Singh, P. (2023). Explainable AI for financial fraud detection using XGBoost
and SHAP. Journal of Intelligent Systems, 32(1), 45–58. https://doi.org/10.1515/jisys-2022-0034
7. Kumar, R., & Singh, A. (2022). Credit card fraud detection using XGBoost and ensemble learning.
International Journal of Information Technology, 14(3), 567–574.
https://doi.org/10.1007/s41870-021-00791-4
8. Niu, X., Wang, L., & Yang, X. (2019). A comparison study of credit card fraud detection: Supervised
versus unsupervised. arXiv Preprint, arXiv:1904.10604. https://arxiv.org/abs/1904.10604