Comparison of Machine Learning Algorithms for Customer Churn Prediction in ISP
DOI:
https://doi.org/10.59169/pentaciencias.v8i2.1831
Keywords:
customer churn; machine learning; random forest; support vector machine; customer retention
Abstract
Customer churn prediction is a strategic challenge for Internet service providers (ISPs) because of its impact on user retention and business sustainability. This study compared the performance of machine learning algorithms for churn prediction on the public Internet Service Customer Churn dataset. The research followed the CRISP-DM methodology and included exploratory analysis, imputation of missing values with KNNImputer, and training of four models: Decision Tree, Random Forest, Support Vector Machine, and Multilayer Perceptron neural network. The effect of the SMOTE, ADASYN, and Borderline-SMOTE balancing techniques on predictive performance was also evaluated. Random Forest without balancing achieved the best performance on the original dataset, with an accuracy of 91.16%, precision of 94.13%, recall of 89.64%, F1 score of 91.83%, and ROC-AUC of 97.09%. The balancing techniques did not yield consistent improvements across the evaluated scenarios, which indicates that their effectiveness depends on the characteristics of the data and the algorithm used. These findings provide empirical evidence on the comparative effectiveness of the evaluated models and offer a reproducible methodological reference for future research.
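As a reading aid, the reported F1 score can be checked against the reported precision and recall, since F1 is their harmonic mean. The sketch below is not the authors' code; the function name `f1_score` and the variable names are illustrative assumptions, and only the two percentages are taken from the abstract.

```python
# Minimal sketch (not the study's code): recompute F1 from the precision
# and recall reported for Random Forest without balancing.

def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (both in [0, 1])."""
    if precision + recall == 0:
        return 0.0  # convention: F1 is 0 when both inputs are 0
    return 2 * precision * recall / (precision + recall)

# Values from the abstract, expressed as fractions.
precision = 0.9413
recall = 0.8964

f1 = f1_score(precision, recall)
print(f"F1 = {f1 * 100:.2f}%")  # prints "F1 = 91.83%"
```

The recomputed value agrees with the 91.83% stated in the abstract, so the three reported figures are mutually consistent.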
References
Arshimny, F. Z. & Adiwijaya. (2024). Performance Analysis of Random Forest Algorithm for Customer Churn Prediction in the Telecommunications Sector. 2024 International Conference on Intelligent Cybernetics Technology and Applications, ICICyTA 2024, 1262-1267. https://doi.org/10.1109/ICICYTA64807.2024.10912859
Barsotti, A., Gianini, G., Mio, C., Lin, J., Babbar, H., Singh, A., Taher, F., & Damiani, E. (2024). A Decade of Churn Prediction Techniques in the TelCo Domain: A Survey. SN Computer Science, 5(4), 1-15. https://doi.org/10.1007/S42979-024-02722-7
Chang, V., Hall, K., Xu, Q. A., Amao, F. O., Ganatra, M. A., & Benson, V. (2024). Prediction of Customer Churn Behavior in the Telecommunication Industry Using Machine Learning Models. Algorithms, 17(6), 231. https://doi.org/10.3390/A17060231
Chawla, N. V., Bowyer, K. W., Hall, L. O., & Kegelmeyer, W. P. (2002). SMOTE: Synthetic Minority Over-sampling Technique. Journal of Artificial Intelligence Research, 16, 321-357. https://doi.org/10.1613/JAIR.953
Das, D., & Mahendher, D. S. (2024). Comparative Analysis Of Machine Learning Approaches In Predicting Telecom Customer Churn. Educational Administration: Theory and Practice, 30(5), 8185-8199. https://doi.org/10.53555/kuey.v30i5.4348
Edwine, N., Wang, W., Song, W., & Ssebuggwawo, D. (2022). Detecting the Risk of Customer Churn in Telecom Sector: A Comparative Study. Mathematical Problems in Engineering, 2022(1), 8534739. https://doi.org/10.1155/2022/8534739
Googerdchi, K. F., Asadi, S., & Jafari, S. M. (2024). Customer churn modeling in telecommunication using a novel multi-objective evolutionary clustering-based ensemble learning. PLOS ONE, 19(6), e0303881. https://doi.org/10.1371/JOURNAL.PONE.0303881
Han, H., Wang, W.-Y., & Mao, B.-H. (2005). Borderline-SMOTE: A New Over-Sampling Method in Imbalanced Data Sets Learning. Lecture Notes in Computer Science, 3644, 878-887. https://doi.org/10.1007/11538059_91
He, H., Bai, Y., Garcia, E. A., & Li, S. (2008). ADASYN: Adaptive synthetic sampling approach for imbalanced learning. 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence), 1322-1328. https://doi.org/10.1109/IJCNN.2008.4633969
Jain, H., Khunteta, A., & Shrivastav, S. P. (2021). Telecom Churn Prediction Using Seven Machine Learning Experiments integrating Features engineering and Normalization. https://doi.org/10.21203/RS.3.RS-239201/V1
Kunt, M. S. (2021). Internet Service Provider Customer Churn. https://www.kaggle.com/datasets/mehmetsabrikunt/internet-service-churn
Mendoza, K., Hurtado, J., Morocho, R., & Rivas, W. (2025). Análisis de Sentimiento y Clasificación de Texto para la Detección Automática de Acosos y Amenazas Mediante Inteligencia Artificial. Informática y Sistemas, 9(1), 82-92. https://doi.org/10.33936/ISRTIC.V9I1.7470
Mishra, A., & Reddy, U. S. (2018). A comparative study of customer churn prediction in telecom industry using ensemble based classifiers. Proceedings of the International Conference on Inventive Computing and Informatics, ICICI 2017, 721-725. https://doi.org/10.1109/ICICI.2017.8365230
Montesdeoca Espinoza, L. J., Zambrano Rojas, S. J., Pinargote Bravo, V. J., & Cedeño Valarezo, L. C. (2025). Balanceo de Conjuntos de Datos Basado en Redes Generativas Aplicado a Imágenes del Sector Agrícola. Informática y Sistemas, 9(2), 164-176. https://doi.org/10.33936/ISRTIC.V9I2.7782
Nhu, N. Y., Van Ly, T., & Truong Son, D. V. (2022). Churn prediction in telecommunication industry using kernel Support Vector Machines. PLOS ONE, 17(5), e0267935. https://doi.org/10.1371/JOURNAL.PONE.0267935
Nurtriana, A., Rachmawati, D. D., Artiyasa, M., & Sidiq, D. S. Z. (2024). Churn prediction analysis of telecom customers using svm, random forest and logistic regression models using orange data mining tools. E3S Web of Conferences, 501, 02012. https://doi.org/10.1051/E3SCONF/202450102012
Plotnikova, V., Dumas, M., & Milani, F. (2020). Adaptations of data mining methodologies: A systematic literature review. PeerJ Computer Science, 6, e267. https://doi.org/10.7717/peerj-cs.267
Poudel, S. S., Pokharel, S., & Timilsina, M. (2024). Explaining customer churn prediction in telecom industry using tabular machine learning models. Machine Learning with Applications, 17, 100567. https://doi.org/10.1016/J.MLWA.2024.100567
Shearer, C. (2000). The CRISP-DM Model: The New Blueprint for Data Mining. Journal of Data Warehousing, 5.
Sikri, A., Jameel, R., Idrees, S. M., & Kaur, H. (2024). Enhancing customer retention in telecom industry with machine learning driven churn prediction. Scientific Reports, 14(1), 1-13. https://doi.org/10.1038/S41598-024-63750-0
Wagh, S. K., Andhale, A. A., Wagh, K. S., Pansare, J. R., Ambadekar, S. P., & Gawande, S. H. (2024). Customer churn prediction in telecom sector using machine learning techniques. Results in Control and Optimization, 14, 100342. https://doi.org/10.1016/J.RICO.2023.100342
Wirth, R., & Hipp, J. (2000). CRISP-DM: Towards a Standard Process Model for Data Mining.
License
Copyright (c) 2026 Revista Científica Arbitrada Multidisciplinaria PENTACIENCIAS - ISSN 2806-5794.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

