Empowering Diagnostics: An Ensemble Machine Learning Model for Early Liver Disease Detection
DOI:
https://doi.org/10.58564/IJSER.4.2.2025.314
Keywords:
Telemedicine, Artificial Intelligence (AI), Machine Learning, Stacking Classifier, Liver Diseases
Abstract
Early and accurate detection of liver disease is critical to improving patient outcomes, yet it remains challenging because clinical data are noisy and class-imbalanced. In this study, we present a robust ensemble learning framework applied to the Indian Liver Patient Dataset. The pipeline combines systematic data cleaning and normalization, which handle missing values and outliers, with the Synthetic Minority Over-sampling Technique (SMOTE), which corrects class skew. We then perform correlation-based feature reduction before training a stacking classifier that combines Random Forest, XGBoost, and ExtraTrees base learners with an ExtraTrees meta-learner. Using stratified 10-fold cross-validation on the balanced cohort (n = 792), our ensemble achieves 91.6% accuracy, a 92% F1-score, and a high area under the ROC curve, outperforming individual models and prior published approaches. These results demonstrate the potential of heterogeneous ensembles for clinical decision support in hepatology and lay the groundwork for prospective validation in diverse patient populations.
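The two preprocessing ideas named in the abstract, SMOTE-style oversampling and correlation-based feature reduction, can be illustrated with a minimal, standard-library-only sketch. This is not the authors' implementation: the function names, the neighbour count `k`, the correlation threshold of 0.9, and the toy values are all assumptions chosen for illustration, since the abstract does not state them. In practice a library such as imbalanced-learn would normally be used for SMOTE.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea).
    Assumes at least two minority rows. k and seed are illustrative."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for row in minority if row is not base]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def drop_correlated(columns, threshold=0.9):
    """Keep a feature only if its |r| with every already-kept feature
    is at most threshold (0.9 is an assumed value, not from the paper)."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Toy values for illustration only; they are not taken from the dataset.
minority = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote(minority, n_new=6, k=2)
print(len(new_rows))  # 6

columns = {
    "f1": [1, 2, 3, 4, 5],
    "f2": [2, 4, 6, 8, 10.5],  # nearly a multiple of f1
    "f3": [5, 1, 4, 2, 3],
}
print(drop_correlated(columns))  # ['f1', 'f3']
```

Each synthetic row lies on the line segment between two real minority rows, so oversampling adds no values outside the observed range. The correlation filter is greedy and order-dependent: of two highly correlated features, the one listed first is retained.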

License
Copyright (c) 2025 Abdulrahman Ahmed Jasim, Hajer Alwindawi, Layth Rafea Hazim

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.