DOVE-FELM: A fusion-optimized feature selection and heterogeneous ensemble learning framework for early prediction of chronic kidney disease risk

  • Bilal Sowan
  • Li Zhang
  • Essam H. Houssein
  • Hazem Qattous
  • Mohammad Azzeh
  • Bayan Massad

Research output: Contribution to journal › Article › peer-review

Abstract

Chronic Kidney Disease (CKD) affects over 800 million individuals worldwide, yet existing prediction models perform poorly on imbalanced datasets and lack the clinical interpretability needed for early detection. This study presents DOVE-FELM, a framework that integrates a Diverse Optimization Voting Ensemble (DOVE) with a Fusion Ensemble Learning Model (FELM) for early-stage CKD risk prediction. DOVE performs consensus-based feature selection across eight heterogeneous optimization algorithms and incorporates a Symptom-Weighted and Influenced Patient Network (SWIPN) to capture patient–symptom connectivity patterns. FELM combines Random Forest and REPtree classifiers through adaptive weighting optimized via meta-level learning. The framework was validated on six imbalanced medical datasets (imbalance ratios from 1.67:1 to 29.06:1) using 10-fold stratified cross-validation. On the UCI CKD dataset, DOVE-FELM achieved 99.75% accuracy, 99.8% AUC, 99.8% sensitivity, 99.8% specificity, and a Cohen’s Kappa of 0.9947. A Wilcoxon signed-rank test confirmed that the proposed model's advantage over nine baseline methods is statistically significant. External validation on the Tawam Hospital CKD dataset (imbalance ratio 7.77:1) yielded an accuracy of 95.19%. Cross-disease validation on thyroid disease, thoracic surgery, cervical cancer, and AIDS clinical trials datasets demonstrated consistent performance (96.80–99.71% accuracy). Reducing the feature set by 70.8% (from 24 to 7 biomarkers) enhanced clinical interpretability. DOVE-FELM advances computational frameworks for early chronic disease prediction by integrating optimization-based feature selection with clinical domain knowledge, and it shows strong potential for application in population-scale screening programs.
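The abstract does not specify DOVE's voting rule or how FELM learns its weights, so the following is only a minimal Python sketch of the two general ideas it names: keeping features that several selectors agree on, and blending two classifiers' probabilities with a single weight. The function names, the majority threshold, the fixed weight, and the placeholder feature names `f1` to `f5` are all assumptions made for illustration; none of them is taken from the paper. The last helper simply re-derives the reported 70.8% reduction from the 24 and 7 feature counts.

```python
from collections import Counter


def consensus_features(selections, min_votes):
    """Keep features chosen by at least `min_votes` selectors (assumed rule)."""
    # Count each feature once per selector, even if a selector lists it twice.
    votes = Counter(f for chosen in selections for f in set(chosen))
    return sorted(f for f, n in votes.items() if n >= min_votes)


def weighted_soft_vote(prob_a, prob_b, weight_a):
    """Blend two models' positive-class probabilities with one weight.

    A fixed weight stands in here for the meta-learned adaptive weighting
    that the abstract mentions but does not describe.
    """
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(prob_a, prob_b)]


def reduction_percent(n_before, n_after):
    """Percentage of features removed."""
    return 100.0 * (n_before - n_after) / n_before


# Hypothetical outputs of three selectors (the paper uses eight optimizers).
selections = [
    ["f1", "f2", "f3"],
    ["f1", "f2", "f4"],
    ["f1", "f3", "f5"],
]
selected = consensus_features(selections, min_votes=2)

# Hypothetical probabilities from two classifiers for two patients.
blended = weighted_soft_vote([0.9, 0.2], [0.7, 0.4], weight_a=0.75)

# Feature counts reported in the abstract: 24 before, 7 after.
reduction = round(reduction_percent(24, 7), 1)

print(selected, blended, reduction)
```

With a threshold of two votes, only the features picked by at least two of the three hypothetical selectors survive; raising `min_votes` makes the consensus stricter and the retained set smaller.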
Original language: English
Article number: 100613
Journal: Array
Volume: 28
Early online date: 27 Nov 2025
Publication status: Published - Dec 2025
