TY - JOUR
T1 - DOVE-FELM
T2 - A fusion-optimized feature selection and heterogeneous ensemble learning framework for early prediction of chronic kidney disease risk
AU - Sowan, Bilal
AU - Zhang, Li
AU - Houssein, Essam H.
AU - Qattous, Hazem
AU - Azzeh, Mohammad
AU - Massad, Bayan
PY - 2025/12
Y1 - 2025/12
N2 - Chronic Kidney Disease (CKD) affects over 800 million individuals worldwide, yet existing prediction models deliver suboptimal performance on imbalanced datasets and lack clinical interpretability for early detection. This study presents DOVE-FELM, a framework that integrates a Diverse Optimization Voting Ensemble (DOVE) with a Fusion Ensemble Learning Model (FELM) for early-stage CKD risk prediction. DOVE employs consensus-based feature selection through eight heterogeneous optimization algorithms and incorporates a Symptom-Weighted and Influenced Patient Network (SWIPN) to capture patient–symptom connectivity patterns. FELM combines Random Forest and REPTree classifiers through adaptive weighting optimized via meta-level learning. The framework was validated on six imbalanced medical datasets (imbalance ratios from 1.67:1 to 29.06:1) using 10-fold stratified cross-validation. On the UCI CKD dataset, DOVE-FELM achieved 99.75% accuracy, 99.8% AUC, 99.8% sensitivity, 99.8% specificity, and a Cohen’s Kappa of 0.9947. A Wilcoxon signed-rank test confirmed that the improvement over nine baseline methods is statistically significant. External validation on the Tawam Hospital CKD dataset (imbalance ratio 7.77:1) yielded an accuracy of 95.19%. Cross-disease validation on thyroid disease, thoracic surgery, cervical cancer, and AIDS clinical trials datasets demonstrated consistent performance (96.80–99.71% accuracy). Reducing feature dimensionality by 70.8% (24 → 7 biomarkers) enhanced clinical interpretability. DOVE-FELM advances computational frameworks for early chronic disease prediction through the integration of optimization-based feature selection with clinical domain knowledge. It shows strong potential for application in population-scale screening programs.
AB - Chronic Kidney Disease (CKD) affects over 800 million individuals worldwide, yet existing prediction models deliver suboptimal performance on imbalanced datasets and lack clinical interpretability for early detection. This study presents DOVE-FELM, a framework that integrates a Diverse Optimization Voting Ensemble (DOVE) with a Fusion Ensemble Learning Model (FELM) for early-stage CKD risk prediction. DOVE employs consensus-based feature selection through eight heterogeneous optimization algorithms and incorporates a Symptom-Weighted and Influenced Patient Network (SWIPN) to capture patient–symptom connectivity patterns. FELM combines Random Forest and REPTree classifiers through adaptive weighting optimized via meta-level learning. The framework was validated on six imbalanced medical datasets (imbalance ratios from 1.67:1 to 29.06:1) using 10-fold stratified cross-validation. On the UCI CKD dataset, DOVE-FELM achieved 99.75% accuracy, 99.8% AUC, 99.8% sensitivity, 99.8% specificity, and a Cohen’s Kappa of 0.9947. A Wilcoxon signed-rank test confirmed that the improvement over nine baseline methods is statistically significant. External validation on the Tawam Hospital CKD dataset (imbalance ratio 7.77:1) yielded an accuracy of 95.19%. Cross-disease validation on thyroid disease, thoracic surgery, cervical cancer, and AIDS clinical trials datasets demonstrated consistent performance (96.80–99.71% accuracy). Reducing feature dimensionality by 70.8% (24 → 7 biomarkers) enhanced clinical interpretability. DOVE-FELM advances computational frameworks for early chronic disease prediction through the integration of optimization-based feature selection with clinical domain knowledge. It shows strong potential for application in population-scale screening programs.
U2 - 10.1016/j.array.2025.100613
DO - 10.1016/j.array.2025.100613
M3 - Article
SN - 2590-0056
VL - 28
JO - Array
JF - Array
M1 - 100613
ER -