Subject Area: Machine Learning
Diabetes is a global health concern with millions of new cases annually. Early detection can slow its progression and prevent complications. In this study, we developed a prediction model that uses diagnostic measurements to determine whether a patient has diabetes. Because a single algorithm or dataset may not suit every set of inputs or parameters, we explored several techniques to improve the model's accuracy. We employed logistic regression and a stacked ensemble technique, together with two feature selection methods, on two datasets: the PIMA Indians Diabetes dataset (Dataset 1) and a dataset from Enugu State University Teaching Hospital (Dataset 2). Our results show that ensemble methods achieve higher prediction accuracy than a single model. The highest accuracy achieved on Dataset 1 was 79%, while the stacked ensemble model reached 99% accuracy in predicting diabetes on Dataset 2. Our study demonstrates the benefits of combining multiple algorithms through ensemble techniques when developing accurate diabetes prediction models.
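The comparison described above, a single logistic regression model against a stacked ensemble with a feature selection step, can be illustrated with a minimal sketch. It assumes scikit-learn and runs on synthetic data rather than either study dataset. The base learners (random forest, k-nearest neighbours), the univariate `SelectKBest` selector, and all hyperparameters are illustrative assumptions; the abstract does not name the ensemble's components or the two feature selection methods used in the study.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for a table of diagnostic measurements (assumption:
# this is NOT the PIMA or the Enugu hospital data).
X, y = make_classification(
    n_samples=768, n_features=8, n_informative=5, n_redundant=2, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)


def with_preprocessing(model):
    """Scale the features, keep the 5 highest-scoring ones, then fit the model."""
    return make_pipeline(StandardScaler(), SelectKBest(f_classif, k=5), model)


# Single-model baseline: logistic regression on the selected features.
baseline = with_preprocessing(LogisticRegression(max_iter=1000))

# Stacked ensemble: base learners produce cross-validated predictions, and a
# logistic regression meta-learner combines them. The base learners here are
# hypothetical choices made for illustration only.
stack = with_preprocessing(
    StackingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
            ("knn", KNeighborsClassifier(n_neighbors=7)),
        ],
        final_estimator=LogisticRegression(max_iter=1000),
        cv=5,
    )
)

baseline.fit(X_train, y_train)
stack.fit(X_train, y_train)

baseline_acc = baseline.score(X_test, y_test)
stack_acc = stack.score(X_test, y_test)
print(f"logistic regression accuracy: {baseline_acc:.3f}")
print(f"stacked ensemble accuracy:    {stack_acc:.3f}")
```

Accuracies on synthetic data say nothing about the figures reported above. The sketch only shows how the two kinds of model can be trained and scored on the same held-out split so that their accuracies are directly comparable.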