ENSEMBLE LEARNING-BASED PREDICTION OF STROKE RISK
Date
2024-07-12
Publisher
Hawassa University
Abstract
A stroke is a potentially fatal condition that results from insufficient blood flow to part
of the brain or from the rupture of an artery. It is the leading cause of disability and the
second leading cause of death worldwide. Stroke is currently one of the most common reasons
for hospital admission in many healthcare facilities and has become a serious public
health concern in Ethiopia. Early prediction is necessary to reduce death and disability.
Because risk factors for stroke include place of residence, lifestyle, diet, temperature,
environment, and socioeconomic conditions, it is also important to investigate the risk of
stroke in different geographic areas. This study aims to predict stroke risk across the
study area using three ensemble learning models: Random Forest, XGBoost, and LightGBM.
The collected data were integrated, cleaned, and normalized, missing values were handled,
and the Synthetic Minority Over-sampling Technique (SMOTE) was used to address class
imbalance in the data before evaluation. Grid search was used to tune the models for their
best performance. The models were evaluated using accuracy, precision, recall, F1-score, and
the confusion matrix, and a correlation graph was used to capture the relationships among
the attributes. Random Forest achieved the highest accuracy at 97.6%, followed by
XGBoost at 96.1% and LightGBM at 92.9%. The study found that discontinuation of
antihypertensive drugs is the major risk factor for stroke in the study.
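
The evaluation metrics named in the abstract can all be derived from the four cells of a binary confusion matrix. The following is a minimal sketch in plain Python, not the study's code: the function names `confusion_counts` and `binary_metrics` are hypothetical, and the labels are made-up toy values (1 = stroke, 0 = no stroke), not the study's data. It illustrates why precision, recall, and F1-score are reported alongside accuracy when classes are imbalanced.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary problem."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Derive accuracy, precision, recall, and F1 from confusion-matrix cells."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Toy labels for illustration only (assumed, not taken from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 1]

counts = confusion_counts(y_true, y_pred)
metrics = binary_metrics(*counts)
print(counts)
print(metrics)
```

In this toy case the accuracy is higher than the recall because the majority class dominates the count of correct predictions, which is the situation that motivates balancing the classes before training.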
Keywords
Stroke, Stroke Risk Prediction, Ensemble learning, Random Forest, XGBoost, LightGBM
