Application of the Naive Bayes Algorithm in Stroke Risk Classification Using Patient Clinical Data
Abstract
Stroke is a serious medical condition and one of the leading causes of death and long-term disability worldwide, including in Indonesia. Predicting stroke risk early can support prevention efforts and timely medical intervention. This study applies the Naive Bayes classification algorithm to build a stroke risk prediction model. The dataset is 'healthcare-dataset-stroke-data', sourced from the Kaggle platform, which contains 5,110 patient records with 11 clinical and demographic attributes, such as age, gender, hypertension, heart disease, average glucose level, body mass index (BMI), and smoking status. Naive Bayes was chosen for its computational efficiency, its ability to handle high-dimensional data, and its solid performance in many medical diagnostic applications. The research process has three stages: data preprocessing, which handles missing values and discretizes continuous attributes; implementation of the Naive Bayes algorithm, which computes the prior probability of each target class (stroke and non-stroke) and the likelihood of each attribute value given the class; and classification of the test data. The results indicate that the Naive Bayes model can classify stroke risk, as assessed by the evaluation metrics discussed in the paper. Analysis of the likelihood table also confirmed that factors such as age, hypertension, and heart disease significantly influence the prediction. This study demonstrates the potential of Naive Bayes as a practical and informative initial screening tool for healthcare practitioners.
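The stages named in the abstract (discretization, priors, per-attribute likelihoods, classification) can be illustrated with a minimal sketch. This is not the authors' implementation: the age bins, the Laplace smoothing parameter `alpha`, the helper names, and the six toy records are all assumptions made for illustration, and none of the numbers come from the Kaggle dataset.

```python
from collections import Counter, defaultdict
import math


def discretize_age(age):
    """Map a continuous age to a category (hypothetical bin edges)."""
    if age < 40:
        return "young"
    if age < 60:
        return "middle"
    return "older"


def fit(rows, labels, alpha=1.0):
    """Estimate class priors and smoothed per-attribute likelihoods.

    alpha is a Laplace smoothing constant (an assumption, not from the paper).
    """
    n = len(labels)
    class_counts = Counter(labels)
    priors = {c: class_counts[c] / n for c in class_counts}

    counts = defaultdict(Counter)  # (attribute, class) -> value counts
    values = defaultdict(set)      # attribute -> set of observed values
    for row, y in zip(rows, labels):
        for attr, v in row.items():
            counts[(attr, y)][v] += 1
            values[attr].add(v)

    likelihoods = {}               # (attribute, value, class) -> P(value | class)
    for attr, vals in values.items():
        for c in class_counts:
            denom = class_counts[c] + alpha * len(vals)
            for v in vals:
                likelihoods[(attr, v, c)] = (counts[(attr, c)][v] + alpha) / denom
    return priors, likelihoods


def predict(row, priors, likelihoods):
    """Return the class with the highest log posterior score.

    Raises KeyError for attribute values never seen during fitting.
    """
    scores = {}
    for c, p in priors.items():
        score = math.log(p)
        for attr, v in row.items():
            score += math.log(likelihoods[(attr, v, c)])
        scores[c] = score
    return max(scores, key=scores.get)


# Toy records, invented for illustration only (1 = stroke, 0 = non-stroke).
rows = [
    {"age": discretize_age(67), "hypertension": 1},
    {"age": discretize_age(72), "hypertension": 1},
    {"age": discretize_age(80), "hypertension": 0},
    {"age": discretize_age(35), "hypertension": 0},
    {"age": discretize_age(45), "hypertension": 0},
    {"age": discretize_age(29), "hypertension": 0},
]
labels = [1, 1, 1, 0, 0, 0]

priors, likelihoods = fit(rows, labels)
high_risk = predict({"age": "older", "hypertension": 1}, priors, likelihoods)
low_risk = predict({"age": "young", "hypertension": 0}, priors, likelihoods)
```

Summing log probabilities rather than multiplying raw probabilities avoids numerical underflow when many attributes are used, and the `likelihoods` dictionary plays the role of the likelihood table the abstract refers to: entries can be compared across classes to see which attribute values shift the prediction.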