Sensors, Vol. 23, Pages 1161: Efficient Data-Driven Machine Learning Models for Cardiovascular Diseases Risk Prediction

In this study, a supervised ML-based methodology is presented through which we aim to design efficient prediction models for CVD manifestation, highlighting the SMOTE technique’s superiority. Detailed analysis and understanding of risk factors are shown to explore their importance and contribution to CVD prediction. These factors are fed as input features to a plethora of ML models, which are trained and tested to identify the most appropriate for our objective under a binary classification problem with a uniform class probability distribution. Various ML models were evaluated after the use or non-use of Synthetic Minority Oversampling Technique (SMOTE), and comparing them in terms of Accuracy, Recall, Precision and an Area Under the Curve (AUC). The experiment results showed that the Stacking ensemble model after SMOTE with 10-fold cross-validation prevailed over the other ones achieving an Accuracy of 87.8%, Recall of 88.3%, Precision of 88% and an AUC equal to 98.2%.
Source: Sensors - Category: Biotechnology Authors: Tags: Article Source Type: research