Handling of derived imbalanced dataset using XGBoost for identification of pulmonary embolism —a non-cardiac cause of cardiac arrest

Abstract

This paper presents the relationship between pulmonary embolism and heart failure. The proposed research is divided into two phases. The first phase establishes a novel database, derived from the Cleveland cardiology database, in order to link pulmonary embolism and heart failure. The link is based on the relationship between stroke volume and pulse pressure (Pp < 25% of ap_hi). The second phase applies machine learning to the novel database. The database formed in this work is imbalanced, which leads to overfitting; XGBoost is used to address this problem. Efficiency is increased by formulating an ensemble technique that combines extreme learning machines, IB3 tree, logistic regression, and averaged neural network (avNNet) models.
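The pulse-pressure criterion and the resulting class imbalance can be illustrated with a minimal sketch. It is not the authors' pipeline. It assumes that pulse pressure is systolic minus diastolic pressure, that `ap_hi` is the systolic reading, and that the diastolic reading is stored under the hypothetical name `ap_lo`. The readings are made-up illustrative values, not data from the paper. The negative-to-positive ratio computed at the end is the quantity commonly passed to XGBoost as `scale_pos_weight` for imbalanced binary problems; the abstract does not state which XGBoost settings were used.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    ap_hi: float  # systolic pressure in mmHg (name taken from the abstract)
    ap_lo: float  # diastolic pressure in mmHg (assumed field name)


def pulse_pressure(r: Reading) -> float:
    """Pulse pressure Pp = systolic - diastolic."""
    return r.ap_hi - r.ap_lo


def is_narrow_pulse_pressure(r: Reading, fraction: float = 0.25) -> bool:
    """Flag a reading when Pp < fraction * ap_hi (25% by default)."""
    return pulse_pressure(r) < fraction * r.ap_hi


def negative_to_positive_ratio(labels) -> float:
    """Ratio of negative to positive labels, a common imbalance measure."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0:
        raise ValueError("no positive labels; ratio is undefined")
    return negatives / positives


# Hypothetical readings, for illustration only.
readings = [
    Reading(120, 80),
    Reading(100, 80),
    Reading(130, 85),
    Reading(90, 70),
    Reading(110, 70),
]

labels = [is_narrow_pulse_pressure(r) for r in readings]
ratio = negative_to_positive_ratio(labels)

print(labels)
print(ratio)
```

In a real derived dataset the flagged class would typically be much rarer than in this toy list, which is why a class-weighting term or a resampling step is usually considered before training.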
Source: Medical and Biological Engineering and Computing | Category: Biomedical Engineering | Source Type: research