A feature learning approach based on XGBoost for driving assessment and risk prediction.

This study designs a feature extraction and selection framework to assess vehicle driving and predict risk levels. The framework integrates learning-based feature selection, unsupervised risk rating, and imbalanced data resampling. For each vehicle, about 1300 driving behaviour features are extracted from trajectory data, providing in-depth, multi-view measures of behaviour. To estimate the risk potential of vehicles while driving, an unsupervised data labelling scheme is proposed: based on extracted risk indicator features, vehicles are clustered into groups labelled with graded risk levels. The safe group is then under-sampled to reduce the imbalance between the risk and safe classes. Afterwards, the links between behaviour features and the corresponding risk levels are modelled using XGBoost, and key features are identified through feature importance ranking and recursive elimination. The risk levels of vehicles are predicted from the selected key features. As a case study, NGSIM trajectory data are used, in which four risk levels are clustered by Fuzzy C-means, 64 key behaviour features are identified, and an overall accuracy of 89% is achieved for behaviour-based risk prediction. Findings show that the approach is effective and reliable in identifying important features for driving assessment, and that it achieves accurate prediction of risk levels. PMID: 31154284 [PubMed - indexed for MEDLINE]
Source: Accident; Analysis and Prevention - Category: Accident Prevention - Tags: Accid Anal Prev - Source Type: research
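
The under-sampling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state which sampling scheme or ratio the authors used, so plain random under-sampling is assumed here; the function name `undersample_majority`, the `target_count` parameter, and the toy labels are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def undersample_majority(labels, majority_label, target_count, seed=0):
    """Return sorted indices that keep every non-majority sample and a
    random subset of at most `target_count` majority samples.

    Assumption: simple random under-sampling; the study's actual
    procedure is not described in the abstract.
    """
    rng = random.Random(seed)
    majority_idx = [i for i, y in enumerate(labels) if y == majority_label]
    other_idx = [i for i, y in enumerate(labels) if y != majority_label]
    # Never request more majority samples than are available.
    keep = min(target_count, len(majority_idx))
    sampled = rng.sample(majority_idx, keep)
    return sorted(other_idx + sampled)


# Toy labels (made up): 0 = safe group, 1..3 = graded risk levels.
labels = [0] * 90 + [1] * 6 + [2] * 3 + [3] * 1
kept = undersample_majority(labels, majority_label=0, target_count=10)
counts = Counter(labels[i] for i in kept)
```

The returned indices could then be used to subset both the feature matrix and the label vector before a classifier is trained, so that the safe class no longer dominates the training set.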