Identification and analysis of misclassified work-zone crashes using text mining techniques

The objective of this study is to develop a classifier that applies text mining techniques to quickly find missed work zone (WZ) crashes using the unstructured text of crash narratives. The study used three years of crash data, from 2017 to 2019. The 2017 and 2018 data served as the training set, and the 2019 data as the test set. A unigram + bigram noisy-OR classifier was developed and shown to be an efficient and effective means of classifying work zone crashes based on key information in the crash narrative. The ad-hoc analysis of misclassified work zone crashes sheds light on when and where work zone crashes are more likely to be missed, and on plausible reasons why.
PMID: 34126276 | DOI: 10.1016/j.aap.2021.106211
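The abstract does not spell out how the noisy-OR classifier is built, so the following is only a minimal sketch of the general technique, not the study's implementation. It assumes each unigram or bigram is scored by the share of training narratives containing it that were work zone crashes, and that a narrative's score is one minus the product of the complements of those per-term probabilities. The function names, the `min_count` cutoff, and the tokenization rule are all hypothetical.

```python
import re
from collections import Counter


def ngrams(text):
    """Return the set of unigrams and bigrams found in a narrative."""
    tokens = re.findall(r"[a-z]+", text.lower())
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return set(tokens) | set(bigrams)


def train(narratives, labels, min_count=2):
    """Estimate P(work zone | term present) for each sufficiently frequent term.

    min_count is an assumed cutoff that drops terms seen in too few narratives.
    """
    seen = Counter()     # narratives containing the term
    seen_wz = Counter()  # work zone narratives containing the term
    for text, is_wz in zip(narratives, labels):
        for term in ngrams(text):
            seen[term] += 1
            if is_wz:
                seen_wz[term] += 1
    return {t: seen_wz[t] / n for t, n in seen.items() if n >= min_count}


def noisy_or_score(text, term_prob):
    """Noisy-OR combination: 1 - product over terms of (1 - p_term)."""
    miss = 1.0
    for term in ngrams(text):
        # Terms never seen in training contribute nothing (p = 0).
        miss *= 1.0 - term_prob.get(term, 0.0)
    return 1.0 - miss
```

In practice one would train on the earlier years, score each narrative from the held-out year, and review crashes whose score exceeds a chosen threshold but which are not flagged as work zone crashes in the structured data. Common words carry the base rate of work zone crashes and inflate every score, so a real pipeline would likely filter stop words or keep only strongly indicative terms; that step is omitted here.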
Source: Accident; Analysis and Prevention - Category: Accident Prevention - Source Type: research