Classification of Forensic Autopsy Reports through Conceptual Graph-based Document Representation Model

This study employs text categorization techniques for the classification of open narrative forensic autopsy reports. One of the key steps in text classification is document representation. In document representation, a clinical report is transformed into a format that is suitable for classification. The traditional document representation technique for text categorization is the bag-of-words (BoW) technique. In this study, the traditional BoW technique is ineffective in classifying forensic autopsy reports because it merely extracts frequent but discriminative features from clinical reports. Moreover, this technique fails to capture word inversion, as well as word-level synonymy and polysemy, when classifying autopsy reports. Hence, the BoW technique suffers from low accuracy and low robustness unless it is improved with contextual and application-specific information. To overcome the aforementioned limitations of the BoW technique, this research aims to develop an effective conceptual graph-based document representation (CGDR) technique to classify 1500 forensic autopsy reports from four (4) manners of death (MoD) and sixteen (16) causes of death (CoD). Term-based and Systematized Nomenclature of Medicine–Clinical Terms (SNOMED CT) based conceptual features were extracted and represented through graphs. These features were then used to train a two-level text classifier. The first level classifier was responsible for predicting MoD. In addition, the second level classifier ...
Source: Journal of Biomedical Informatics - Category: Information Technology Source Type: research