Secondary Use of Healthcare Structured Data: The Challenge of Domain-Knowledge Based Extraction of Features.

The objectives of this paper are: 1) to propose an updated representation of data reuse in healthcare, 2) to illustrate methods and objectives of feature extraction, and 3) to discuss the place of domain-specific knowledge. MATERIAL AND METHODS: An updated representation is proposed. Then, a case study automatically identifies acute renal failure and discovers risk factors through secondary use of structured data. Finally, a literature review published by Meystre et al. is analyzed. RESULTS: 1) We propose a description of data reuse in 5 phases. Phase 1 is data preprocessing (cleansing, linkage, terminological alignment, unit conversions, deidentification), which makes it possible to construct a data warehouse. Phase 2 is feature extraction. Phase 3 is statistical and graphical mining. Phase 4 consists of expert filtering and reorganization of statistical results. Phase 5 is decision making. 2) The case study illustrates how time-dependent features can be extracted from laboratory results and drug administrations, using domain-specific knowledge. 3) Among the 200 papers cited by Meystre et al., the first and last authors were affiliated with health institutions in 74% of cases (68% for methodological papers, and 79% for applied papers). DISCUSSION: Feature extraction has a major impact on the success of data reuse. Specific knowledge-based reasoning plays an important role in feature extraction, which requires tight collaboration between computer scientists, statist...
Source: Studies in Health Technology and Informatics - Category: Information Technology Tags: Stud Health Technol Inform Source Type: research
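
The kind of time-dependent feature extraction mentioned in the case study can be sketched as follows. This is a minimal illustration only, not the authors' method: the abstract does not give the detection rule, so the 48-hour window, the 26.5 µmol/L rise threshold, the function names (`first_creatinine_rise`, `exposed_before`), the drug name `drug_A`, and all sample values are assumptions chosen for the example.

```python
from datetime import datetime, timedelta


def first_creatinine_rise(results, window=timedelta(hours=48), min_rise=26.5):
    """Return the timestamp of the first creatinine result that exceeds the
    lowest value seen in the preceding window by at least min_rise (µmol/L),
    or None if no such rise occurs. Window and threshold are illustrative
    assumptions, not values taken from the paper."""
    ordered = sorted(results)  # list of (timestamp, value) pairs
    for i, (t, value) in enumerate(ordered):
        # Values measured earlier, within the look-back window
        earlier = [v for (s, v) in ordered[:i] if t - s <= window]
        if earlier and value - min(earlier) >= min_rise:
            return t
    return None


def exposed_before(administrations, drug, onset):
    """True if the given drug was administered strictly before the onset."""
    if onset is None:
        return False
    return any(d == drug and t < onset for (t, d) in administrations)


# Hypothetical stay: three lab results and one drug administration
labs = [
    (datetime(2020, 1, 1, 8), 80.0),
    (datetime(2020, 1, 2, 8), 85.0),
    (datetime(2020, 1, 3, 8), 120.0),
]
drugs = [(datetime(2020, 1, 1, 20), "drug_A")]

onset = first_creatinine_rise(labs)
features = {
    "renal_event": onset is not None,
    "onset": onset,
    "drug_A_before_onset": exposed_before(drugs, "drug_A", onset),
}
```

The point of the sketch is that raw, timestamped rows become one row of features per stay (an event flag, an onset time, and an exposure flag ordered relative to that onset), and that the rule producing them encodes clinical knowledge rather than purely statistical reasoning.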