Laboratory clinical decision support (CDS) typically relies on data from the electronic health record (EHR). The implementation of a sustainable, effective laboratory CDS program requires a commitment to standardization and harmonization of key EHR data elements that are the foundation of laboratory CDS. The direct use of artificial intelligence algorithms in CDS programs will be limited unless key elements of the EHR are structured. The identification, curation, maintenance, and preprocessing steps necessary to implement robust laboratory-based algorithms must account for the heterogeneity of data present in a typical EHR.
Key points
- Most artificial intelligence algorithms rely on predictors and outcomes that are structured. The lack of standards and low data quality can limit the direct use of EHR data in algorithm development.
- Standardization and harmonization of EHR dictionaries are a foundational step in creating a sustainable laboratory CDS program.
- The development of standardized approaches to extract data from the EHR, including FHIR and OpenCDS, may enable direct use of EHR data for CDS, but only if attention is paid to data quality and structure.
Introduction
The electronic health record (EHR) has become an increasingly important part of the way that clinical care is delivered. The types and density of data collected and stored in the EHR have increased to the point where nearly the entirety of a patient’s health history may be present in the EHR. However, despite the high data content, much of the EHR clinical data are challenging to use directly in artificial intelligence (AI) algorithms because of a lack of standardization.
The reuse of EHR data for secondary purposes is typically time- and resource-intensive, in part because EHR data must be analyzed and validated before they are used in decision support algorithms. Recent studies have demonstrated secondary uses of EHR data to predict outcomes without relying on harmonized or standardized EHR data, although it remains to be seen whether these strategies can be used accurately and safely in clinical practice. The EHR provides numerous opportunities to deliver laboratory-focused clinical decision support (CDS), including AI-based approaches. However, most applications of AI require the predictor variables to be structured. Although laboratory test results would seem to be inherently structured and therefore directly suitable for use, in practice numerous areas of the laboratory order and results build require careful attention during build and maintenance to support CDS efforts.
Furthermore, the ability to deliver effective decision support necessitates not only attention to the EHR build, but also an infrastructure that supports monitoring, testing, and knowledge discovery.
EHR semantic interoperability is the ability of an EHR to provide another system, whether another EHR or an internal or external CDS algorithm, with data having an unambiguous, shared meaning.
From the standpoint of clinical predictive model development, a common consideration is whether two data elements (eg, results from two laboratory tests sharing the same name in the EHR but produced in different laboratories) are equivalent and can be treated as a single feature in the model. Likewise, models may include variables or structural features to account for related but not identical data elements. For example, in some models, it may be appropriate to include laboratory results from the same analyte measured using different analytic methods as a single feature but to also include a second variable in the model denoting the source of the test result; this way, the model might learn any differences between the methods that are relevant to the outcome of interest. However, such modeling is only feasible if the underlying data are properly structured and captured in association with key metadata.
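The single-feature-plus-source encoding described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed method: the `LabResult` record, the `encode_with_source` helper, the analyte and method labels, and the numeric values are all hypothetical and stand in for whatever harmonized fields a given EHR actually exposes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LabResult:
    analyte: str   # harmonized analyte name (hypothetical label)
    method: str    # analytic method or performing laboratory (hypothetical label)
    value: float   # numeric result, assumed already parsed


def encode_with_source(results, analyte):
    """Build feature rows for one analyte: the numeric value plus one
    indicator column per analytic method seen in the data."""
    selected = [r for r in results if r.analyte == analyte]
    for r in selected:
        # The encoding is only possible when the source metadata was captured.
        if not r.method:
            raise ValueError(f"missing method metadata for {analyte} result")
    methods = sorted({r.method for r in selected})
    columns = [f"{analyte}_value"] + [f"{analyte}_method_{m}" for m in methods]
    rows = [
        [r.value] + [1.0 if r.method == m else 0.0 for m in methods]
        for r in selected
    ]
    return rows, columns


# Made-up example data: one analyte reported by two different assays.
results = [
    LabResult("troponin", "assay_a", 0.04),
    LabResult("troponin", "assay_b", 12.0),
    LabResult("sodium", "ise", 140.0),
]
rows, columns = encode_with_source(results, "troponin")
```

In this sketch a result with no recorded method raises an error rather than being silently pooled, which mirrors the point that the approach depends on the metadata being present. In a regression setting one indicator column would usually be dropped as the reference category.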
The realization of semantic interoperability requires structuring of the underlying data and structuring of the messaging format. In terms of the messaging format, HL7 Fast Healthcare Interoperability Resources (FHIR) has emerged as a de facto standard for the interchange of clinical information. EHR vendors and health care providers have developed transformations between EHRs and the corresponding FHIR representations. However, structuring the underlying data remains a challenge for EHRs, despite the widespread availability of standardized approaches to represent EHR data.
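To make the distinction between message structure and data structure concrete, the following Python sketch reads a laboratory result from a FHIR Observation resource serialized as JSON. The field names (`resourceType`, `code.coding`, `valueQuantity`) follow the FHIR Observation resource, and `http://loinc.org` is the coding system URI for LOINC. The `extract_lab_result` helper and the example payload are illustrative assumptions, not part of any vendor interface, and the result value is made up.

```python
import json

LOINC_SYSTEM = "http://loinc.org"


def extract_lab_result(observation):
    """Pull the LOINC code, numeric value, and unit from an Observation dict.

    Returns None for any field the message does not carry, so missing
    structure in the source data stays visible to the caller.
    """
    if observation.get("resourceType") != "Observation":
        raise ValueError("not an Observation resource")
    codings = observation.get("code", {}).get("coding", [])
    loinc = next(
        (c.get("code") for c in codings if c.get("system") == LOINC_SYSTEM),
        None,
    )
    quantity = observation.get("valueQuantity", {})
    return {
        "loinc": loinc,
        "value": quantity.get("value"),
        "unit": quantity.get("unit"),
    }


# Illustrative payload: a serum/plasma sodium result with a made-up value.
example = json.loads("""
{
  "resourceType": "Observation",
  "status": "final",
  "code": {
    "coding": [
      {"system": "http://loinc.org", "code": "2951-2"}
    ]
  },
  "valueQuantity": {"value": 140, "unit": "mmol/L"}
}
""")
parsed = extract_lab_result(example)
```

A well-formed message does not guarantee well-structured data: if the local test was never mapped to a standard code, the `loinc` field in this sketch is simply `None`, and the downstream algorithm still cannot tell which analyte it is looking at.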
In this article, we review the types of EHR data that are the basis of key aspects of laboratory CDS and potential AI algorithms. In addition to laboratory results, fundamental EHR data particularly relevant to laboratory CDS include diagnoses, demographics, medical history, medications, and procedures. Because of the central role of laboratory testing in the diagnostic process, we focus in particular on the EHR data representing test orders, results, and diagnoses. We discuss the current state of standardization and the techniques for the extraction and use of laboratory and diagnosis data for AI-based CDS.