STATUS: Completed Project
To achieve a sustainable data network infrastructure, promote interoperability, and foster the creation of a Learning Health System (LHS), there is a need to map and transform data across various Common Data Models (CDMs) and leverage open-source standards. By mapping various CDM data elements and leveraging existing Patient-Centered Outcomes Research Trust Fund (PCORTF) investments, it is feasible to reuse the data, methods, and other resources from each network, thereby providing PCOR researchers with access to larger and more diverse types of observational data.
PROJECT PURPOSE & GOALS
This project was a collaborative effort among the FDA, NCI, NIH/NCATS, ONC, and the NLM. The project’s goal was to build data infrastructure for conducting patient-centered outcomes research (PCOR) using observational data derived from the delivery of health care in routine clinical settings. The sources of these data include, but are not limited to, insurance billing claims, electronic health records (EHRs), and patient registries. Each network uses a CDM to organize its data into a standard structure, but that structure may differ across networks. This project harmonized several existing CDMs to support research and analyses across multiple data networks, advancing the utility and interoperability of data across networks to facilitate PCOR. The enhanced data infrastructure created through this project has the capacity to support evidence generation on patient-centered outcomes that can inform regulatory and clinical decision making within federal programs.
Develop a common data architecture as the intermediary between four CDMs within four networks: Sentinel, PCORnet, i2b2, and OHDSI.
Develop a flexible data model that can be used to create outbound data in multiple formats for multiple purposes.
Test the common data architecture by using it to study factors associated with the safety and effectiveness of newly approved oncology drugs that boost patients’ immune response to cancer. These drugs, known broadly as immune checkpoint inhibitors, are gaining approvals in a number of different indications, but their safety in routine clinical care is unclear, as is how their effectiveness may vary across patient subpopulations or in combination with agents used for comorbid conditions, such as those that treat autoimmune disorders. In this 2-year project, the team focused on three agents in the programmed cell death protein 1 (PD1)/programmed death-ligand 1 (PDL1) class of oncology drugs, with a focus on patients who have both cancer and an autoimmune condition. To validate this specific use case, the statistical tests and methods in the Sentinel and OHDSI libraries were applied to the mapped CDMs.
Establish methods and develop processes, policies, and governance for ongoing curation, maintenance, and sustainability of the common data architecture, building upon existing resources, standards, and tools. Examples of existing resources include, but are not limited to, the Data Access Framework (DAF) developed by ONC to interface with various CDMs and the NIH Common Data Element Repository to register the harmonized, standardized data elements within each CDM.
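The idea of a common data architecture acting as an intermediary can be illustrated with a minimal Python sketch. With four CDMs, direct pairwise translation would need twelve directed mappings, whereas routing through a shared intermediate model needs only one mapping per CDM when each mapping is a simple one-to-one rename that can be inverted. The field names and the `TO_COMMON`, `to_common`, `from_common`, and `convert` names below are hypothetical simplifications chosen for illustration. They are not taken from the project's published mappings, which also address differences in structure and value sets that plain field renaming does not capture.

```python
from typing import Any, Dict

Record = Dict[str, Any]

# Hypothetical field mappings: native CDM field name -> common (hub) field name.
# These names are illustrative assumptions, not the project's actual mappings.
TO_COMMON: Dict[str, Dict[str, str]] = {
    "sentinel": {"PatID": "patient_id", "Sex": "sex", "Birth_Date": "birth_date"},
    "pcornet": {"PATID": "patient_id", "SEX": "sex", "BIRTH_DATE": "birth_date"},
    "i2b2": {"patient_num": "patient_id", "sex_cd": "sex", "birth_date": "birth_date"},
    "omop": {"person_id": "patient_id", "gender": "sex", "birth_date": "birth_date"},
}


def to_common(record: Record, source: str) -> Record:
    """Rename a source-model record's fields to the common field names."""
    mapping = TO_COMMON[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}


def from_common(record: Record, target: str) -> Record:
    """Rename a common-model record's fields to the target model's names."""
    inverse = {common: native for native, common in TO_COMMON[target].items()}
    return {inverse[field]: value for field, value in record.items() if field in inverse}


def convert(record: Record, source: str, target: str) -> Record:
    """Translate between any two models by passing through the common model."""
    return from_common(to_common(record, source), target)


if __name__ == "__main__":
    row = {"PATID": "P001", "SEX": "F", "BIRTH_DATE": "1960-04-12"}
    print(convert(row, "pcornet", "sentinel"))
```

Under these assumptions, supporting a fifth model would mean adding one entry to `TO_COMMON` rather than writing a new translation for every existing model.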
PROJECT ACHIEVEMENTS & HIGHLIGHTS
- Along with leading an environmental scan of existing CDM artifacts, the FDA project team developed the oncology use case for the PCORnet 3.1 and 4.0 CDMs.
- The NIH/NCATS team surveyed the market for an existing open-source extract, transform, and load (ETL) software tool to automate the data mapping process, and prepared a report on the selection process. The NIH/NCATS team also created a “Query Builder,” a front-end interface that offers researchers a simple way to construct and issue their research questions. “Query Transformation” transforms the query into a version that is compatible with each CDM. The CDM Harmonization Results Database and Viewer receives and analyzes the results of a query in one or more of the CDM formats. To process these results, the team created a tool that exports record-level results in the Clinical Data Interchange Standards Consortium (CDISC) Study Data Tabulation Model (SDTM) format.
- The NIH/NCI team completed the metadata curation of four CDMs (Sentinel, PCORnet v4.0, OMOP, i2b2 ACT) and completed the registration of the CDMs and the Biomedical Research Integrated Domain Group (BRIDG) model in the cancer Data Standards Registry and Repository (caDSR). The final CDM packages were sent to NIH/NLM, and the common data elements (CDEs) of the four CDMs have been uploaded to the NIH CDE Repository.
- ONC and NIH teams completed the mapping of the four data models to the NIH BRIDG conceptual model, and from BRIDG to Fast Healthcare Interoperability Resources (FHIR®). Pilot testing of the Common Data Models Harmonization (CDMH) Implementation Guide (IG) has been completed. The package, including the FHIR resource extensions and the CDMH IG, has gone through the first round of Health Level Seven (HL7) balloting and is in ballot reconciliation.
- The NIH/NLM team developed a governance framework document that outlines suggested policies and practices for access to and use of the real-world data that are derived from data-sharing networks that connect CDMs.
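The query transformation step described above, in which one researcher-facing query is turned into a version compatible with each CDM, can be sketched roughly as follows. This is a minimal illustration only and does not reproduce the project's Query Builder or Query Transformation components. The `DiagnosisQuery` class, the `TARGETS` table, and the `transform` function are hypothetical names, and the table and column names are simplified assumptions about each model's layout. Differences in how each model encodes diagnosis codes are ignored here.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DiagnosisQuery:
    """A model-neutral request: find patients with a given diagnosis code."""
    code: str


# Illustrative (table, patient column, code column) per CDM.
# These are assumptions for the sketch, not the project's actual mappings.
TARGETS: Dict[str, Tuple[str, str, str]] = {
    "sentinel": ("diagnosis", "PatID", "DX"),
    "pcornet": ("DIAGNOSIS", "PATID", "DX"),
    "i2b2": ("observation_fact", "patient_num", "concept_cd"),
    "omop": ("condition_occurrence", "person_id", "condition_source_value"),
}


def transform(query: DiagnosisQuery, cdm: str) -> Tuple[str, List[str]]:
    """Render the neutral query as parameterized SQL for one CDM."""
    table, patient_col, code_col = TARGETS[cdm]
    sql = f"SELECT DISTINCT {patient_col} FROM {table} WHERE {code_col} = ?"
    # The code value is passed as a bound parameter rather than
    # interpolated into the SQL text.
    return sql, [query.code]


if __name__ == "__main__":
    question = DiagnosisQuery(code="C34.90")
    for cdm in TARGETS:
        sql, params = transform(question, cdm)
        print(f"{cdm}: {sql} {params}")
```

Keeping the researcher's question in one neutral form and deriving each model-specific query from it means the question is stated once, and each network receives a query in its own vocabulary.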
PUBLICATIONS, PRESENTATIONS, AND OTHER PUBLICLY AVAILABLE RESOURCES
- The project team produced a final project report, “Common Data Model Harmonization (CDMH) and Open Standards for Evidence Generation,” in August 2020. The report is available here: https://aspe.hhs.gov/sites/default/files/private/pdf/259016/CDMH-Final-Report-14August2020.pdf
- ONC developed a brief project summary that describes key project activities. The project summary is available here: https://www.healthit.gov/sites/default/files/page/2020-07/CDMH-Project-Summary.pdf
- The relevant data elements have been exported from the NIH/NCI database, the caDSR, and imported into the NIH CDE Repository, which provides access to structured human- and machine-readable definitions of data elements that have been recommended or required by NIH Institutes and Centers. Access the data elements here: https://cde.nlm.nih.gov/cde/search?selectedOrg=NCI&classification=PCORTF%20CDMH
- You can also browse the CDEs here: https://cdebrowser.nci.nih.gov/cdebrowserClient/cdeBrowser.html#/search
- The CDMH FHIR IG: The IG will help researchers who want to use the project work to map and translate data into FHIR format. https://build.fhir.org/ig/HL7/cdmh/
- The SDTM Export Tool: The SDTM Export Tool can help researchers who want to export record-level results from the databases in the CDISC SDTM format for analysis. https://www.cdisc.org/standards/foundational/sdtm
- The project team created several CDM mappings:
  - CDMs-to-BRIDG Mapping: The mapping shows the alignment of data elements and the existing gaps. The mappings are available here: https://github.com/cdmhproject/cdmh. The mappings are also available here: https://cbiit.github.io/bridg-model/HTML/BRIDG5.3.1/index.htm?goto=12:585.
  - CDMH/BRIDG-to-SDTM Mapping: The mapping provides the rules to export results in support of submissions to FDA. https://github.com/cdmhproject/cdmh
  - CDMH/BRIDG-to-FHIR Mapping: These mappings show the alignment and gaps between the CDMs and existing HL7 FHIR resources. http://build.fhir.org/ig/HL7/cdmh/profiles.html
- BRIDG Model Updates: The BRIDG Model was updated with CDMH Data Elements promoting implementation strategies for use of BRIDG. https://bridgmodel.nci.nih.gov/downloadmodel/bridg-releases
- A visualization site has been established at NIH/NCI depicting the BRIDG and cross-model mappings. https://vis-review-si.nci.nih.gov/
- Data Governance Framework: The data governance framework document details policies and practices for access to and use of real-world data (RWD) derived from data-sharing networks. https://cde.nlm.nih.gov/resources
- Perceived Training and Data Science Support Needs for Use of Real World Data for Clinical Research: NIH/NLM conducted a survey of NIH intramural researchers to understand their use of real-world data and their need for support in using analytical tools. https://cde.nlm.nih.gov/resources
Below is a list of ASPE-funded PCORTF projects that are related to this project:
Standardization and Querying of Data Quality Metrics and Characteristics for Electronic Health Data - Under the FDA, this project created and implemented a metadata standards data capture and querying system for: data quality and characteristics, data source and institutional characteristics, and “fitness for use.” This project targets the need to build bridges across networks and databases, so that information captured in each source can be combined and used for research.