The Clinical Data Interchange Standards Consortium (CDISC) is a global, nonprofit standards development organization whose mission is to develop and support platform-independent data standards that enable information system interoperability in biomedical research and healthcare.
CDISC standards cover study planning, data collection, tabulation, and analysis. They are widely used for regulatory study data submissions to the US Food and Drug Administration (FDA) and Japan's Pharmaceuticals and Medical Devices Agency (PMDA), are endorsed by the China Food and Drug Administration (CFDA), and are requested for use by the European Innovative Medicines Initiative (IMI).
CDISC is funded by grants and by the support of over 400 member organizations, including biotech and pharmaceutical companies, clinical research organizations, academic institutions, regulatory agencies, and healthcare organizations.
- 1 Background: Study Data Standards
- 2 Semantics: CDISC Foundational Standards
- 3 Syntax: Transport Standards
- 4 Availability/Access
- 5 Domain Information Models
- 6 References
Background: Study Data Standards
Ideal study data standards describe a standard way to exchange clinical and nonclinical research data between computer systems. These standards provide a framework for organizing study data, including templates for datasets, names for variables, and ways of doing calculations with common variables. Having standard, uniform study data enables FDA scientists to explore many new research questions by combining data from multiple studies. Data standards also help FDA receive, process, review, and archive submissions more efficiently and effectively.
Currently, the FDA supports semantic and syntactic data standards from CDISC.
Semantics: CDISC Foundational Standards
CDISC Controlled Terminology is the set of all CDISC-developed or CDISC-adopted data items used within CDISC-defined datasets. To support its data standards, CDISC collaborates with the National Cancer Institute (NCI) Enterprise Vocabulary Services (EVS). CDISC Terminology uses codes and terms from the NCI Thesaurus; it is maintained and distributed by NCI EVS.
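In practice, a controlled terminology acts as a list of permitted values for a variable. The following is a minimal sketch of that idea, not an official CDISC tool: the codelist shown is an abbreviated, illustrative stand-in (real codelists are published by NCI EVS), and the function name and sample records are hypothetical.

```python
# Illustrative, abbreviated codelist for a sex variable (assumption);
# the authoritative codelists are the ones distributed by NCI EVS.
SEX_CODELIST = {"F", "M", "U"}


def find_invalid_terms(records, variable, codelist):
    """Return (row_index, value) pairs whose value is not in the codelist."""
    return [
        (index, record.get(variable))
        for index, record in enumerate(records)
        if record.get(variable) not in codelist
    ]


# Hypothetical collected records; "Male" is not a permitted term.
records = [
    {"USUBJID": "STUDY1-001", "SEX": "F"},
    {"USUBJID": "STUDY1-002", "SEX": "Male"},
    {"USUBJID": "STUDY1-003", "SEX": "M"},
]

invalid = find_invalid_terms(records, "SEX", SEX_CODELIST)
print(invalid)  # [(1, 'Male')]
```

Checking values against a shared codelist is what allows data from different studies to be pooled without per-study recoding.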
CDISC Controlled Terminology is comprised of the following foundational standards. Each terminology is accompanied by an implementation guide and model.
Study Data Tabulation Model (SDTM)
- SDTM provides a standard for organizing and formatting data to streamline processes in collection, management, analysis, and reporting. SDTM is intended to support data aggregation and warehousing, data mining and reuse, data sharing, and data review, with the aim of improving the review and approval process. The following CDISC terminologies are both subsets of the SDTM terminology.
  - Questionnaires, Ratings, and Scales (QRS) Terminology: QRS is a vocabulary for study instruments. Each QRS instrument is a series of questions and tasks used in qualitative or quantitative assessments of a clinical concept or observation.
  - Clinical Data Acquisition Standards Harmonization (CDASH): CDASH establishes standardized data collection formats and structures that should map easily to SDTM domains. This is intended to allow traceability of submission data into the SDTM and easier data review by regulators.
Analysis Data Model (ADaM)
- ADaM provides definitions for datasets and metadata standards. These are intended to improve the efficiency of review and replication of statistical analyses, and the traceability of results to the data represented in the SDTM.
Standard for the Exchange of Nonclinical Data (SEND)
- SEND specifies a way to collect and present nonclinical data in a consistent format. It is an implementation of the SDTM standard for nonclinical studies.
- Protocol Terminology develops the semantics for research protocol entities, such as study design, eligibility criteria, and requirements from the ClinicalTrials.gov, World Health Organization (WHO), and EudraCT registries.
Coalition for Accelerating Standards and Therapies (CFAST) Therapeutic Area Standards
- NCI EVS has partnered with CDISC, the Critical Path Institute (C-Path), the FDA, TransCelerate BioPharma (TCB), and other national and international organizations to create CFAST.
Therapeutic Area (TA) Standards
- TA Standards are intended to extend CDISC standard metadata with extensions to the data collection standard CDASH, the data submission standard SDTM, and the data analysis standard ADaM, and include CDISC controlled terminology managed as a subset of NCI Thesaurus.
- TA Standards extend the Foundational Standards to represent data that pertains to specific disease areas. As of 2016, CDISC had published over 25 TA standards including but not limited to: Breast Cancer, COPD, Diabetes, Coronary Artery Disease, and Malaria.
Syntax: Transport Standards
CDISC Transport Standards are intended to enable the exchange of data from the CDISC Foundational Standards and Therapeutic Area extensions by providing the following machine-readable, platform-independent formats for research data. The data structures are in Extensible Markup Language (XML):
Operational Data Model (ODM)-XML
- Operational Data Model (ODM)-XML is a format for exchanging and archiving clinical and translational research data, along with associated metadata, administrative data, reference data, and audit information. ODM-XML facilitates the regulatory requirements of exchange of metadata and data. It has become the language of choice for representing case report form content in many electronic systems.
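To make the shape of such a file concrete, the sketch below reads a small, simplified ODM-style fragment with Python's standard library. It is illustrative only: the fragment is not a complete or schema-valid ODM document, the namespace is assumed to be the ODM 1.3 one, and all OIDs, values, and the helper name `extract_items` are invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumed ODM 1.3 namespace; other ODM versions use a different URI.
ODM_NS = {"odm": "http://www.cdisc.org/ns/odm/v1.3"}

# Simplified, hypothetical fragment (not a complete ODM file).
ODM_SAMPLE = """
<ODM xmlns="http://www.cdisc.org/ns/odm/v1.3">
  <ClinicalData StudyOID="ST.EXAMPLE" MetaDataVersionOID="MDV.1">
    <SubjectData SubjectKey="001">
      <StudyEventData StudyEventOID="SE.VISIT1">
        <FormData FormOID="FORM.VITALS">
          <ItemGroupData ItemGroupOID="IG.VS">
            <ItemData ItemOID="IT.SYSBP" Value="120"/>
            <ItemData ItemOID="IT.DIABP" Value="80"/>
          </ItemGroupData>
        </FormData>
      </StudyEventData>
    </SubjectData>
  </ClinicalData>
</ODM>
"""


def extract_items(odm_text):
    """Return (subject_key, item_oid, value) tuples from an ODM-style string."""
    root = ET.fromstring(odm_text.strip())
    rows = []
    for subject in root.iterfind(".//odm:SubjectData", ODM_NS):
        subject_key = subject.get("SubjectKey")
        for item in subject.iterfind(".//odm:ItemData", ODM_NS):
            rows.append((subject_key, item.get("ItemOID"), item.get("Value")))
    return rows


rows = extract_items(ODM_SAMPLE)
print(rows)  # [('001', 'IT.SYSBP', '120'), ('001', 'IT.DIABP', '80')]
```

The nesting of subject, study event, form, item group, and item mirrors how a case report form is organized, which is one reason the format suits case report form content.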
Clinical Trial Registry (CTR)-XML
- Clinical Trial Registry (CTR)-XML lets vendors implement tools based on a single XML file that holds the information needed to generate submissions primarily to the World Health Organization (WHO), European Medicines Agency (EMA) EudraCT Registry and United States ClinicalTrials.gov.
Study/Trial Design Model in XML (SDM-XML)
- Study/Trial Design Model in XML (SDM-XML) is an extension of ODM-XML and allows organizations to provide rigorous, machine-readable, interchangeable descriptions of the designs of their clinical studies, including treatment plans, eligibility and times and events. SDM-XML defines three key sub-modules – Structure, Workflow, and Timing – permitting various levels of detail in any representation of a clinical study’s design.
- Define-XML transmits metadata that describes any tabular dataset structure. When used with the CDISC Foundational standards, it provides the metadata for datasets using the SDTM or SEND standards and analysis datasets using ADaM.
- Dataset-XML supports the exchange of dataset data based on Define-XML metadata. Dataset-XML complements Define-XML.
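The relationship between the two, metadata that describes a dataset's structure and data that is exchanged against that description, can be sketched as follows. This is not the Define-XML or Dataset-XML format itself: the metadata is a hypothetical, simplified Python stand-in, and the variable list, field names, and `check_record` helper are assumptions made for illustration.

```python
# Hypothetical, simplified stand-in for variable-level metadata of a
# vital signs dataset; real Define-XML carries far more detail.
VS_METADATA = [
    {"name": "USUBJID", "type": "text", "required": True},
    {"name": "VSTESTCD", "type": "text", "required": True},
    {"name": "VSSTRESN", "type": "float", "required": False},
]


def check_record(record, metadata):
    """Return a list of problems found when checking a record against metadata."""
    problems = []
    known = {var["name"] for var in metadata}
    for var in metadata:
        value = record.get(var["name"])
        if value is None or value == "":
            if var["required"]:
                problems.append(f"{var['name']}: missing required value")
            continue
        if var["type"] == "float":
            try:
                float(value)
            except (TypeError, ValueError):
                problems.append(f"{var['name']}: not numeric: {value!r}")
    # Variables present in the data but absent from the metadata.
    for name in record:
        if name not in known:
            problems.append(f"{name}: not described in metadata")
    return problems


good = {"USUBJID": "S1-001", "VSTESTCD": "SYSBP", "VSSTRESN": "120"}
bad = {"USUBJID": "", "VSTESTCD": "SYSBP", "VSSTRESN": "high", "EXTRA": "x"}

print(check_record(good, VS_METADATA))  # []
print(check_record(bad, VS_METADATA))
```

Keeping the description separate from the data lets a receiver check any incoming dataset with the same generic logic.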
RDF (Resource Description Framework)
- CDISC Standards in RDF provides a representation of the CDISC Foundational standards in a model based on the Resource Description Framework (RDF). RDF provides executable, machine-readable CDISC standards from CDISC SHARE. This file format is a “linked data” view of the standards as an ontology.
The Laboratory Data Model (LAB)
- The Laboratory Data Model (LAB) provides a standard model for the acquisition and exchange of laboratory data, primarily between labs and sponsors. The LAB standard was specifically designed for the interchange of lab data acquired in clinical trials.
Availability/Access
NCI File Transfer Protocol (FTP)
- CDISC Controlled Terminology is maintained and distributed as part of the NCI Thesaurus on an NCI File Transfer Protocol (FTP) site and is available for free download in Excel, text, odm.xml, pdf, html and OWL/RDF formats.
- CDISC SHARE provides machine-readable formats of CDISC standards in ODM, JSON, RDF, XLS, and XML, as well as implementation guides for the CDISC standards and controlled terminologies. There is a REST Application Programming Interface (API) for automated, programmatic use by implementers.
- CDISC standards are open and freely available as published PDFs on their website.
Domain Information Models
BRIDG (Biomedical Research Integrated Domain Group Model)
- BRIDG is an analysis model that represents protocol-driven clinical, pre-clinical, translational, and basic research. It is intended to be used by software architects and developers to design clinical information systems (e.g., software systems and databases) that share the same semantics. BRIDG uses datatypes to “bridge” CDISC foundational standards and research and healthcare concepts found in different systems.
- CDISC is working with EHR vendors, biopharma, academic medical centers, and the FDA to discover novel methods of collecting healthcare data to be used in clinical research.
- The “Electronic Health Record to CDASH” (E2C) project showed that it is possible to extract data from Continuity of Care Documents (CCD) into CDASH. This will hopefully streamline study design and data collection in clinical research that draws on healthcare data. The project intends to extend the mappings to compare SDTM elements to data elements in FHIR (Fast Healthcare Interoperability Resources).
- This effort aims to define a new integration profile that adds RESTful services to the Retrieve Form for Data Capture (RFD) profile. The intent is to streamline the use of FHIR resources for data extraction and exchange.
Submitted by Raja Cholan