The scientific subject matter of the data and related activities such as data collection, data tabulation, data analysis, and data exchange drive which standards to implement. Implementation of standards in this guide starts with determining which data standards should be used based on the nature of the data and activities to be supported. After an applicable set or sets of data standards have been identified, it is then possible to determine how the data are collected, represented, or exchanged using the standards.
Sets of data standards in this guide are aligned with both use cases and activities. Given this, start by identifying the use case and activity to be supported, and then select the standards aligned with them. All TIG guidance, both general and detailed, should be reviewed prior to implementing standards. The following table presents use cases, activities, and corresponding sections in this guide that provide detailed instructions for implementation. Detailed instructions referenced include:
- Section 2.7, Standards for Collection, which guides development and use of case report forms (CRFs) by implementing the CDISC CDASH Model
- Section 2.8, Standards for Tabulation, which guides organization of data collected, assigned, or derived for a study by implementing SDTM
- Section 2.9, Standards for Analysis, which specifies the principles to follow in creating analysis datasets and associated metadata by implementing ADaM
- Section 2.10, Standards for Data Exchange, which supports sharing of structured data between parties and across different information systems by implementing specified standards and resources
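As an illustration only, the relationship between activities and the sections listed above can be sketched as a simple lookup. The section numbers, titles, and standards come from the list; the activity keys and the `sections_for` function are hypothetical names introduced for this sketch and are not part of any CDISC resource.

```python
# Hypothetical lookup built from the list above: activity -> (section, title, standard).
# The activity keys are illustrative labels, not controlled terminology.
SECTIONS_BY_ACTIVITY = {
    "collection": ("2.7", "Standards for Collection", "CDASH Model"),
    "tabulation": ("2.8", "Standards for Tabulation", "SDTM"),
    "analysis": ("2.9", "Standards for Analysis", "ADaM"),
    "exchange": ("2.10", "Standards for Data Exchange", "specified standards and resources"),
}


def sections_for(activities):
    """Return the guide sections to review for the given activities, in guide order."""
    requested = {activity.lower() for activity in activities}
    unknown = requested - set(SECTIONS_BY_ACTIVITY)
    if unknown:
        raise ValueError(f"No section listed for: {sorted(unknown)}")
    return [
        f"Section {number}, {title}"
        for activity, (number, title, _standard) in SECTIONS_BY_ACTIVITY.items()
        if activity in requested
    ]
```

For example, `sections_for(["tabulation", "collection"])` returns the collection and tabulation sections in the order they appear in this guide. The lookup only narrows down which detailed instructions apply; it does not replace reviewing all TIG guidance.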
To use standards for collection and tabulation, compare the nature or role of the data to the scope of a domain. Domain names provide short descriptions of intended scope and may be used to narrow down which domains to consider. A domain standard may be used when the nature of the data and the domain scope are aligned. Observations are collected using standardized collection fields when applicable and are represented as rows in tabulation datasets. Each observation is described by a series of data points, which correspond to applicable data collection fields and to variables in a tabulation dataset. A data collection field and/or tabulation variable may be used when the subject matter of a data point and the scope of the field and/or variable are aligned. The majority of data for a submission will be in scope for domains based on the General Observation Classes and for a subset of the Special Purpose domains described in the SDTM. For this reason, it is highly recommended to refer to the SDTM, and to the CDASH Model when applicable, when using domains. Doing so supports understanding of the intended scope of each domain and informs extensions and the creation of custom domains when needed.
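The correspondence between collected data points and tabulation variables described above can be sketched as follows. This is a minimal illustration under stated assumptions: the collection field names (`SUBJECT`, `TEST`, `RESULT`, `UNIT`), the mapping, and the `to_tabulation_row` function are hypothetical, and the vital-signs-style variable names are used only as familiar examples, not as guidance from this section.

```python
# Hypothetical mapping from collection fields to tabulation variables for one domain.
# Field names on the left are invented for this sketch.
FIELD_TO_VARIABLE = {
    "SUBJECT": "USUBJID",
    "TEST": "VSTESTCD",
    "RESULT": "VSORRES",
    "UNIT": "VSORRESU",
}


def to_tabulation_row(domain, collected):
    """Represent one collected observation as one tabulation row.

    Returns the row plus any collected fields that have no aligned variable,
    so out-of-scope data points are surfaced rather than silently dropped.
    """
    row = {"DOMAIN": domain}
    unmapped = []
    for field, value in collected.items():
        variable = FIELD_TO_VARIABLE.get(field)
        if variable is None:
            unmapped.append(field)
        else:
            row[variable] = value
    return row, unmapped
```

A data point that lands in `unmapped` is a prompt to check whether another variable or domain is a better fit, or whether an extension or custom domain is needed, as discussed above.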
The design of analysis datasets is generally driven by the scientific and medical objectives of the study. A fundamental principle is that the structure and content of analysis datasets must support clear, unambiguous communication of the scientific and statistical aspects of the study. The purpose of ADaM is to provide a framework that both enables analysis of the data and allows reviewers and other recipients of the data to have a clear understanding of the data's lineage from collection to analysis to results. ADaM defines the core concepts and standards for analysis data, along with their spirit and intent, and outlines the fundamental principles to follow in constructing analysis datasets and related metadata. Four types of metadata are described in ADaM: analysis dataset metadata, analysis variable metadata, analysis parameter value-level metadata, and analysis results metadata. To establish which components are required in a submission, review current relevant files provided by the agency to which the submission is being sent. Other relevant documentation might include the study protocol, the statistical analysis plan (SAP), mock shells that define desired outputs, and any dataset specifications that have been defined.
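As a rough sketch, the four metadata types named above can be tracked against whatever set of components an agency requires. The `AdamMetadata` enum and `missing_metadata` function are hypothetical names for illustration; which components are actually required must come from the agency's current files, not from this code.

```python
from enum import Enum


class AdamMetadata(Enum):
    """The four metadata types described in ADaM (labels taken from the text above)."""
    DATASET = "analysis dataset metadata"
    VARIABLE = "analysis variable metadata"
    PARAMETER_VALUE_LEVEL = "analysis parameter value-level metadata"
    RESULTS = "analysis results metadata"


def missing_metadata(required, provided):
    """List required metadata types that have not yet been provided.

    `required` is an assumption supplied by the caller after reviewing
    agency documentation; it is not defined by this sketch.
    """
    return [kind for kind in AdamMetadata if kind in required and kind not in provided]
```

For instance, if dataset and results metadata are required and only dataset metadata has been prepared, `missing_metadata` returns a list containing `AdamMetadata.RESULTS`.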
Standards for data exchange are applicable to all use cases and support sharing of standard CRFs developed using collection standards, tabulation datasets generated using tabulation standards, and analysis datasets designed using analysis standards.