How To Determine Where Data Belong

The scientific subject matter of the data and related activities such as data collection, data tabulation, data analysis, and data exchange drive which standards to implement. Implementation of standards starts with determining which standards are appropriate based on the nature of the data and activities to be supported. After standards are selected, it is then possible to determine how the data are collected, represented, or exchanged using the standards.

Standards in this guide are aligned with both use cases and activities, so selection may begin with the use case and activity to be supported. Review all guidance, both higher-level and detailed, before implementing standards. For ease of use, the table below presents use cases, activities, and the corresponding sections in this guide that provide detailed instructions for implementation. The detailed instructions referenced are:

Section x.x, Standards for Collection, to guide development and use of case report forms (CRFs) by implementing the CDISC CDASH Model.
Section x.x, Standards for Tabulation, to guide organization of data collected, assigned, or derived for a study by implementing SDTM.
Section x.x, Standards for Analysis, to specify the principles to follow in the creation of analysis datasets and associated metadata by implementing ADaM.
Section x.x, Standards for Data Exchange, to support sharing of structured data between parties and across different information systems by implementing specified standards and resources.

Use case	Data Collection	Data Tabulation	Data Analysis	Data Exchange
Product Description		Section x.x, Standards for Tabulation	Section x.x, Standards for Analysis	Section x.x, Standards for Data Exchange
Nonclinical
Product Impact on Individual Health	Section x.x, Standards for Collection		Section x.x, Standards for Analysis
Product Impact on Population Health			Section x.x, Standards for Analysis

Once standards are selected based on the use case and activity, the scientific subject matter of the data, its role, and analysis needs will determine where data belong, i.e., how the data are collected, represented, or exchanged using the standards. Standards for collection and tabulation collect and represent data in groupings of logically related data called domains. Domains are aligned between collection and tabulation standards to facilitate the transition of collected observations to their representation in tabulation datasets. Standards for analysis are based on analysis requirements with the structure of tabulation datasets facilitating the generation of analysis datasets.

To use standards for collection and tabulation, compare the nature or role of the data to the scope of a domain. A domain standard may be used when the nature of the data and the domain scope are aligned. Observations will be collected using collection standards and represented as rows in tabulation datasets. Each observation is described by a series of data points which correspond to data collection fields and columns in a tabulation dataset.
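As a minimal sketch of the idea that one collected observation becomes one row whose columns correspond to collection fields, the Python example below turns a single form entry into a tabulation-style row. The vital signs domain, the field values, the way USUBJID is built, and the helper name `to_tabulation_row` are all assumptions made for illustration; they are not taken from this guide.

```python
from datetime import datetime

# Hypothetical values collected on one vital signs form entry.
# Field names are CDASH-style; every value here is invented.
collected = {
    "STUDYID": "STUDY01",
    "SUBJID": "1001",
    "VSTEST": "Systolic Blood Pressure",
    "VSORRES": "120",
    "VSORRESU": "mmHg",
    "VSDAT": "15-Jan-2024",
    "VSTIM": "09:30",
}


def to_tabulation_row(fields: dict) -> dict:
    """Represent one collected observation as one tabulation dataset row."""
    # Combine the collected date and time into a single ISO 8601 value.
    stamp = datetime.strptime(
        f"{fields['VSDAT']} {fields['VSTIM']}", "%d-%b-%Y %H:%M"
    )
    return {
        "STUDYID": fields["STUDYID"],
        "DOMAIN": "VS",
        # Assumed construction of the unique subject identifier.
        "USUBJID": f"{fields['STUDYID']}-{fields['SUBJID']}",
        "VSTEST": fields["VSTEST"],
        "VSORRES": fields["VSORRES"],
        "VSORRESU": fields["VSORRESU"],
        "VSDTC": stamp.strftime("%Y-%m-%dT%H:%M"),
    }


row = to_tabulation_row(collected)
print(row)
```

Each key in `row` plays the part of a column in the tabulation dataset, and the whole dictionary plays the part of one row.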

Fields and variables may be used when

To use standards for analysis,

Observations are described using collection and tabulation variables.

Standards for analysis are used to design datasets that are customizable per analysis requirements.

Analysis needs drive the selection of analysis standards, where analysis requirements are supported by both the structure and the contents of the resulting dataset.

The nature of the data further drives the selection of collection and tabulation standards: the data must be within the scope of a domain for the domain to be used. Standards for data exchange are applicable to all use cases and support sharing of standard CRFs, tabulation datasets, and analysis datasets.

	Standards for Collection	Standards for Tabulation	Standards for Analysis
Organization of Data	Groups logically related data in domains Domains are aligned

Standards for collection and tabulation group logically related data in domains while All standards are designed to flow together - The table below describes

The logic of the relationship may pertain to the scientific subject matter of the data or to its role in the trial.

Domains

The CDASH Model aligns with and is structured similarly to the SDTM. The CDASH Model organizes data into classes, which represent meaningful groupings of data in clinical research. It defines CDASH metadata for identifier variables, timing variables, general observation class variables (Events, Interventions, and Findings), domain-specific variables, and special-purpose domain variables.

The current CDASH Model represents Identifier, Timing, and Domain-specific variables in the metadata as an Observation Class. This does not align with the SDTM. This will be revised in a future version of the CDASH Model.

Connections between standards

Determining the standard to use

Observations and Variables - SDTMIG v3.4 - Wiki (cdisc.org)

The SDTMIG for Human Clinical Trials is based on the SDTM’s general framework for organizing clinical trial information that is to be submitted to regulatory authorities. The SDTM is built around the concept of observations collected about subjects who participated in a clinical study.

Datasets and Domains - SDTMIG v3.4 - Wiki (cdisc.org)

A domain is defined as a collection of logically related observations with a common topic.

Each domain dataset is distinguished by a unique, 2-character code that should be used consistently throughout the submission. This code, which is stored in the SDTM variable named DOMAIN, is used in 4 ways: as the dataset name, as the value of the DOMAIN variable in that dataset, as a prefix for most variable names in that dataset, and as a value in the RDOMAIN variable in relationship tables (see Section 8, Representing Relationships and Data).
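The 4 uses of the domain code can be illustrated with a short Python sketch. The function name `domain_code_uses`, the sample AE rows, and the sample relationship row are hypothetical and exist only to show where the code appears. Because the code prefixes most, not all, variable names, the sketch lists the prefixed variables rather than treating unprefixed ones as errors.

```python
def domain_code_uses(code, dataset_name, rows, relrec_rows):
    """Summarize where a 2-character domain code appears for one dataset."""
    variables = {name for row in rows for name in row}
    return {
        # Use 1: the dataset name.
        "is_dataset_name": dataset_name.upper() == code,
        # Use 2: the value of DOMAIN on every row.
        "is_domain_value": all(row.get("DOMAIN") == code for row in rows),
        # Use 3: the prefix of most (not all) variable names.
        "prefixed_variables": sorted(
            v for v in variables if v.startswith(code)
        ),
        # Use 4: a value of RDOMAIN in relationship tables.
        "referenced_in_rdomain": any(
            r.get("RDOMAIN") == code for r in relrec_rows
        ),
    }


# Invented sample data for illustration only.
ae_rows = [
    {"STUDYID": "STUDY01", "DOMAIN": "AE", "USUBJID": "STUDY01-1001",
     "AESEQ": 1, "AETERM": "HEADACHE"},
]
relrec_rows = [
    {"STUDYID": "STUDY01", "RDOMAIN": "AE", "USUBJID": "STUDY01-1001",
     "IDVAR": "AESEQ", "IDVARVAL": "1", "RELID": "1"},
]

uses = domain_code_uses("AE", "ae", ae_rows, relrec_rows)
print(uses)
```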

Data represented in SDTM datasets include data as originally collected or received, data from the protocol, assigned data, and derived data.

Datasets and Domains - SENDIG v3.1.1 - Wiki (cdisc.org)

Test results, examinations, and observations for subjects in a nonclinical study are represented in a series of SEND domains. A domain is defined as a collection of logically related observations with a common topic. The logic of the relationship may pertain to the scientific subject matter of the data or to its role in the study.

Although the domain name is carefully selected, it is the structures and specifications within the domain that drive placement of data. It is important to note that the domain structure is only used for organizational purposes. The --TEST and --METHOD variable entries in the domain contribute to the identification of the test performed and the conditions under which the test was performed; the domain name or organization is not intended to imply any of this information.

Each domain dataset is distinguished by a unique, 2-character code that should be used consistently throughout the submission. This code, which is stored in the SDTM variable named DOMAIN, is used in 4 ways: as the dataset name, as the value of the DOMAIN variable in that dataset, as a prefix for most variable names in that dataset, and as a value in the RDOMAIN variable in relationship tables (see Section 8, Representing Relationships and Data).

Datasets and Domains - SENDIG v3.1.1 - Wiki (cdisc.org)

When determining which general-observation class domain model is appropriate for reporting specific observations, refer to the domain definition included in the Assumptions section for each domain model (see Section 6, Domain Models Based on the General Observation Classes).

What or who is a subject

Add language here

Determining Where Data Belong

Domains

SDTM

Observations about study subjects are normally collected for all subjects in a series of domains. A domain is defined as a collection of logically related observations with a common topic. The logic of the relationship may pertain to the scientific subject matter of the data or to its role in the trial. Each domain is represented by a single dataset.

Each domain dataset is distinguished by a unique, 2-character code that should be used consistently throughout the submission. This code, which is stored in the SDTM variable named DOMAIN, is used in 4 ways: as the dataset name, as the value of the DOMAIN variable in that dataset, as a prefix for most variable names in that dataset, and as a value in the RDOMAIN variable in relationship tables (see Section 8, Representing Relationships and Data).

SEND

Aside from a limited number of special-purpose domains, all subject-level SDTM datasets are based on 1 of the 3 general observation classes. When faced with a set of data that were collected and that "go together" in some sense, the first step is to identify SDTM observations within the data and the general observation class of each observation. Once these observations are identified at a high level, 2 other tasks remain:

Determining whether the relationships between these observations need to be represented using GRPID within a dataset, as described in Section 8.1, (SENDIG v3.1.1) Relating Groups of Records Within a Domain Using the --GRPID Variable, or using RELREC between datasets, as described in Section 8.3, (SENDIG v3.1.1) Supplemental Qualifiers - SUPP-- Datasets
Placing all the data items in 1 of the identified general observation class records, or in a SUPP-- dataset, as described in Section 8.5, (SENDIG v3.1.1) Relating Findings To Multiple Subjects - Subject Pooling

In practice, considering the representation of relationships and placing individual data items may lead to reconsidering the identification of observations, so the whole process may require several iterations.
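The second task, placing each data item either in a general observation class record or in a SUPP-- dataset, might be sketched as follows. This is an illustration under stated assumptions, not a definitive implementation: the domain code "XX", its variables, the non-standard item XXEXTRA, and the function `place_items` are all hypothetical, and only a subset of the SUPP-- columns is shown.

```python
# Assumed, illustrative set of standard variables for a hypothetical domain "XX".
STANDARD_VARS = {"STUDYID", "DOMAIN", "USUBJID", "XXSEQ", "XXTERM"}


def place_items(record, standard_vars, seq_var):
    """Split one record into a parent row and supplemental qualifier rows."""
    # Items with a standard variable stay in the parent record.
    parent = {k: v for k, v in record.items() if k in standard_vars}
    # Every other item becomes one SUPP-- row keyed back to the parent record.
    supp = [
        {
            "STUDYID": record["STUDYID"],
            "RDOMAIN": record["DOMAIN"],
            "USUBJID": record["USUBJID"],
            "IDVAR": seq_var,
            "IDVARVAL": str(record[seq_var]),
            "QNAM": name,
            "QVAL": str(value),
        }
        for name, value in record.items()
        if name not in standard_vars
    ]
    return parent, supp


# Invented record for illustration only.
record = {
    "STUDYID": "STUDY01",
    "DOMAIN": "XX",
    "USUBJID": "STUDY01-1001",
    "XXSEQ": 1,
    "XXTERM": "EXAMPLE TERM",
    "XXEXTRA": "Y",
}

parent, supp = place_items(record, STANDARD_VARS, "XXSEQ")
print(parent)
print(supp)
```

If placing an item this way shows that it really describes a separate observation, the identification step is revisited, which is the iteration described above.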

ADD MORE TEXT HERE
