Guidance in this section is applicable to Interventions, Events, and Findings class domains and will be used with domain-specific assumptions as appropriate. General assumptions describing conventions for the population of values in tabulation records and variables are provided in this section. Conventions in this section are both general and provided by general observation class; they will be followed and complement the more detailed assumptions provided in Domain Specifications. When conventions are applicable to TIG Nonclinical and Product Impact on Individual Health use cases, this is denoted in the Implementation column.

The following assumptions will be implemented for Interventions class domains and are general conventions for variable population:

Num

Variable Population

Implementation

1

Text strings greater than 200 characters

--YN

Variables with the question text "Were there any interventions?" (e.g., "Were there any concomitant medications?") support the cleaning of data and confirmation that there are no missing values.
Values collected for these fields will not be represented in subsequent tabulation datasets.

2

--CAT, --SCAT

Categories and subcategories are determined per protocol design and values are generally not entered via CRF.
Implementers may:
- Pre-populate and display category values to help individuals involved in data collection understand what data should be recorded on the CRF.
- Pre-populate hidden variables with the values assigned within their operational database.
- Populate values directly in the tabulation dataset during dataset creation.

3

Variables for Date and Time

The time an intervention started will be collected if there is a scientific or regulatory reason to collect this level of detail and the time can be realistically determined.
Collection variables for date (e.g., --DAT, --STDAT, --ENDAT) will be concatenated with collection variables for time (e.g., --TIM, --STTIM, --ENTIM) as applicable to populate tabulation variables for dates (e.g., --DTC, --STDTC, --ENDTC) using ISO 8601 format.
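
As a rough illustration of the concatenation described above, the following Python sketch joins a collected date and an optional collected time into an ISO 8601 value. The function name `to_dtc` and the collection formats (`DD-MON-YYYY` for dates, 24-hour `HH:MM` for times) are assumptions made for this example only; actual collection formats are applicant-defined, and partial dates are not handled here.

```python
from datetime import date, time

# English month abbreviations, kept explicit so the sketch does not depend on locale.
MONTHS = {
    "JAN": 1, "FEB": 2, "MAR": 3, "APR": 4, "MAY": 5, "JUN": 6,
    "JUL": 7, "AUG": 8, "SEP": 9, "OCT": 10, "NOV": 11, "DEC": 12,
}


def to_dtc(date_text, time_text=None):
    """Combine a collected date and optional time into an ISO 8601 string."""
    # Assumed date collection format: DD-MON-YYYY, e.g. "05-MAR-2024".
    day, month, year = date_text.strip().upper().split("-")
    date_part = date(int(year), MONTHS[month], int(day))
    if not time_text:
        # No time collected: the tabulation value carries the date only.
        return date_part.isoformat()
    # Assumed time collection format: 24-hour HH:MM.
    hour, minute = (int(part) for part in time_text.strip().split(":"))
    time_part = time(hour, minute)
    return f"{date_part.isoformat()}T{time_part.strftime('%H:%M')}"
```

For example, `to_dtc("05-MAR-2024", "14:30")` yields `2024-03-05T14:30`, and `to_dtc("05-MAR-2024")` yields `2024-03-05`.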

4

--REASND

--REASND is used with tabulation variable --STAT.
The value "NOT DONE" in --STAT indicates that the subject was not questioned about the intervention or that data were not collected; it does not mean that the subject had no interventions.

5

--SPID

--SPID may be populated by the applicant's data collection system.
If collected, --SPID can be used as an identifier in a data query to communicate clearly to individuals involved in data collection which record is in question.

6

Coding

When free-text intervention treatments are recorded, the location may be included in the --TRT variable to facilitate coding (e.g., lung biopsy). Location may be collected when the applicant needs to identify the specific anatomical location of the intervention. This location information does not need to be removed from the verbatim --TRT when creating tabulation datasets.
The nonstandard variables --ATC1 through --ATC5 and --ATC1CD through --ATC5CD are used only when the intervention is coded using the World Health Organization's Anatomical Therapeutic Chemical (ATC) classification system (https://www.who.int/medicines/regulation/): 1 = the anatomical main group, 2 = the therapeutic main group, 3 = the therapeutic/pharmacological subgroup, 4 = chemical/therapeutic/pharmacological subgroup, 5 = chemical substance.
The implementer may also add MedDRA coding elements as nonstandard variables (NSVs) to the Interventions domain if this dictionary is used for coding.

7

Location (--LOC) and related variables (--LAT, --DIR, --PORTOT)

Applicants may collect location data using a subset list of controlled terminology on the CRF.
Applicants may pre-populate hidden variables with values assigned within their operational database.
There is currently some overlap across controlled terminology for LOC, LAT, and DIR. While the overlap exists, ensure that this overlap is not part of database design.

When text strings greater than 200 characters are collected, the following conventions for general observation class variables and SUPP-- datasets will be adhered to:

The first 200 characters of text should be stored in the parent domain variable and each additional 200 characters of text should be stored in a record in the SUPP-- dataset.
- When splitting a text string into several SUPP-- records, the text should be split between words to improve readability.
- The value of the first QNAM representing text over 200 characters will be the original domain variable name without any numeric suffix.
- The values for subsequent QNAMs will be sequential variable names, formed by appending a 1-digit integer, beginning with 1, to the original domain variable name. In cases where the standard domain variable name is already 8 characters in length, applicants will replace the last character with a digit when creating values for QNAM.
  - e.g., For Other Action Taken in Adverse Events (AEACNOTH), values for QNAM for the SUPPAE records would have the values AEACNOT1, AEACNOT2, AEACNOT3, and so on.
- The value for QLABEL should be the original domain variable label for all QNAM values.
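
The splitting conventions above can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the function name `split_for_supp` and the record layout (a dictionary with `QNAM`, `QLABEL`, and `QVAL`) are assumptions for this example. The sketch follows the AEACNOTH example, in which the parent domain variable holds the first chunk and the SUPP-- records are numbered from 1.

```python
import textwrap


def split_for_supp(text, var_name, label, limit=200):
    """Return (parent_value, supp_records) for a long text value."""
    # textwrap.wrap breaks lines at whitespace and drops the whitespace at each
    # break point; hyphenated words are kept whole so the text can be rejoined.
    chunks = textwrap.wrap(text, width=limit, break_on_hyphens=False) or [""]
    if len(chunks) - 1 > 9:
        # The convention appends a 1-digit integer, so at most nine SUPP-- records.
        raise ValueError("more than nine SUPP-- records would be required")
    # For 8-character variable names, the last character is replaced by the digit.
    stem = var_name[:7] if len(var_name) >= 8 else var_name
    supp_records = [
        {"QNAM": f"{stem}{number}", "QLABEL": label, "QVAL": chunk}
        for number, chunk in enumerate(chunks[1:], start=1)
    ]
    return chunks[0], supp_records
```

Calling `split_for_supp(long_text, "AEACNOTH", "Other Action Taken")` returns the value for the parent variable together with SUPPAE records named AEACNOT1, AEACNOT2, and so on, each carrying the original variable label.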

2

"Yes/No" values

For variables where the response is "Yes" or "No", both "Y" and "N" will be populated for responses. This eliminates confusion regarding whether a blank response indicates "N" or is a missing value.
Some variables are collected or derived in a manner that allows only 1 response (e.g., a single checkbox for "Yes"). In situations such as these, where it is unambiguous to populate only the response of interest, only 1 value will be populated ("Y" or "N") and the alternate value will be blank.

3

--FOCID

Variable --FOCID is populated when a specific part of a subject or specimen is identified as a study-specific point of interest (e.g., injection site, biopsy site, treated site, region of the body).
When used, the variable serves as a cross-domain identifier for the study-specific focus of interest; any records relating to the same focus would have the same --FOCID value.

4

--SEQ, --RECID

Variables --SEQ and --RECID are populated to explicitly identify domain records in different ways. Differences in variable population are described below.

--SEQ	--RECID
Values uniquely identify records for subjects within a domain.	Values uniquely identify records within a domain.
The relationship between records and values is not one-to-one. Values may change between versions of datasets. When a record is deleted, the value for the record may be reused to identify another record.	There is a one-to-one relationship between records and values. Values for records do not change between versions of datasets even when content is modified. When a record is deleted, the value for the record will not be reused to identify another record.
Variable is numeric with numeric values.	Variable is character with numeric, character, or alphanumeric values.
Conventions for establishing and maintaining values are applicant-defined. Values may or may not be sequential depending on data processes and sources.
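
The difference between the two identifiers can be illustrated with a small Python sketch. Everything here is hypothetical: the function `assign_seq`, the sequential numbering scheme, and the sample values (`S1`, `R001`, and so on) are assumptions chosen for the example, since conventions for establishing values are applicant-defined.

```python
from collections import defaultdict
from itertools import count


def assign_seq(records, seq_var="AESEQ"):
    """Number records 1, 2, 3, ... within each subject (one possible scheme)."""
    counters = defaultdict(lambda: count(1))
    for record in records:
        record[seq_var] = next(counters[record["USUBJID"]])
    return records


# First dataset version: three records for one subject.
version_1 = assign_seq([
    {"USUBJID": "S1", "AERECID": "R001"},
    {"USUBJID": "S1", "AERECID": "R002"},
    {"USUBJID": "S1", "AERECID": "R003"},
])

# Record R002 is deleted; copies of the remaining records are renumbered
# for the next dataset version.
version_2 = assign_seq([dict(r) for r in version_1 if r["AERECID"] != "R002"])
```

In `version_2`, the record with AERECID `R003` keeps its --RECID value, but its --SEQ value changes from 3 to 2, and the value 2 is reused for a different record.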

5

--GRPID

The value of --GRPID is generally assigned during or after data collection at the discretion of the applicant.

6

--REFID

Values for --REFID are applicant-defined and can be any alphanumeric strings the applicant chooses, consistent with their internal practices.

7

--CAT, --SCAT

Values for --CAT and/or --SCAT are known (identified) before the data are collected.
Variable --SCAT will be populated only when there is a value in variable --CAT.
Values for --CAT and --SCAT will not be the domain name or dictionary classification represented in --DECOD and --BODSYS.
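
A simple record-level check for the --CAT and --SCAT conventions above might look like the following Python sketch. The function name `check_cat_scat`, the dictionary-based record, and the optional `domain_name` argument are assumptions for illustration; this is not a published conformance rule.

```python
def check_cat_scat(record, domain, domain_name=None):
    """Return a list of --CAT/--SCAT convention problems found in one record."""
    cat = record.get(f"{domain}CAT") or ""
    scat = record.get(f"{domain}SCAT") or ""
    problems = []
    # --SCAT is populated only when --CAT has a value.
    if scat and not cat:
        problems.append(f"{domain}SCAT is populated but {domain}CAT is empty")
    # Values that --CAT and --SCAT must not repeat: the domain code or name
    # and the dictionary classification held in --DECOD and --BODSYS.
    candidates = (
        domain,
        domain_name,
        record.get(f"{domain}DECOD"),
        record.get(f"{domain}BODSYS"),
    )
    reserved = {value.strip().upper() for value in candidates if value}
    for name, value in ((f"{domain}CAT", cat), (f"{domain}SCAT", scat)):
        if value and value.strip().upper() in reserved:
            problems.append(
                f"{name} repeats the domain name or a dictionary classification"
            )
    return problems
```

An empty list means that no problem was found for the record.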

8

--STAT

In general observation class domains, --STAT will be populated with "NOT DONE" when data are not collected for the topic of the observation.

The following are conventions for variable population in Interventions and Events class domains.

Num

Field or

Variable Population

Guidance

Implementation
1

--YN

Prespecified interventions and events (--PRESP, --OCCUR, --STAT, --REASND)

Product Impact on Individual Health only:

Variables with the question text "Were there any <events>?" (e.g., "Were there any adverse events?") support the cleaning of data and confirmation that there are no missing values.

These questions can be used on any CRF.

Values collected for these fields will not be represented in subsequent tabulation datasets.

2

--CAT, --SCAT

Categories and subcategories are determined per protocol design and values are generally not entered via CRF.
Implementers may:
- Pre-populate and display category values to help individuals involved in data collection understand what data should be recorded on the CRF.
- Pre-populate hidden variables with the values assigned within their operational database.
- Populate values directly in the tabulation dataset during dataset creation.

3

Variables for Date and Time

The time of an event will be collected if there is a scientific or regulatory reason to collect this level of detail and the time can be realistically determined.
Collection variables for date (e.g., --DAT, --STDAT, --ENDAT) will be concatenated with collection variables for time (e.g., --TIM, --STTIM, --ENTIM) as applicable to populate tabulation variables for dates (e.g., --DTC, --STDTC, --ENDTC) using ISO 8601 format.

4

--OCCUR

--OCCUR may be used when a specific event is solicited (preprinted) on the CRF and the CRF uses an applicant-defined codelist.
--OCCUR may be implemented while also allowing for a "NOT DONE" response.

5

--REASND

--REASND is used with tabulation variable --STAT.
The value "NOT DONE" in --STAT indicates that the subject was not questioned about the event or that data were not collected; it does not mean that the subject had no events.

6

--SPID

--SPID may be populated by the applicant's data collection system.
If collected, --SPID can be used as an identifier in a data query to communicate clearly to individuals involved in data collection which record is in question.

7

Coding

The collection variables used for coding are not data collection fields that will appear on the CRF. Applicants will populate values through the coding process.
When free-text event terms are entered, the location may be included in --TERM to facilitate coding and further clarify the event. This location information does not need to be removed from the verbatim term when creating tabulation datasets.
The CDASH variables --LLT, --LLTCD, --PTCD, --HLT, --HLTCD, --HLGT, --HLGTCD, --SOC, and --SOCCD are only applicable to events coded in MedDRA.

8

Location (--LOC, --LAT, --DIR, --PORTOT)

Location is collected when the applicant needs to identify the specific anatomical location of the event.
Applicants may collect location data using a subset list of controlled terminology on the CRF.
Applicants may pre-populate hidden variables with values assigned within their operational database.
There is currently some overlap across controlled terminology for LOC, LAT, and DIR. While the overlap exists, ensure that this overlap is not part of database design.

Interventions (e.g., concomitant medications) and events (e.g., medical history) can be collected as responses to a prespecified list of treatments or terms. In such cases:

- --PRESP indicates whether topic variable values, i.e., specific interventions (--TRT) or events (--TERM), were prespecified at the time of data collection. Values will be "Y" (for "Yes") or a null value.
- --OCCUR represents whether prespecified interventions or events occurred or did not occur. Values will be populated for prespecified interventions and events only. Possible values are "Y" and "N" (for "Yes" and "No"). When an intervention or event is not prespecified, the value of --OCCUR will be null.
- --STAT and --REASND can be used to provide information about prespecified interventions and events for which there is no response (e.g., investigator forgot to ask). In such cases, the value of --STAT will be "NOT DONE" and the value of --REASND will be the reason, when collected.

The following table shows the population of --PRESP, --OCCUR, --STAT, and --REASND for different data collection scenarios.

Collection Scenario	--PRESP Value	--OCCUR Value	--STAT Value	--REASND Value
An intervention or event was prespecified at the time of collection and occurred.	Y	Y
An intervention or event was prespecified at the time of collection and did not occur.	Y	N
An intervention or event was prespecified at the time of collection with no response and no reason collected.	Y		NOT DONE
An intervention or event was prespecified at the time of collection with no response and reason collected.	Y		NOT DONE	Forgot to ask.
A spontaneously reported intervention or event was collected.
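
The scenarios in the table above can be expressed as a small Python sketch. The function name `prespecified_flags`, its parameters, and the use of `None` for a null value are assumptions for this example; the domain prefix is left off the variable names for brevity.

```python
def prespecified_flags(prespecified, response=None, reason=None):
    """Return --PRESP, --OCCUR, --STAT, and --REASND values for one record."""
    if not prespecified:
        # Spontaneously reported intervention or event: all four values are null.
        return {"PRESP": None, "OCCUR": None, "STAT": None, "REASND": None}
    if response in ("Y", "N"):
        # Prespecified with a response: --OCCUR carries the response.
        return {"PRESP": "Y", "OCCUR": response, "STAT": None, "REASND": None}
    # Prespecified with no response: --STAT is "NOT DONE" and --REASND holds
    # the reason when one was collected.
    return {"PRESP": "Y", "OCCUR": None, "STAT": "NOT DONE", "REASND": reason}
```

For example, `prespecified_flags(True, reason="Forgot to ask.")` corresponds to the fourth row of the table, and `prespecified_flags(False)` corresponds to the last row.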

2

Reason for an action or activity

For Interventions class domains, --INDC will represent the medical condition for which the intervention was given and --ADJ will represent the reason for an adjustment to exposure, when collected.
For Events class domains, reasons for performing an activity will be represented using nonstandard variable(s) in the SUPP-- dataset with QNAM = --REAS.
