...
Metadataspec |
---|
Num | Tabulation Variable Use | Implementation |
---|
1 | Text Data Casing | - Variables subject to controlled terminology will be populated with the exact value for the controlled term, including term casing.
- Otherwise, text data will be represented in upper case (e.g., NEGATIVE).
| 2 | "Yes", "No" Values | - For variables where the response is "Yes" or "No", both "Y" and "N" will be populated.
- Variables where the response is "Yes" or "No" ("Y" or "N") should normally be populated for both "Y" and "N" responses. This eliminates confusion regarding whether a blank response indicates "N" or is a missing value. However, some variables are collected or derived in a manner that allows only 1 response, such as when a single checkbox indicates "Yes". In situations such as these, where it is unambiguous to populate only the response of interest, it is permissible to populate only 1 value ("Y" or "N") and leave the alternate value blank. An example of when it would be acceptable to use only a value of "Y" would be for Last Observation Before Exposure Flag (--LOBXFL) variables, where "N" is not necessary to indicate that a value is not the last observation before exposure.
| 3 | 4 | 5 | 6 | 7 |
|
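The text-casing assumption in the table above can be sketched as a small helper. This is a minimal illustration, not part of the specification: the function name `normalize_text`, the `codelist` argument, and the sample unit terms are hypothetical, and matching collected text to a controlled term case-insensitively is an assumption made here for convenience.

```python
from typing import Iterable, Optional


def normalize_text(value: Optional[str],
                   codelist: Optional[Iterable[str]] = None) -> str:
    """Return a collected text value in its tabulation casing.

    `codelist` is a hypothetical list of controlled terms for the variable;
    pass None for variables not subject to controlled terminology.
    """
    if value is None or not value.strip():
        return ""  # nothing collected; leave the variable blank
    text = value.strip()
    if codelist is not None:
        # Assumption: match case-insensitively, but emit the term exactly
        # as the codelist spells it, including its casing.
        by_upper = {term.upper(): term for term in codelist}
        if text.upper() not in by_upper:
            raise ValueError(f"{text!r} is not in the codelist")
        return by_upper[text.upper()]
    # No controlled terminology: represent the text in upper case.
    return text.upper()


print(normalize_text("negative"))                    # NEGATIVE
print(normalize_text("MG/DL", ["mg/dL", "mmol/L"]))  # mg/dL
```

Raising an error for an unmatched value is one possible design; an implementer working with extensible codelists might instead pass the value through and flag it for review.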
Assumptions in this section are applicable to Interventions, Events, and Findings class domains and will be used with domain-specific assumptions as appropriate.
General assumptions for the population of values in tabulation variables are provided in this section. Assumptions in this section will be followed and complement more detailed assumptions provided in Domain Specifications.
-- | Values for --REFID are sponsor-defined and can be any alphanumeric strings the sponsor chooses, consistent with their internal practices.
- The sequence number (--SEQ) variable uniquely identifies a record for a given USUBJID within a domain. The variable --SEQ is required in all domains except DM. For example, if a subject has 25 observations in the Vital Signs (VS) domain, then 25 unique VSSEQ values should be established for this subject. Conventions for establishing and maintaining --SEQ values are applicant-defined. Values may or may not be sequential depending on data processes and sources.
|
|
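The uniqueness requirement for --SEQ described above lends itself to a simple check. The sketch below assumes each record is a dictionary keyed by variable name; the function name `duplicate_seq_values` and the sample records are hypothetical. It tests only uniqueness within USUBJID and does not require values to be sequential.

```python
from collections import Counter
from typing import Dict, List, Tuple


def duplicate_seq_values(records: List[Dict[str, object]],
                         seq_var: str) -> List[Tuple[str, int]]:
    """Return (USUBJID, --SEQ) pairs that occur more than once in a domain."""
    counts = Counter((rec["USUBJID"], rec[seq_var]) for rec in records)
    return sorted(key for key, n in counts.items() if n > 1)


# Illustrative Vital Signs records; subject S-001 reuses VSSEQ 2.
vs = [
    {"USUBJID": "S-001", "VSSEQ": 1},
    {"USUBJID": "S-001", "VSSEQ": 2},
    {"USUBJID": "S-001", "VSSEQ": 2},
    {"USUBJID": "S-002", "VSSEQ": 1},
]
print(duplicate_seq_values(vs, "VSSEQ"))  # [('S-001', 2)]
```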
...
The following assumptions will be implemented for Interventions class domains.
Metadataspec |
---|
Num | Tabulation Variable Use | Implementation |
---|
1 | --YN | - Variables with the question text "Were there any interventions?" (e.g., "Were there any concomitant medications?") support the cleaning of data and confirmation that there are no missing values.
- Values collected for these fields will not be represented in subsequent tabulation datasets.
| 2 | --CAT, --SCAT | - Categories and subcategories are determined per protocol design and values are generally not entered via CRF.
- Implementers may:
- Pre-populate and display category values to help individuals involved in data collection understand what data should be recorded on the CRF.
- Pre-populate hidden variables with the values assigned within their operational database.
- Populate values directly in the tabulation dataset during dataset creation.
| 3 | Variables for Date and Time | - The time an intervention started will be collected if there is a scientific or regulatory reason to collect this level of detail and the time can be realistically determined.
- Collection variables for date (e.g., --DAT, --STDAT, --ENDAT) will be concatenated with collection variables for time (e.g., --TIM, --STTIM, --ENTIM) as applicable to populate tabulation variables for dates (e.g., --DTC, --STDTC, --ENDTC) using ISO 8601 format.
| 4 | --REASND | --REASND is used with tabulation variable --STAT. The value "NOT DONE" in --STAT indicates that the subject was not questioned about the intervention or that data were not collected; it does not mean that the subject had no interventions. |
|
| 5 | --SPID | - --SPID may be populated by the applicant's data collection system.
- If collected, --SPID can be used as an identifier in a data query to communicate clearly to individuals involved in data collection which record is in question.
| 6 | Coding | - When free-text intervention treatments are recorded, the location may be included in the --TRT variable to facilitate coding (e.g., lung biopsy). Location may be collected when the applicant needs to identify the specific anatomical location of the intervention. This location information does not need to be removed from the verbatim --TRT when creating tabulation datasets.
- The nonstandard variables --ATC1 through --ATC5 and --ATC1CD through --ATC5CD are used only when the intervention is coded using the World Health Organization's Anatomical Therapeutic Chemical (ATC) classification system (https://www.who.int/medicines/regulation/): 1 = the anatomical main group, 2 = the therapeutic main group, 3 = the therapeutic/pharmacological subgroup, 4 = the chemical/therapeutic/pharmacological subgroup, 5 = the chemical substance.
- The implementer may also add MedDRA coding elements as nonstandard variables (NSVs) to the Interventions domain if this dictionary is used for coding.
| 7 | Location (--LOC) and related variables (--LAT, --DIR, --PORTOT) | - Applicants may collect location data using a subset list of controlled terminology on the CRF.
- Applicants may pre-populate hidden variables with values assigned within their operational database.
- There is currently some overlap across controlled terminology for LOC, LAT, and DIR. While the overlap exists, ensure that this overlap is not part of database design.
|
|
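The date and time assumption in the table above (concatenating collection variables such as --STDAT and --STTIM into --STDTC) can be sketched as follows. This is an illustration under stated assumptions: the function name `to_dtc` is hypothetical, the collected values are assumed to have already been parsed into Python `date` and `time` objects, times are reported to the minute, and partial or unknown dates are not handled.

```python
from datetime import date, datetime, time
from typing import Optional


def to_dtc(date_part: Optional[date],
           time_part: Optional[time] = None) -> str:
    """Combine a collected date and optional time into an ISO 8601 string."""
    if date_part is None:
        return ""  # no date collected; leave the tabulation variable blank
    if time_part is None:
        # Date only: ISO 8601 permits omitting the time component.
        return date_part.isoformat()
    # Date and time: join with "T" and keep precision to the minute.
    return datetime.combine(date_part, time_part).isoformat(timespec="minutes")


print(to_dtc(date(2024, 3, 5), time(14, 30)))  # 2024-03-05T14:30
print(to_dtc(date(2024, 3, 5)))                # 2024-03-05
```

Omitting the time when it was not collected, rather than padding it with zeros, keeps the tabulation value from implying more precision than the source data supports.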
The following assumptions will be implemented for Events class domains.
...
- Variables with the question text "Were there any <events>?" (e.g., “Were there any adverse events?”) support the cleaning of data and confirmation that there are no missing values.
- These questions can be used on any CRF.
- Values collected for these fields will not be represented in subsequent tabulation datasets.
...
- Categories and subcategories are determined per protocol design and values are generally not entered via CRF.
- Implementers may:
- Pre-populate and display category values to help individuals involved in data collection understand what data should be recorded on the CRF.
- Pre-populate hidden variables with the values assigned within their operational database.
- Populate values directly in the tabulation dataset during dataset creation.
...
- The time of an event will be collected if there is a scientific or regulatory reason to collect this level of detail and the time can be realistically determined.
- Collection variables for date (e.g., --DAT, --STDAT, --ENDAT) will be concatenated with collection variables for time (e.g., --TIM, --STTIM, --ENTIM) as applicable to populate tabulation variables for dates (e.g., --DTC, --STDTC, --ENDTC) using ISO 8601 format.
...
- --OCCUR may be used when a specific event is solicited (preprinted) on the CRF and the CRF uses an applicant-defined codelist.
- --OCCUR may be implemented while also allowing for a "NOT DONE" response.
...
- --REASND is used with tabulation variable --STAT.
- The value "NOT DONE" in --STAT indicates that the subject was not questioned about the event or that data were not collected; it does not mean that the subject had no events.
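The pairing of --REASND with --STAT can be checked with a short consistency rule. The sketch below rests on an assumption not stated in the text above: that a populated --REASND is expected only on records where --STAT is "NOT DONE". The function name `reasnd_without_not_done` and the Clinical Events (CE) sample records are hypothetical.

```python
from typing import Dict, List


def reasnd_without_not_done(records: List[Dict[str, str]],
                            domain: str) -> List[Dict[str, str]]:
    """Return records with a reason not done but no "NOT DONE" status."""
    stat_var = f"{domain}STAT"
    reasnd_var = f"{domain}REASND"
    return [
        rec for rec in records
        # Assumption: --REASND is meaningful only when --STAT is "NOT DONE".
        if rec.get(reasnd_var) and rec.get(stat_var) != "NOT DONE"
    ]


# Illustrative records; the second has a reason but no status.
ce = [
    {"USUBJID": "S-001", "CESTAT": "NOT DONE", "CEREASND": "SUBJECT REFUSED"},
    {"USUBJID": "S-002", "CESTAT": "", "CEREASND": "SITE ERROR"},
    {"USUBJID": "S-003", "CESTAT": "", "CEREASND": ""},
]
print([rec["USUBJID"] for rec in reasnd_without_not_done(ce, "CE")])  # ['S-002']
```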
...
- --SPID may be populated by the applicant's data collection system.
- If collected, --SPID can be used as an identifier in a data query to communicate clearly to individuals involved in data collection which record is in question.
...
- The collection variables used for coding are not data collection fields that will appear on the CRF. Applicants will populate values through the coding process.
- When free-text event terms are entered, the location may be included in --TERM to facilitate coding and further clarify the event. This location information does not need to be removed from the verbatim term when creating tabulation datasets.
- The CDASH variables --LLT, --LLTCD, --PTCD, --HLT, --HLTCD, --HLGT, --HLGTCD, --SOC, and --SOCCD are only applicable to events coded in MedDRA.
...
The following assumptions will be implemented for Findings class domains.
...