Assumptions in this section are applicable to Interventions, Events, and Findings class domains and will be used with domain-specific assumptions as appropriate. 

Hidden variables are variables in an operational database that hold values assigned by the applicant rather than entered via a CRF. Such values are predetermined and fixed, and may or may not be displayed on the CRF as noneditable fields.

The following assumptions will be implemented for Interventions class domains.

The SDTM class for a data collection field is specified in the Observation Class column in the metadata table for the domain.

Num | Field or Variable | Guidance
1 | --YN
  • --YN ("Yes/No") questions are used to provide a definite answer. The absence of a response is ambiguous, as it can mean "no," "none," or that the response is missing.
  • Variables with the question text "Were there any <interventions>?" (e.g., "Were there any procedures?", "Were there any concomitant medications?") support the cleaning of data and confirmation that entry of collected data is complete.
  • Values collected for these fields will not be represented in subsequent tabulation datasets.
2 | --CAT, --SCAT
  • Categories and subcategories are determined per protocol design and values are generally not entered via CRF.
  • Implementers may:
    • Prepopulate and display category values to help individuals involved in data collection understand what data should be recorded on the CRF.
    • Prepopulate hidden variables with the values assigned within their operational database.
    • Populate values directly in the tabulation dataset during dataset creation.
3 | Variables for date and time
  • The time an intervention started will be collected if there is a scientific or regulatory reason to collect this level of detail and the time can be realistically determined. An example is where the subject is under the direct care of the site at the time the intervention was started, and the study design is such that it is important to know the intervention start time with respect to dosing.
  • Collection variables for date (e.g., --DAT, --STDAT, --ENDAT) will be concatenated with collection variables for time (e.g., --TIM, --STTIM, --ENTIM), as applicable, to populate tabulation variables for dates (e.g., --DTC, --STDTC, --ENDTC) using ISO 8601 format.
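The concatenation of a collected date and an optional collected time into a single ISO 8601 value can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the function name `to_dtc` is hypothetical, and the sketch assumes the operational database already stores dates as YYYY-MM-DD and times as HH:MM, which a real system may not.

```python
from datetime import date, time
from typing import Optional


def to_dtc(collected_date: str, collected_time: Optional[str] = None) -> str:
    """Combine a collected date (e.g., --STDAT) and an optional collected
    time (e.g., --STTIM) into one ISO 8601 string (e.g., for --STDTC).

    Assumption: inputs are already stored as YYYY-MM-DD and HH:MM.
    """
    d = date.fromisoformat(collected_date)  # raises ValueError on a bad date
    if not collected_time:
        # No time collected: the tabulation value carries the date only.
        return d.isoformat()
    t = time.fromisoformat(collected_time)  # raises ValueError on a bad time
    # Minute precision is kept; any seconds in the input are dropped.
    return f"{d.isoformat()}T{t.strftime('%H:%M')}"


print(to_dtc("2024-03-05", "08:30"))
print(to_dtc("2024-03-05"))
```

When no time is collected, the sketch returns the date alone rather than padding a default time, so the tabulation value does not imply precision that was never recorded.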
    4 | --REASND
    • --REASND is used with the tabulation variable --STAT.
    • The value "NOT DONE" in --STAT indicates that the subject was not questioned about the intervention or that data were not collected; it does not mean that the subject had no interventions.
    5 | --SPID
    • --SPID may be populated by the implementer's data collection system.
    • If collected, --SPID can be used as an identifier in a data query to communicate clearly to individuals involved in data collection the record in question.
    6 | Coding
    • When free-text interventions/treatments are recorded, the location may be included in the --TRT variable to facilitate coding (e.g., lung biopsy). Location may be collected when the implementer needs to identify the specific anatomical location of the intervention. This location information does not need to be removed from the verbatim --TRT when creating tabulation datasets.
    • The non-standard variables --ATC1 through --ATC5 and --ATC1CD through --ATC5CD are used only when the intervention is coded using the World Health Organization's Anatomical Therapeutic Chemical (ATC) classification system (https://www.who.int/medicines/regulation/): 1 = the anatomical main group, 2 = the therapeutic main group, 3 = the therapeutic/pharmacological subgroup, 4 = chemical/therapeutic/pharmacological subgroup, 5 = chemical substance.
    • Implementers may add MedDRA coding elements as NSVs to the Interventions domain if that dictionary is used for coding.
    7 | Location (--LOC) and related variables (--LAT, --DIR, --PORTOT)
    • Applicants may collect location data using a subset list of controlled terminology on the CRF.
    • --LOC could be a defaulted or hidden field on the CRF for a prespecified [--TRT/Intervention Topic].
  • Relative Timing Variables (see the SDTMIG for more information and details)
    1. For each study, the sponsor defines the study reference period in the Demographics (DM) domain using SDTMIG variables RFSTDTC and RFENDTC. Other sponsor-specified reference time points can be defined for other data collection situations. The CDASH variables --PRIOR and --ONGO may be collected in lieu of start date or end date.
    2. The CDASH variable --PRIOR is used to indicate if the --TRT (the topic item) started prior to either the study reference period or another sponsor-defined reference time point. When the study reference period is used as the anchor, --PRIOR may be used to derive a value (from the Controlled Terminology codelist STENRF) into the SDTM relative timing variable --STRF. When populating --STRF, if the value of --PRIOR is "Y", the value of “BEFORE” may be mapped to --STRF. The value in DM.RFSTDTC serves as the anchor for --STRF. 
    3. When a reference time point is used instead of the study reference period, --PRIOR may be used to derive a value into the SDTM relative timing variable --STRTPT. If the value of --PRIOR is "Y", the value of "BEFORE" may be derived into --STRTPT. Note: --STRTPT must refer to the "time point anchor" as described in --STTPT. The value in --STTPT can be either text (e.g., "VISIT 1") or a date (in ISO 8601 format).
    4. The CDASH variable --ONGO is used to indicate if the value in --TRT is continuing beyond the study reference period or beyond another sponsor-defined reference time point. When the study reference period is used as the anchor, --ONGO may be used to derive a value into the SDTM relative timing variable --ENRF. If the value of --ONGO = "Y", the value of "AFTER" may be mapped to --ENRF. 
    5. When a reference time point is used instead of the study reference period, --ONGO may be used to derive a value into the SDTM relative timing variable --ENRTPT. If the value of --ONGO is "Y", the value of "ONGOING" may be mapped to --ENRTPT. Note: --ENRTPT must refer to the “time point anchor” as described in --ENTPT. The value in --ENTPT can be either text (e.g., "TRIAL EXIT") or a date (in ISO 8601 format).

    • Applicants may prepopulate hidden variables with values assigned within their operational database.
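
The relative timing derivations described for --PRIOR and --ONGO above can be sketched as a small mapping function. This is an illustrative sketch only: the function name and the boolean flag `anchor_is_study_reference_period` are hypothetical, and the sketch covers just the "Y" cases stated in the text, leaving every other response unmapped.

```python
from typing import Dict, Optional


def derive_relative_timing(
    prior: Optional[str],
    ongo: Optional[str],
    anchor_is_study_reference_period: bool = True,
) -> Dict[str, str]:
    """Map collected --PRIOR / --ONGO flags to relative timing variables.

    Only the "Y" cases described in the guidance are handled; any other
    response derives nothing.
    """
    derived: Dict[str, str] = {}
    if prior == "Y":
        # --STRF when anchored to the study reference period,
        # --STRTPT when anchored to another defined reference time point.
        target = "--STRF" if anchor_is_study_reference_period else "--STRTPT"
        derived[target] = "BEFORE"
    if ongo == "Y":
        if anchor_is_study_reference_period:
            derived["--ENRF"] = "AFTER"
        else:
            derived["--ENRTPT"] = "ONGOING"
    return derived


print(derive_relative_timing("Y", "Y"))
print(derive_relative_timing("Y", "Y", anchor_is_study_reference_period=False))
```

When a reference time point is used, the anchor description itself (--STTPT or --ENTPT) still has to be populated separately; the sketch does not attempt that.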

    The following assumptions will be implemented for Events class domains. 

    Num | Field or Variable | Guidance
    1 | --YN
    • --YN ("Yes/No") questions are used to provide a definite answer. The absence of a response is ambiguous, as it can mean "no," "none," or that the response is missing.
    • Variables with the question text "Were there any <events>?" (e.g., "Were there any adverse events?") support the cleaning of data and confirmation that entry of collected data is complete.
    • These questions can be used on any CRF.
    • Values collected for these fields will not be represented in subsequent tabulation datasets.
    2 | --CAT, --SCAT
    • Categories and subcategories are determined per protocol design and values are generally not entered via CRF.
    • Implementers may:
      • Prepopulate and display category values to help individuals involved in data collection understand what data should be recorded on the CRF.
      • Prepopulate hidden variables with the values assigned within their operational database.
      • Populate values directly in the tabulation dataset during dataset creation.
    3 | Variables for date and time
    • The time of an event will be collected if there is a scientific or regulatory reason to collect this level of detail and the time can be realistically determined. A scientific reason may be the need to know the order of events (e.g., the adverse event started after dosing). Examples include a study where the subject is confined to a phase 1 unit and under the direct care of the unit staff at the time the event started, or a study that uses time to tie together dosing and pharmacokinetic (PK) sample collection.
    • Collection variables for date (e.g., --DAT, --STDAT, --ENDAT) will be concatenated with collection variables for time (e.g., --TIM, --STTIM, --ENTIM), as applicable, to populate tabulation variables for dates (e.g., --DTC, --STDTC, --ENDTC) using ISO 8601 format.
    4 | --OCCUR
    • --OCCUR may be used when a specific event is solicited (preprinted) on the CRF and the CRF uses an applicant-defined codelist.
    • --OCCUR = "N" indicates the prespecified event did not occur.
    • --OCCUR may be implemented while also allowing for a "NOT DONE" response.
    5 | --REASND
    • --REASND is used with the tabulation variable --STAT.
    • The value "NOT DONE" in --STAT indicates that the subject was not questioned about the event or that data were not collected; it does not mean that the subject had no events.
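
One way to handle a prespecified event question that also allows a "NOT DONE" response is sketched below. The function name `map_occurrence` and the three-value codelist ("Y", "N", "NOT DONE") are assumptions made for illustration; the actual applicant-defined codelist and mapping rules may differ.

```python
from typing import Dict, Optional


def map_occurrence(response: str, reason_not_done: Optional[str] = None) -> Dict[str, str]:
    """Split one collected response into --OCCUR, --STAT, and --REASND values.

    Assumption: the applicant-defined codelist is "Y", "N", or "NOT DONE".
    """
    if response == "NOT DONE":
        # The subject was not asked or data were not collected;
        # this is not the same as the event not occurring.
        result = {"--OCCUR": "", "--STAT": "NOT DONE"}
        if reason_not_done:
            result["--REASND"] = reason_not_done
        return result
    if response in ("Y", "N"):
        return {"--OCCUR": response, "--STAT": ""}
    raise ValueError(f"Unexpected response: {response!r}")


print(map_occurrence("N"))
print(map_occurrence("NOT DONE", "Subject refused"))
```

Keeping "N" and "NOT DONE" on separate paths preserves the distinction the guidance draws between an event that did not occur and a question that was never asked.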
    6 | --SPID
    • --SPID may be populated by the applicant's data collection system.
    • If collected, --SPID can be used as an identifier in a data query to communicate clearly to individuals involved in data collection the record in question.
    7 | Coding
    • The collection variables used for coding are not data collection fields that will appear on the CRF. Applicants will populate values through the coding process.
    • When free-text event terms are entered, the location may be included in --TERM to facilitate coding and further clarify the event. This location information does not need to be removed from the verbatim term when creating tabulation datasets.
    • The CDASH variables --LLT, --LLTCD, --PTCD, --HLT, --HLTCD, --HLGT, --HLGTCD, --SOC, and --SOCCD are only applicable to events coded in MedDRA.
    8 | Location (--LOC, --LAT, --DIR, --PORTOT)
    • Location is collected when the applicant needs to identify the specific anatomical location of the event.
    • Implementers may collect location data using a subset list of controlled terminology on the CRF.
    • Implementers may prepopulate hidden variables with values assigned within their operational database.

    The following assumptions will be implemented for Findings class domains. 

    Num | Field or Variable | Guidance
    1 | --CAT, --SCAT
    • Categories and subcategories are determined per protocol design and values are generally not entered via CRF.
    • Implementers may:
      • Prepopulate and display category values to help individuals involved in data collection understand what data should be recorded on the CRF.
      • Prepopulate hidden variables with the values assigned within their operational database.
      • Populate values directly in the tabulation dataset during dataset creation.
    2 | --PERF, --STAT, --REASND
    • --PERF is used to record whether an assessment has been performed or collected. --REASND is used to collect a reason why an assessment was not done.
    • --PERF has the Question Text "[Were any/Was the] [--TEST/topic] [measurement(s)/test(s)/examination(s)/specimen(s)/sample(s)] [performed/collected]?" and is intended to assist in the cleaning of data and in confirming that entry of collected data is complete.
    • --PERF may be used at the page, panel, or question level.
    • --PERF may be used during the creation of tabulation datasets to derive a value into the SDTM variable --STAT. The implementer can use a combination of --CAT and --SCAT, with --TESTCD = "--ALL" and --TEST = "<Name of the CRF module>", to represent what tests were not performed.
    • Implementers must decide how to model each test not performed (e.g., to denote that all tests were not performed using --TESTCD = "--ALL").
    • --STAT has the Question Text "Was the [--TEST] not [completed/answered/done/assessed/evaluated]?" or "Indicate if (the [--TEST] was) not [answered/assessed/done/evaluated/performed]." This is intended to be used to collect a simple "NOT DONE" check box at the page, panel, or question level.
    • --REASND is used with SDTM variable --STAT only. The value "NOT DONE" in --STAT indicates that a question was not asked or a test was not done, or that a test was attempted but did not generate a result.
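
The derivation of --STAT from a module-level --PERF response can be sketched as below. This is a hedged illustration: the function name `not_performed_record` and its parameters are hypothetical, and it shows only the "all tests not performed" case, with the two-character domain code standing in for the "--" prefix.

```python
from typing import Dict, Optional


def not_performed_record(
    domain: str,
    perf: str,
    module_name: str,
    category: Optional[str] = None,
    reason: Optional[str] = None,
) -> Optional[Dict[str, str]]:
    """Build one tabulation record stating that a whole CRF module was not done.

    Returns None when --PERF is anything other than "N", because the
    individual test records are then expected to carry the results.
    """
    if perf != "N":
        return None
    record = {
        f"{domain}TESTCD": f"{domain}ALL",  # e.g., "VSALL" for the VS domain
        f"{domain}TEST": module_name,       # name of the CRF module
        f"{domain}STAT": "NOT DONE",
    }
    if category:
        record[f"{domain}CAT"] = category
    if reason:
        record[f"{domain}REASND"] = reason
    return record


print(not_performed_record("VS", "N", "Vital Signs", reason="Subject refused"))
```

How to represent a single test that was not performed within an otherwise completed module remains an implementer decision and is outside this sketch.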
    3 | --SPID
    • --SPID may be populated by the applicant's data collection system.
    • If collected, it can be beneficial to use --SPID as an identifier in a data query to communicate clearly to the site the specific record in question.
    4 | Variables for date and time
    • Time will be collected if there is a scientific or regulatory reason to collect this level of detail and the time can be realistically determined.
      • Metadata tables generally include --DAT; --TIM will be added from the CDASH Model as appropriate.
    • Collection variables for date and time (e.g., --DAT, --TIM) will be used to collect the date, or date and time, that the test was performed or the specimen was collected. The start and end dates and times (e.g., for specimen collection) will be collected as appropriate.
    • The date of collection of a test can be derived from the date of visit. In such cases, a separate date of observation field is not required to be present on the CRF.
    • Date and time variables will not be used to collect dates that are the result of a test. Test results will be collected using --ORRES.
    5 | Horizontal (denormalized) and vertical (normalized) data structures

    • In metadata specifications, many of the Findings class domains are presented in a normalized structure (1 record for each test) similar to a tabulation dataset, even though many data management systems hold the data in a denormalized structure (1 variable for each test).
    • When implementing collection standards in a denormalized structure, create variable names for the Findings --TEST and/or --TESTCD values. To do this:
      • Define the denormalized variable names using available CDISC Controlled Terminology for --TESTCD; or
      • When a system allows variable names longer than 8 characters, the value of variable --TESTCD can be concatenated with the tabulation variable name, separated by an underscore (e.g., DIABP_VSORRES, DIABP_VSLOC).
    • In the horizontal (denormalized) setting, collection variables such as --PERF, --LOC, and --STAT can be collected once for the whole horizontal record and applied to all of the observations on that record, or collected per test using collection variables such as <--TESTCD>_--PERF. When tabulation datasets are created, any variables collected for the entire horizontal record will be mapped to each vertical record per tabulation guidance.
    • In the horizontal (denormalized) setting, an identifier can be used to identify all --TESTCD for the same collection record. This supports mapping of data collected in a horizontal setting to tabulation datasets and creation of RELRECs.
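
The mapping from one horizontal record to several vertical records can be sketched as below, using the <--TESTCD>_<variable> naming convention described above. The function name `to_vertical` and the sample field values are hypothetical, and the sketch assumes that any variable name without an underscore was collected once for the whole record.

```python
from typing import Dict, List


def to_vertical(record: Dict[str, str], domain: str) -> List[Dict[str, str]]:
    """Turn one denormalized record into one normalized record per test.

    Keys such as "DIABP_VSORRES" are split into the test code ("DIABP")
    and the tabulation variable name ("VSORRES"). Keys without an
    underscore are treated as record-level and copied to every test.
    """
    shared = {k: v for k, v in record.items() if "_" not in k}
    per_test: Dict[str, Dict[str, str]] = {}
    for key, value in record.items():
        if "_" in key:
            testcd, variable = key.split("_", 1)
            per_test.setdefault(testcd, {})[variable] = value
    return [
        {**shared, f"{domain}TESTCD": testcd, **values}
        for testcd, values in per_test.items()
    ]


horizontal = {
    "USUBJID": "001",
    "VSDAT": "2024-03-05",
    "SYSBP_VSORRES": "120",
    "DIABP_VSORRES": "80",
    "DIABP_VSLOC": "ARM",
}
for row in to_vertical(horizontal, "VS"):
    print(row)
```

Because the record-level values are copied onto each vertical record, a shared identifier carried in the horizontal record would also be available for relating the resulting records afterward.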
    6 | Tests and original results
    • The value in --TEST will be 40 characters or less.
    • The corresponding codelist value for the short test name, 8 characters or less, will be populated in the tabulation variable --TESTCD.
    • Variable --TESTCD should be used to create a variable name, and --TEST should be used as the Prompt on the CRF.
    • Both --TESTCD and --TEST are recommended for use in the operational database.
    • Variable --ORRES is used to collect test results or findings in the original units as received or collected in character format.
    • If results are modified for coding, the --MODIFY variable contains the modified text.
    • Variables --ORNRLO, --ORNRHI, and --NRIND are used when normal or reference ranges are collected for results.
    • Standardization of the original results and/or normal/reference ranges will be performed during the creation of tabulation datasets.
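
The length limits stated above for --TEST and --TESTCD lend themselves to a simple automated check. The sketch below is illustrative only; the function name `check_test_names` and the message wording are hypothetical, and it checks length alone, not membership in any controlled terminology codelist.

```python
from typing import List

MAX_TEST_LEN = 40    # --TEST: 40 characters or less
MAX_TESTCD_LEN = 8   # --TESTCD: 8 characters or less


def check_test_names(testcd: str, test: str) -> List[str]:
    """Return a list of length problems for a test code and test name pair."""
    problems: List[str] = []
    if len(testcd) > MAX_TESTCD_LEN:
        problems.append(f"--TESTCD '{testcd}' exceeds {MAX_TESTCD_LEN} characters")
    if len(test) > MAX_TEST_LEN:
        problems.append(f"--TEST '{test}' exceeds {MAX_TEST_LEN} characters")
    return problems


print(check_test_names("DIABP", "Diastolic Blood Pressure"))
print(check_test_names("DIASTOLICBP", "Diastolic Blood Pressure"))
```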
    7 | Location variables (--LOC, --LAT, --DIR, --PORTOT)
    • Location variables are used to collect the location of the test.
    • Applicants may collect location data using a subset list of controlled terminology on the CRF.
    • Applicants may prepopulate hidden variables with values assigned within their operational database. 
    8 | --ORRES, --RES, --DESC, and --RESOTH
    • Variables --ORRES, --RES, --DESC, and --RESOTH are used to collect results. It is recommended that: 
      • --ORRES is used when the result is collected using a single question. The result will map directly to the tabulation variable --ORRES.
      • --RES and --DESC are used when a pair of questions is asked to collect the result: a question to collect the result, with a follow-up question for a description of the result. For example, the question "Is the <condition> [absent/present]?" with the follow-up question "What is the finding that was observed?", where --RES is used to collect whether the finding is normal/abnormal or absent/present and --DESC is used to collect the description of the finding.
      • --RES and --RESOTH are used when a question is asked that allows the selection of a prespecified finding, with a follow-up question to ask about the prespecified response "OTHER". For example, the question "What is the result?" with a set of prespecified responses, including the choice "OTHER", with the follow-up question "Specify, Other".
    9 | Root variables
    • The Findings About Events or Interventions domains use the same root variables as the Findings domain, with the addition of the --OBJ variable.
