Using controlled terminology and formatting from data collection through tabulation and analysis builds in traceability and transparency across study data. Controlled terminology and formats are required to be used as specified in this guide. General expectations for use of controlled terminology and formats are provided below with more specific expectations in collection metadata tables, tabulation domain specifications, and analysis specifications. TIG metadata and domain specifications refer to controlled terminology and formats defined both within and external to this guide.
Controlled Terminology
CDISC controlled terminology may be referenced here:
Some controlled terminology codelists are extensible. This means that values that are not already represented in that list (either as a CDISC Submission Value, a synonym, or an NCI preferred term) may be added as needed. Other codelists are non-extensible and must be used without adding any terms to the list. Where no CDISC Controlled Terminology exists, implementers should develop sponsor-defined terminology to ensure consistency and transparency. If gaps are identified, sponsors should submit requests to add values to CDISC CT by using the Term Suggestion form (available at http://ncitermform.nci.nih.gov/ncitermform/?version=cdisc).
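The difference between extensible and non-extensible codelists can be sketched as a simple check. The following is a minimal illustration in Python, assuming a hypothetical in-memory `Codelist` structure. The class, its fields, the function name, and the sample terms are assumptions made for illustration and are not part of CDISC CT or any CDISC tool.

```python
from dataclasses import dataclass, field


@dataclass
class Codelist:
    """Hypothetical in-memory codelist; not a CDISC-defined structure."""
    name: str
    extensible: bool
    terms: set = field(default_factory=set)          # published submission values
    sponsor_terms: set = field(default_factory=set)  # terms added by the sponsor


def check_value(codelist: Codelist, value: str) -> str:
    """Classify a value as 'published', 'extended', or 'invalid'.

    Simplification: only submission values are checked. A real check would
    also look for synonyms and NCI preferred terms before extending a list.
    """
    if value in codelist.terms:
        return "published"
    if codelist.extensible:
        # Extensible list: record the new value as a sponsor extension.
        codelist.sponsor_terms.add(value)
        return "extended"
    # Non-extensible list: no terms may be added.
    return "invalid"


# Illustrative codelists with sample values only.
severity = Codelist("SEVERITY", extensible=False,
                    terms={"MILD", "MODERATE", "SEVERE"})
units = Codelist("UNIT", extensible=True, terms={"C", "ug/L"})
```

In this sketch, a value rejected by a non-extensible list is a data problem to resolve, whereas a value added to an extensible list is a candidate for a term request to CDISC.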
Controlled Terminology for Data Collection
Controlled terminology can be used in the following ways during data collection:
- To collect data using controlled terms (e.g., Mild, Moderate, Severe)
- When appropriate, a subset of terms may be used rather than all available terms.
- To ask a specific question on the CRF (e.g., Temperature)
- To create a variable name in the database (e.g., TEMP for the collection of vital sign data when a unique variable name must be created for each vital sign result)
The following is applicable when implementing controlled terminology in tabulation datasets.
- Controlled terminology values should be submitted in the same text case used in the controlled terminology list. When extending a controlled terminology list, follow the case convention of that list.
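The case requirement above can be illustrated with a small helper that returns the codelist's own spelling of a collected value. This is a sketch only: `match_codelist_case` is a hypothetical name, and the sample terms are illustrative rather than a complete codelist.

```python
def match_codelist_case(value, codelist_terms):
    """Return the codelist term that matches `value` ignoring case, or None.

    The returned string uses the case stored in the codelist, which is the
    case that should appear in the tabulation dataset.
    """
    lookup = {term.casefold(): term for term in codelist_terms}
    return lookup.get(value.strip().casefold())


# Illustrative terms only.
severity_terms = ["MILD", "MODERATE", "SEVERE"]
submitted = match_codelist_case("Moderate", severity_terms)
```

Case-insensitive matching is an assumption that suits lists whose terms differ by more than case. It would be unsafe for a list in which case carries meaning, where an exact match is the safer choice.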
Storing topic variables for general domain models
The topic variable for the Interventions and Events general observation-class models is often stored as verbatim text. For an Events domain, the topic variable is --TERM. For an Interventions domain, the topic variable is --TRT. For a Findings domain, the topic variable --TESTCD should use controlled terminology (e.g., "SYSBP" for systolic blood pressure). If CDISC Controlled Terminology exists, it should be used; otherwise, sponsors should define their own controlled list of terms. If the verbatim topic variable in an Interventions or Events domain is modified to facilitate coding, the modified text is stored in --MODIFY. In most cases, the dictionary-coded text is derived into --DECOD; Physical Examination (PE) is the exception. Because PEORRES, rather than the topic variable, is modified for PE, the dictionary-derived text is placed in PESTRESC. The variables used in each of the defined domains are:
Domain | Original Verbatim | Modified Verbatim | Standardized Value |
---|---|---|---|
AE | AETERM | AEMODIFY | AEDECOD |
DS | DSTERM | | DSDECOD |
CM | CMTRT | CMMODIFY | CMDECOD |
MH | MHTERM | MHMODIFY | MHDECOD |
PE | PEORRES | PEMODIFY | PESTRESC |
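As an illustration of how these variables relate, the sketch below encodes the table as a lookup and selects the text that would be sent for dictionary coding: the modified verbatim when it is populated, otherwise the original verbatim. The names `CODING_VARIABLES` and `text_to_code` and the choice to represent a record as a plain dictionary are assumptions made for this example. DS is given no modified-verbatim variable because the table lists only DSTERM and DSDECOD for it.

```python
# (original verbatim, modified verbatim, standardized value) per domain.
CODING_VARIABLES = {
    "AE": ("AETERM", "AEMODIFY", "AEDECOD"),
    "DS": ("DSTERM", None, "DSDECOD"),
    "CM": ("CMTRT", "CMMODIFY", "CMDECOD"),
    "MH": ("MHTERM", "MHMODIFY", "MHDECOD"),
    "PE": ("PEORRES", "PEMODIFY", "PESTRESC"),
}


def text_to_code(domain, record):
    """Return the text to submit for coding for one record (a dict).

    Uses the modified verbatim if the domain has one and it is populated;
    otherwise falls back to the original verbatim.
    """
    original, modified, _standardized = CODING_VARIABLES[domain]
    if modified and record.get(modified):
        return record[modified]
    return record.get(original)
```

The original verbatim is never overwritten in this sketch, so both the collected text and the modified text remain available for traceability.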
Many terms are synonyms of other terms. When there are multiple terms that express the same base concept, SEND controlled terminology provides the preferred term to include in a submission, and thus the term to which the synonymous term(s) should be mapped. The NCI Thesaurus (https://ncit.nci.nih.gov) provides the synonyms. For instance, the unit of degrees Celsius could be expressed as "°C", "degC", "C", "Degrees Celsius", and so on. The SEND preferred term for degrees Celsius is "C". Temperature can also be expressed in degrees Fahrenheit, but this is a different concept from degrees Celsius. The key to mapping is determining which terms are synonymous, not which terms can be converted into one another via a conversion factor (for conversion, see Section 4.5.1.4, Example of Original and Standardized Results and Test Not Done).
Finding the submission value for a source value can be done in 2 ways. First, searching the controlled terminology list can determine whether the source value is in the list. If it is not, the easiest way to search for synonyms is the NCI Thesaurus. The NCI Thesaurus's search functionality searches terms and synonyms and provides the SEND submission value (preferred term).
This example illustrates mapping source units into their controlled terminology preferred term for --ORRESU. Note that in each case, there is only a label change (no conversion calculation).
Row | Note |
---|---|
1 | The source unit was "Celsius". This unit maps to the submission value of "C". |
2 | The source unit was "microgram per liter". This unit maps to the submission value of "ug/L". |
3 | The source unit was "ng/mL". This unit is a scientifically equivalent unit (i.e., no conversion calculation necessary) to the SEND submission value of "ug/L". |
Row | Source Unit | Submission Value (--ORRESU) |
---|---|---|
1 | Celsius | C |
2 | microgram per liter | ug/L |
3 | ng/mL | ug/L |
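The mapping in this example amounts to a label lookup with no arithmetic. A minimal sketch, assuming a hand-maintained synonym table (in practice the synonyms would come from the NCI Thesaurus), is shown below. `SUBMISSION_UNITS`, `UNIT_SYNONYMS`, and `to_submission_unit` are hypothetical names, and the table holds only the units mentioned in this section.

```python
# Submission values used in this example.
SUBMISSION_UNITS = {"C", "ug/L"}

# Source spellings mapped to their submission value (label change only).
UNIT_SYNONYMS = {
    "Celsius": "C",
    "Degrees Celsius": "C",
    "degC": "C",
    "°C": "C",
    "microgram per liter": "ug/L",
    "ng/mL": "ug/L",  # scientifically equivalent, so no conversion is needed
}


def to_submission_unit(source_unit):
    """Return the submission value for a source unit, or None if unmapped.

    Matching is exact (after trimming whitespace) because unit case can
    carry meaning. Result values are never modified by this function.
    """
    unit = source_unit.strip()
    if unit in SUBMISSION_UNITS:
        return unit
    return UNIT_SYNONYMS.get(unit)
```

A unit that expresses a different concept, such as degrees Fahrenheit, is deliberately absent from the table. It returns `None` here and would need its own submission value rather than a mapping to "C".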
Storing controlled terminology lists for synonym qualifiers
- For events such as adverse events and medical history, populate --DECOD with the dictionary's preferred term and populate --BODSYS with the preferred body system name. If a dictionary is multi-axial, the value in --BODSYS should represent the system organ class (SOC) used for the sponsor's analysis and summary tables, which may not necessarily be the primary SOC. Populate --SOC with the dictionary-derived primary SOC. In cases where the primary SOC was used for analysis, --BODSYS and --SOC are the same.
- If MedDRA is used to code events, the intermediate levels in the MedDRA hierarchy should also be represented in the dataset. A pair of variables has been defined for each of the levels of the hierarchy other than SOC and Preferred Term (PT): one to represent the text description and the other to represent the code value associated with it. For example, --LLT should be used to represent the Lowest Level Term text description and --LLTCD should be used to represent the Lowest Level Term code value.
- For concomitant medications, populate CMDECOD with the drug's generic name and populate CMCLAS with the drug class used for the sponsor's analysis and summary tables. If coding to multiple classes, follow Section 4.2.8.1, Multiple Values for an Intervention or Event Topic Variable, or omit CMCLAS.
- For concomitant medications, supplemental qualifiers may be used to represent additional coding dictionary information (e.g., a drug's ATC codes from the WHO Drug Dictionary; see Section 8.4, Relating Non-standard Variable Values to a Parent Domain).
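The relationship between --DECOD, --BODSYS, and --SOC for a coded event can be sketched as follows. The function name, its parameters, and the placeholder strings are assumptions for illustration. No real dictionary content is shown, and the dictionary lookup itself is outside the scope of the sketch.

```python
def populate_soc_variables(prefix, preferred_term, primary_soc, analysis_soc=None):
    """Build the --DECOD, --BODSYS, and --SOC values for one Events record.

    prefix         two-letter domain code, e.g. "AE" or "MH"
    preferred_term dictionary preferred term for the event
    primary_soc    dictionary-derived primary system organ class
    analysis_soc   SOC used for the sponsor's analysis, if it differs
    """
    return {
        f"{prefix}DECOD": preferred_term,
        # --BODSYS carries the SOC used for analysis and summary tables.
        f"{prefix}BODSYS": analysis_soc or primary_soc,
        # --SOC always carries the primary SOC.
        f"{prefix}SOC": primary_soc,
    }


# Placeholder values; not taken from any dictionary.
same = populate_soc_variables("AE", "PT TEXT", "PRIMARY SOC")
differs = populate_soc_variables("AE", "PT TEXT", "PRIMARY SOC", "SECONDARY SOC")
```

When the primary SOC is used for analysis, the two variables hold the same value, as in `same`; `differs` shows the multi-axial case in which the analysis uses a non-primary SOC.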
Standards for tabulation require representation of dates and/or times, intervals of time, and durations of time in ISO 8601 format as defined by the International Organization for Standardization (ISO) (http://www.iso.org). Included in specifications...