Disclaimer
The views and opinions expressed in this blog entry are my own and do not reflect the official policy or position of CDISC.
In this blog, I want to highlight one part of a project deliverable from the Controlled Terminology (CT) Relationships subteam - metadata about CDISC CT for the SDTM TS dataset.
Before going into detail, here is a bit about how this Standards Development team was established. The team began at the CDISC Working Group Meeting in 2017 in Silver Spring, Maryland, U.S.A. NCI EVS representatives raised maintenance issues that stemmed from the drastically different publication cadences of CDISC CT and the Implementation Guides. Volunteers also shared implementation challenges with CDISC CT. After much discussion, the attendees agreed on this general problem statement for a new development subteam to tackle:
Relationships between published terminology codelists and variable metadata are not explicit enough or are incomplete in published Implementation Guides (IG) or Therapeutic Area User Guides (TAUG).
Fast forward to today, the team recently finished reviewing all the SDTM v1.4 & SDTMIG v3.2 domain variables. A project deliverable is being compiled with two main components:
Of all the SDTM datasets reviewed, I find Trial Summary (TS) the most intriguing due to its complex CT requirements.
The SDTM TS dataset, by definition, is "a trial design domain that contains one record for each trial summary characteristic." [1] A trial summary characteristic is represented by two parts: 1) the TSPARM/TSPARMCD pair, or parameter/parameter code, respectively; and 2) TSVAL, or value. Permissible values for TSVAL depend on TSPARM/TSPARMCD. In other words, the CT requirement for TSVAL depends on TSPARM/TSPARMCD for any given dataset record.
Here is an excerpt from the SDTMIG v3.2's Appendix C1:
# | TSPARMCD | TSPARM | TSVAL (Codelist Name or Format) |
---|---|---|---|
1 | ADDON | Added on to Existing Treatments | No Yes Response |
2 | TDIGRP | Diagnosis Group | SNOMED CT |
3 | PCLAS | Pharmacological Class of Inv Therapy | NDF-RT |
4 | TRT | Investigational Therapy or Treatment | UNII |
Let's inspect and discuss each of them.
For #1, although seasoned CDISC users would likely recognize "No Yes Response" as one of the CDISC CT codelists, this notation inadvertently puts new users at a disadvantage. Even to trained users, it does not mean that all the terms within that CT codelist are permissible. From a process automation perspective, it gives a machine no information about its purpose. Therefore, it isn't ideal for either human or machine readability.
About #2 and #3, SNOMED CT and NDF-RT are external dictionaries. NDF-RT has been renamed to MED-RT. Not all users recognize these external dictionaries, especially when usage could be specific to certain geographical regions. Also, users face this implementation challenge: which component of these external dictionaries do they use to populate TSVAL? Therefore, the information published in this SDTMIG appendix is neither current nor explicit.
UNII is a coded identifier for all registered ingredients used in products regulated by US FDA. For example, 362O9ITL9D is the UNII for acetaminophen. In #4, it is misleading to populate TSVAL with UNII. It is more appropriate to populate this coded value in TSVALCD (parameter value code). The decode, so to speak, would instead go to TSVAL. In this instance, TSVAL shall correspond to the preferred substance name, a component in the Global Substance Registration System, which is maintained by U.S. FDA.
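As a minimal sketch of the split described above, the record below places the UNII in TSVALCD and the substance name in TSVAL. The dictionary layout and the helpers `is_unii_like` and `unii_in_wrong_variable` are hypothetical, the substance name is assumed for illustration rather than looked up in the Global Substance Registration System, and the pattern only checks the shape of a UNII (10 uppercase alphanumeric characters), not whether it is registered.

```python
import re

# Hypothetical TS record for the TRT parameter; only the variables
# discussed in this post are shown.
ts_record = {
    "TSPARMCD": "TRT",
    "TSPARM": "Investigational Therapy or Treatment",
    "TSVAL": "ACETAMINOPHEN",  # assumed preferred substance name (the decode)
    "TSVALCD": "362O9ITL9D",   # UNII (the coded value)
}


def is_unii_like(value: str) -> bool:
    """Return True if value has the shape of a UNII (10 uppercase alphanumerics)."""
    return re.fullmatch(r"[A-Z0-9]{10}", value) is not None


def unii_in_wrong_variable(record: dict) -> bool:
    """Flag TRT records whose TSVAL holds a UNII-shaped code instead of a name."""
    return record["TSPARMCD"] == "TRT" and is_unii_like(record["TSVAL"])
```

A record that follows the SDTMIG v3.2 appendix literally, with the UNII in TSVAL, would be flagged by `unii_in_wrong_variable`, while the record above would not.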
What extra information is needed to make example #1 more readable to both humans and machines? Since it is about CDISC CT, common attributes such as codelist names (short and long) and c-codes will be immediately helpful. An attribute for subsetting the codelist will be necessary to specify permissible values.
About the external dictionaries in examples #2 through #4, extra information to describe them will be elucidating, such as 1) owning organization, 2) dictionary's name, and, 3) dictionary's component.
An extra bit of metadata will be essential to cope with multiple regulatory requirements for SDTM data submissions.
All of the above together forms the model (or structure) for complete disambiguation of the relationships between CDISC CT and SDTM variables. The following tables illustrate this model in tabular form, along with the example parameters:
For use when CDISC CT is relevant:
# | Usages | Domain | Variable | Condition 1 | C-Code for Value in Condition 1 | Condition 2 | C-Code for Value in Condition 2 | CDISC CT Codelist Short Name | CDISC CT Codelist C-Code | CDISC CT Codelist Long Name | Permissible Value from CDISC CT | Permissible Value's C-Code | Health Authority Provisions |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Context in which this row of metadata applies; valid values are versioned foundational standards | A domain abbreviation found in the foundational standard in "Usages" | A variable name | May be used for normalized datasets such as SuppQual, Findings domains, and TS ** Use this for TESTCD and PARMCD; or, QNAM | Conditional Value's c-code in the Condition column, if applicable | May be used for normalized datasets such as SuppQual, Findings domains, and TS ** Use this for TEST and PARM to pair with TESTCD and PARMCD; otherwise, not needed | Conditional Value's c-code in the Condition column, if applicable | The CDISC CT Codelist that controls the values referenced in "Domain" and "Variable" columns | C-code that pairs with "CDISC CT Codelist Short Name" | Long name that pairs with "CDISC CT Codelist Short Name" | A semicolon-delimited value list subset from the codelist referenced in "CDISC CT Codelist Short Name" | C-codes for each value in "Permissible Value from CDISC CT", also semicolon-delimited | Specify to which health authority this set of metadata is applicable. Leave blank when not applicable. Example: "US FDA", "Japan PMDA" |
1 | SDTMIG v3.2 | TS | TSVAL | TSPARMCD EQ "ADDON" | C49703 | TSPARM EQ "Added on to Existing Treatments" | C49703 | NY | C66742 | No Yes Response | N; Y | C49488; C49487 |
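To make the machine-readability point concrete, here is a minimal sketch of how row #1 could be held in code and used to check a TS record. The class `CtRelationship`, its field names, and the helpers `split_list` and `is_permissible` are hypothetical and are not part of the team's deliverable; only the cell values come from the table above.

```python
from dataclasses import dataclass


def split_list(cell: str) -> tuple:
    """Split a semicolon-delimited table cell into trimmed values."""
    return tuple(part.strip() for part in cell.split(";") if part.strip())


@dataclass(frozen=True)
class CtRelationship:
    usage: str                  # "Usages" column
    domain: str
    variable: str
    condition_variable: str     # left side of "Condition 1"
    condition_value: str        # right side of "Condition 1"
    codelist_short_name: str
    codelist_c_code: str
    permissible_values: tuple
    permissible_c_codes: tuple


# Row #1 from the table above, transcribed by hand.
ADDON_ROW = CtRelationship(
    usage="SDTMIG v3.2",
    domain="TS",
    variable="TSVAL",
    condition_variable="TSPARMCD",
    condition_value="ADDON",
    codelist_short_name="NY",
    codelist_c_code="C66742",
    permissible_values=split_list("N; Y"),
    permissible_c_codes=split_list("C49488; C49487"),
)


def is_permissible(row: CtRelationship, record: dict) -> bool:
    """Check a record against one metadata row.

    Records that do not meet the row's condition are out of scope
    for that row and are reported as passing.
    """
    if record.get(row.condition_variable) != row.condition_value:
        return True
    return record.get(row.variable) in row.permissible_values
```

With this row, a record with TSPARMCD of "ADDON" and TSVAL of "Y" passes, while any TSVAL outside the "N; Y" subset fails even if it belongs to the full NY codelist.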
For use when external dictionary is relevant:
# | Usages | Domain | Variable | Condition 1 | C-Code for Value in Condition 1 | Condition 2 | C-Code for Value in Condition 2 | External Dictionary's Organization | External Dictionary's Name | External Dictionary's Component | Descriptive Information | Health Authority Provisions |
---|---|---|---|---|---|---|---|---|---|---|---|---|
| | Context in which this row of metadata applies; valid values are versioned foundational standards | A domain abbreviation found in the foundational standard in "Usages" | A variable name | May be used for normalized datasets such as SuppQual, Findings domains, and TS ** Use this for TESTCD and PARMCD; or, QNAM | Conditional Value's c-code in the Condition column, if applicable | May be used for normalized datasets such as SuppQual, Findings domains, and TS ** Use this for TEST and PARM to pair with TESTCD and PARMCD; otherwise, not needed | Conditional Value's c-code in the Condition column, if applicable | Used when "Variable" is controlled by an external dictionary. Example: "MSSO", "Regenstrief Institute" | Used when "Variable" is controlled by an external dictionary. Example: "MedDRA", "LOINC" | Used when "Variable" is controlled by an external dictionary. Example: "Preferred Term Code", "LOINC Code" | Additional information that is useful for implementers from a citable source ** Citable implementation information that can't be molded into detail metadata; or, regulatory agency's requirements | Specify to which health authority this set of metadata is applicable. Leave blank when not applicable. Example: "US FDA", "Japan PMDA" |
2 | SDTMIG v3.2 | TS | TSVAL | TSPARMCD EQ "TDIGRP" | C49650 | TSPARM EQ "Diagnosis Group" | C49650 | International Health Terminology Standards Development Organisation (IHTSDO) | SNOMED CT | SNOMED CT Fully Specified Name | Appendix C of SDTMIG v3.2 specifies SNOMED CT. See FDA TCG section 6.6.1.1. Also see Notes in Appendix C of SDTMIG v3.2: If the study population is healthy subjects (i.e., healthy subjects flag is Y), this parameter is not expected. | US FDA |
2 | SDTMIG v3.2 | TS | TSVALCD | TSPARMCD EQ "TDIGRP" | C49650 | TSPARM EQ "Diagnosis Group" | C49650 | International Health Terminology Standards Development Organisation (IHTSDO) | SNOMED CT | SNOMED CT Identifier (SCTID) | | US FDA |
3 | SDTMIG v3.2 | TS | TSVAL | TSPARMCD EQ "PCLAS" | C98768 | TSPARM EQ "Pharmacologic Class" | C98768 | Department of Veterans Affairs/Veterans Health Administration | Medication Reference Terminology (MED-RT) | Established pharmacologic class (EPC) | Note: Refer to citation in FDA TCG guidance. If the established pharmacologic class (EPC) is not available for an active moiety, then the sponsor should discuss the appropriate MOA, PE, and CS terms with the review division. | US FDA; Japan PMDA |
3 | SDTMIG v3.2 | TS | TSVALCD | TSPARMCD EQ "PCLAS" | C98768 | TSPARM EQ "Pharmacologic Class" | C98768 | Department of Veterans Affairs/Veterans Health Administration | Medication Reference Terminology (MED-RT) | Alphanumeric unique identifier (NUI) | | US FDA; Japan PMDA |
4 | SDTMIG v3.2 | TS | TSVAL | TSPARMCD EQ "TRT" | C41161 | TSPARM EQ "Investigational Therapy or Treatment" | C41161 | U.S. Food and Drug Administration (US FDA) | Global Substance Registration System | Preferred substance name | | US FDA; Japan PMDA |
4 | SDTMIG v3.2 | TS | TSVALCD | TSPARMCD EQ "TRT" | C41161 | TSPARM EQ "Investigational Therapy or Treatment" | C41161 | U.S. Food and Drug Administration (US FDA) | Global Substance Registration System | Unique Ingredient Identifier (UNII) | | US FDA; Japan PMDA |
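The external dictionary rows above answer the earlier question of which dictionary component populates TSVAL versus TSVALCD. As a rough sketch, assuming the rows are keyed by variable and TSPARMCD, a lookup could be as simple as the following; the mapping name `EXTERNAL_DICTIONARY_ROWS` and the function `dictionary_component` are hypothetical, and the values are transcribed from the table.

```python
from typing import Optional, Tuple

# (variable, TSPARMCD) -> (external dictionary name, dictionary component)
EXTERNAL_DICTIONARY_ROWS = {
    ("TSVAL", "TDIGRP"): ("SNOMED CT", "SNOMED CT Fully Specified Name"),
    ("TSVALCD", "TDIGRP"): ("SNOMED CT", "SNOMED CT Identifier (SCTID)"),
    ("TSVAL", "PCLAS"): (
        "Medication Reference Terminology (MED-RT)",
        "Established pharmacologic class (EPC)",
    ),
    ("TSVALCD", "PCLAS"): (
        "Medication Reference Terminology (MED-RT)",
        "Alphanumeric unique identifier (NUI)",
    ),
    ("TSVAL", "TRT"): (
        "Global Substance Registration System",
        "Preferred substance name",
    ),
    ("TSVALCD", "TRT"): (
        "Global Substance Registration System",
        "Unique Ingredient Identifier (UNII)",
    ),
}


def dictionary_component(variable: str, tsparmcd: str) -> Optional[Tuple[str, str]]:
    """Return (dictionary, component) for a variable and parameter code.

    Returns None when no external dictionary row covers the combination.
    """
    return EXTERNAL_DICTIONARY_ROWS.get((variable, tsparmcd))
```

For example, `dictionary_component("TSVALCD", "TRT")` returns the Global Substance Registration System paired with the UNII component, while a parameter controlled by CDISC CT, such as ADDON, returns `None`.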
The project deliverable is currently undergoing Internal Review per CDISC's standard development process. [2] All artifacts created by the team are available on the CDISC Wiki, along with a Read Me section. [3] The team expects Public Review to begin in the third quarter of 2020.
The team operates in close alignment with CDISC's strategic goal to transform standards and clinical knowledge into a multidimensional representation to support automation. [4] Users can expect the metadata to be accessible via CDISC Library once it completes the development lifecycle. Future IGs and TAUGs may reference CT Relationships to keep pace with the CDISC CT publication cadence. The team may incorporate additional kinds of CT relationships metadata, e.g., CT codetables. [5] The team also aspires to expand, using the same methodology, to cover CDASH, SEND, and ADaM.
I want to acknowledge these people for their contributions and domain expertise: Kristin Kelly (Pinnacle 21), Michael Lozano (Eli Lilly), Sharon Weller (Eli Lilly), Donna Sattler (BMS), Debbie O’Neill (Merck), Smitha Karra* (Gilead), Judith Goud (Nurocor), Swarupa Sudini (Pfizer), Anna Pron-Zwick (AstraZeneca), Craig Zwickl (Independent), Erin Muhlbradt* (NCI EVS), Fred Wood (TalentMine), Trish Gleason (BMS), Sharon Hartpence (BMS), Diane Wold (CDISC). Special thanks to Ann White for copyediting.
* denotes team co-lead, current and past
[1] CDISC SDTM CT P34. Extracted from CDISC Library Data Standards Browser: https://library.cdisc.org/browser/ct/2018-06-29?products=sdtmct-2018-06-29&codelists=C66734&codevalue=C53483
[2] CDISC Operating Procedure CDISC-COP-001 Standards Development. https://www.cdisc.org/system/files/about/cop/CDISC-COP-001-Standards_Development_2019.pdf
[3] Internal Review package. https://wiki.cdisc.org/display/CT/Internal+Review
[4] CDISC Strategic Plan 2019-2022. https://www.cdisc.org/sites/default/files/resource/CDISC_2019_2022_Strategic_Plan.pdf
[5] CT codetables. https://www.cdisc.org/standards/terminology, expand Codetable Mapping Files
2 Comments
Sam Hume
We're adding improved semantic references into ODM v2, including the new coding element. We can use this to better reference external terms and dictionaries, as well as more formal references to CDISC CT. Here's an example:
Kaja Najumudeen MS
I am so glad to see the CT Relationships subteam putting this much-needed row-level metadata in place, making it both machine-readable and more meaningful for human readers. My appreciation to each and every team member for contributing to this effort.
But I also wanted to point out that this is addressed by the CDISC 360 project initiative through enriched 360 metadata. It may be known to many, but interested people can take a look at the project page. I think this is the same thing Sam is saying here, if I am not wrong.