...

All models implemented as part of this guide collect and represent data by common topics, with:

  • CDASH and SDTM grouping logically related data points in domains; and
  • ADaM dataset design customizable for analysis requirements.

The terms “domain” and “dataset” are common in CDISC nomenclature and appear frequently in the Study Data Tabulation Model (SDTM). For example, the SDTM v1.8 includes 134 instances of "domain" and says "A collection of observations on a particular topic is considered a domain." The Model also includes 78 instances of "dataset," and certain structures in the model are called "datasets" rather than "domains." Is there a difference between a domain and a dataset?

The CDISC Glossary defines these terms as follows:

  • Domain: A collection of logically related observations with a common, specific topic that are normally collected for all subjects in a clinical investigation. NOTE: The logic of the relationship may pertain to the scientific subject matter of the data or to its role in the trial. Example domains include laboratory test results (LB), adverse events (AE), concomitant medications (CM). [After SDTM Implementation Guide version 3.2, CDISC.org] See also general observation class.
  • Dataset: A collection of structured data in a single file. [CDISC, ODM, and SDS] Compare to analysis dataset, tabulation dataset.

In plainer terms, a domain is a grouping of related observations, while a dataset is the data structure associated with that grouping. Both domains and datasets use the same nomenclature, which is why they are often confused.

The distinction between domain and dataset is most clearly seen in cases where a general observation class domain is split into multiple datasets in a submission. Common examples are splitting the Laboratory Test Results (LB) domain due to size, splitting the Questionnaires (QS) domain by questionnaire, and splitting the Findings About Events or Interventions (FA) domain by parent domain.
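As a rough illustration of that distinction, the sketch below splits a handful of made-up QS records into two datasets by questionnaire. The records, the use of QSCAT as the split key, and the dataset names "qsph" and "qssf" are all hypothetical, chosen only for this example. It assumes that split datasets keep the parent value in the DOMAIN variable, so one domain ends up represented by more than one dataset.

```python
from collections import defaultdict

# Hypothetical QS observations; every value here is illustrative only.
records = [
    {"DOMAIN": "QS", "USUBJID": "S-001", "QSCAT": "PHQ-9", "QSSEQ": 1},
    {"DOMAIN": "QS", "USUBJID": "S-001", "QSCAT": "SF-36", "QSSEQ": 2},
    {"DOMAIN": "QS", "USUBJID": "S-002", "QSCAT": "PHQ-9", "QSSEQ": 1},
]

# Hypothetical mapping from questionnaire category to split dataset name.
SPLIT_NAMES = {"PHQ-9": "qsph", "SF-36": "qssf"}


def split_domain(rows, key, names):
    """Group the rows of one domain into several named datasets."""
    datasets = defaultdict(list)
    for row in rows:
        datasets[names[row[key]]].append(row)
    return dict(datasets)


datasets = split_domain(records, "QSCAT", SPLIT_NAMES)

# Several datasets, but still a single conceptual domain.
domains = {row["DOMAIN"] for rows in datasets.values() for row in rows}
print(sorted(datasets), domains)
```

Here the dataset is the physical grouping (one file per questionnaire), while the domain is the topic shared by every record, which is why the DOMAIN value does not change when the data are split.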

However, since in most cases there is a one-to-one relationship between a conceptual domain and a dataset based on that conceptual domain, the words are used interchangeably in the standards and, therefore, by most users. The structures called “relationship datasets” were given that name because they are mechanisms for connecting information represented in different datasets rather than observations about study subjects. Note that none of the relationship datasets includes the variable DOMAIN. However, in a submission, these datasets need dataset names, and character strings used in those names are included in the CDISC Codelist called "SDTM Domain Abbreviations."

...

  • per analysis requirements.




Domains


SDTM

Observations about study subjects are normally collected for all subjects in a series of domains. A domain is defined as a collection of logically related observations with a common topic. The logic of the relationship may pertain to the scientific subject matter of the data or to its role in the trial. Each domain is represented by a single dataset.

...