Requirements for data submission are defined and managed by the regulatory authorities to whom data are submitted. This section describes general requirements for datasets that may be part of a submission.

However, additional conventions may be defined by regulatory bodies or negotiated with regulatory reviewers. In such cases, the additional requirements must be followed.

...

Observations about tobacco products and study subjects generated to support a submission are represented in a series of datasets aligned with logical groupings of data into domains.

Domains described in this guide are generally aligned with implementation of a single dataset file in which to represent data in scope for a domain. All datasets are structured as flat files with rows representing observations and columns representing variables. In some cases, a dataset implemented for a domain may be split into physically separate dataset files to support submission when needed and as allowable by the regulatory authority.

...

Applicants will consider the nature of the data and apply reasonable, appropriate lengths to variables. For example:

--TESTCD and IDVAR values will never be longer than 8 characters, so the lengths of those variables can be set to 8.

The length for variables that use controlled terminology can be set to the length of the longest term.
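
The two examples above can be sketched in code. This is a minimal illustration, not a TIG-defined procedure: the function name `suggest_lengths`, the rows-as-dictionaries layout, and the `codelists` argument are all assumptions made for the example.

```python
def suggest_lengths(rows, codelists=None):
    """Suggest a length (in bytes) for each character variable.

    rows: list of dicts mapping variable name -> string value (assumed layout).
    codelists: optional dict mapping variable name -> list of permitted terms.
    """
    codelists = codelists or {}
    lengths = {}
    # Start from the longest value actually observed for each variable.
    for row in rows:
        for name, value in row.items():
            observed = len((value or "").encode("ascii"))
            lengths[name] = max(lengths.get(name, 1), observed)
    for name in lengths:
        if name.endswith("TESTCD") or name == "IDVAR":
            # Values are never longer than 8 characters, so 8 is always enough.
            lengths[name] = 8
        elif codelists.get(name):
            # Controlled terminology: use the longest term in the codelist.
            lengths[name] = max(len(term.encode("ascii")) for term in codelists[name])
    return lengths


rows = [
    {"LBTESTCD": "HGB", "LBSTRESU": "g/dL", "LBORRES": "13.5"},
    {"LBTESTCD": "PLAT", "LBSTRESU": "10^9/L", "LBORRES": "250"},
]
print(suggest_lengths(rows, codelists={"LBSTRESU": ["g/dL", "10^9/L", "mmol/L"]}))
```

Variables without a fixed rule or a codelist fall back to the longest observed value, which is one reasonable reading of "reasonable, appropriate lengths" rather than a requirement.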


To ensure split datasets can be appended back into 1 domain dataset:

The value of DOMAIN must be consistent across the separate datasets as it would have been if they had not been split (e.g., LB, FA).
All variables that require a domain prefix (e.g., --TESTCD, --LOC) must use the value of DOMAIN as the prefix value (e.g., LB, FA).
--SEQ must be unique within USUBJID for all records across all the split datasets. If there are 1000 records for a USUBJID across the separate datasets, all 1000 records need unique values for --SEQ.
When relationship datasets (e.g., SUPPxx, FAxx, CO, RELREC) relate back to split parent domains, the value of IDVAR will be from a variable with unique values for each observation.

Permissible variables included in one split dataset need not be included in all split datasets.
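
Two of the conditions above, a consistent DOMAIN value and --SEQ values that are unique within USUBJID across all split datasets, lend themselves to a mechanical check. The sketch below assumes each split dataset is available as a list of record dictionaries; the function name `check_split_datasets` and that layout are hypothetical. It does not check the domain-prefix or IDVAR conditions, and it does not compare variable sets, since permissible variables may differ between split datasets.

```python
from collections import Counter


def check_split_datasets(domain, datasets):
    """Return a list of problems that would prevent appending split datasets.

    domain: the 2-character domain code, e.g. "LB".
    datasets: dict mapping split dataset name -> list of record dicts.
    """
    problems = []
    seq_var = domain + "SEQ"
    seen = Counter()
    for name, records in datasets.items():
        for rec in records:
            # DOMAIN must be the same value it would have had without the split.
            if rec.get("DOMAIN") != domain:
                problems.append(
                    f"{name}: DOMAIN is {rec.get('DOMAIN')!r}, expected {domain!r}"
                )
            seen[(rec.get("USUBJID"), rec.get(seq_var))] += 1
    # --SEQ must be unique within USUBJID across all split datasets combined.
    for (usubjid, seq), count in seen.items():
        if count > 1:
            problems.append(
                f"{seq_var}={seq} occurs {count} times for USUBJID {usubjid}"
            )
    return problems


split = {
    "LBHM": [{"DOMAIN": "LB", "USUBJID": "S1", "LBSEQ": 1}],
    "LBCH": [{"DOMAIN": "LB", "USUBJID": "S1", "LBSEQ": 1}],
}
for problem in check_split_datasets("LB", split):
    print(problem)
```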

For domains with 2-letter domain codes, split dataset names can be up to 4 characters in length. For example, if splitting by --CAT, dataset names would be the domain name plus up to 2 additional characters to indicate the value of --CAT (e.g., LBHM for LB if the value of --CAT is HEMATOLOGY). If splitting Findings About by parent domain, then the dataset name would be the domain code, "FA", plus the 2-character domain code for the parent domain (e.g., "FACM"). The 4-character dataset-name limitation allows the use of a Supplemental Qualifier dataset associated with the split dataset.

Supplemental Qualifier datasets for split domains will also be split. The nomenclature will include the additional 1 to 2 characters used to identify the split dataset (e.g., SUPPLBHM, SUPPFACM). The value of RDOMAIN in the SUPP-- datasets would be the 2-character domain code (e.g., LB, FA).

In RELREC, if a dataset-level relationship is defined for a split Findings About domain, then RDOMAIN will contain the 4-character dataset name, rather than the domain name "FA" (e.g., the value of RDOMAIN will be FACM).
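
The naming rules for split datasets, their Supplemental Qualifier datasets, and the associated RDOMAIN values can be summarized in a small helper. This is only a sketch: the function name `split_names` and the returned dictionary keys are invented for illustration, and the caller is assumed to supply the 1- or 2-character suffix.

```python
def split_names(domain, suffix):
    """Return the names implied by splitting a domain dataset.

    domain: 2-character domain code (e.g. "LB", "FA").
    suffix: 1 or 2 characters identifying the split (e.g. "HM", "CM").
    """
    domain, suffix = domain.upper(), suffix.upper()
    if len(domain) != 2:
        raise ValueError("domain code must be 2 characters")
    if not 1 <= len(suffix) <= 2:
        raise ValueError("suffix must be 1 or 2 characters")
    dataset = domain + suffix  # at most 4 characters
    return {
        "dataset": dataset,
        "supp_dataset": "SUPP" + dataset,  # at most 8 characters
        "supp_rdomain": domain,  # RDOMAIN in SUPP-- stays the 2-character code
        # The text states the 4-character RELREC RDOMAIN only for split FA
        # datasets, so other domains are left undetermined here.
        "relrec_rdomain": dataset if domain == "FA" else None,
    }


print(split_names("FA", "CM"))
print(split_names("LB", "HM"))
```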

Num

Guidance For

Implementation

1

Dataset content

Data represented in datasets will include the following per regulatory requirements, scientific needs, and standards in this guide:

Data as originally collected or received (using controlled terminology where applicable) to support the submission
Data from external references relevant to the submission (e.g., study protocol)
Data assigned per conventions in the TIG
Data derived per regulatory and TIG conventions

2

Dataset naming

Domain datasets based on the SDTM general observation classes will be named using the 2-character code for the domain or using the applicable 4-character code when a dataset is split (e.g., LB, LBHM).
Supplemental Qualifier datasets will be named using the convention "SUPP" concatenated with the 2-character domain code for the parent domain or the 4-character code for the parent dataset when a dataset is split (e.g., SUPPDM, SUPPFA, SUPPFACM).
All other datasets will be named using the code for the domain or dataset (e.g., DM, RELREC).

3

Variable order

Dataset variables will be ordered per guidance in the SDTM.
Variable order in TIG domain specifications aligns with variable order in the SDTM.

4

Variable names

Variables will be named per guidance in the SDTM. The SDTM guidance uses fragment names in the CDISC Non-Standard Variables Registry.
Variable names in TIG domain specifications align with naming conventions in the SDTM.
Variable names will be 8 characters or fewer and uppercase.

5

Variable labels

Descriptive labels per this guide, up to 40 characters, will be provided as data variable labels for all variables, including Supplemental Qualifier variables.
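
The length and case limits on variable names and labels stated in items 4 and 5 are simple to check automatically. The sketch below covers only those limits; the function name `check_variable` is hypothetical, and it does not verify that a name follows SDTM naming guidance or that a label matches the label given in this guide.

```python
def check_variable(name, label):
    """Return a list of problems with a variable's name and label."""
    problems = []
    # Names: 8 characters or fewer, uppercase.
    if not 1 <= len(name) <= 8:
        problems.append(f"{name}: name must be 1 to 8 characters")
    if name != name.upper():
        problems.append(f"{name}: name must be uppercase")
    # Labels: required for all variables, up to 40 characters.
    if not label:
        problems.append(f"{name}: label is missing")
    elif len(label) > 40:
        problems.append(f"{name}: label is longer than 40 characters")
    return problems


print(check_variable("USUBJID", "Unique Subject Identifier"))
print(check_variable("lbtestcode1", ""))
```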

6

Variable length

When variable length is referenced in the TIG, this refers to the length in bytes of ASCII character strings.

The maximum length of character variables is 200 characters, and the full 200 characters should not be used unless necessary.
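
Because length is defined in bytes of ASCII text, a value's length can be measured by encoding it as ASCII. The sketch below is illustrative only; the function names `ascii_length` and `exceeds_maximum` are invented here, and treating a non-ASCII character as an error is an assumption rather than a stated TIG rule.

```python
MAX_CHAR_LENGTH = 200  # maximum length of character variables


def ascii_length(value):
    """Return the length in bytes of an ASCII string.

    Raises ValueError if the value contains a non-ASCII character.
    """
    try:
        return len(value.encode("ascii"))
    except UnicodeEncodeError as exc:
        raise ValueError(f"non-ASCII character in {value!r}") from exc


def exceeds_maximum(value, maximum=MAX_CHAR_LENGTH):
    """Return True if the value is longer than the maximum length."""
    return ascii_length(value) > maximum


print(ascii_length("NEGATIVE"))
print(exceeds_maximum("A" * 201))
```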

7

Variable value text case

Values from controlled terminology or response values for QRS instruments specified by the instrument documentation will be in the case specified by those sources.
Otherwise, text data will be represented in upper case (e.g., NEGATIVE).
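
One way to apply the text-case rule is to look each value up in the terms supplied by the source (controlled terminology or instrument documentation) and to upper-case everything else. In this sketch the function name `normalize_case` is hypothetical, and matching source terms case-insensitively is an assumption about how collected values would be mapped.

```python
def normalize_case(value, source_terms=()):
    """Return the value in the case required for submission.

    source_terms: terms whose case is fixed by controlled terminology or
    by QRS instrument documentation.
    """
    by_folded = {term.casefold(): term for term in source_terms}
    # Use the source's own casing when the value is a known term;
    # otherwise represent the text in upper case.
    return by_folded.get(value.casefold(), value.upper())


print(normalize_case("negative"))
print(normalize_case("not at all", source_terms=["Not at all"]))
```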

8

Missing variable values

Missing values for individual data items will be represented by nulls.

9

Splitting datasets

A domain dataset may be split into physically separate datasets to support submission when needed and as allowable by the regulatory authority. The following conventions must be adhered to when splitting domains into separate datasets:

A domain based on a General Observation Class may be split according to values in variable --CAT. When a domain is split on --CAT, --CAT must not be null.
The Findings About Events or Interventions (FA) domain may be split according to the domain in which the interventions or events in --OBJ are represented (or would be represented).
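
Splitting on --CAT can be sketched as a grouping step that rejects null categories. The function name `split_by_cat` and the `suffix_for_cat` mapping are hypothetical; the "HM" suffix for HEMATOLOGY follows the LBHM example in this guide, while "CH" for CHEMISTRY is made up for the illustration.

```python
from collections import defaultdict


def split_by_cat(domain, records, suffix_for_cat):
    """Group a domain's records into split datasets keyed by dataset name.

    suffix_for_cat: dict mapping each --CAT value to a 1- or 2-character
    suffix chosen by the applicant (a KeyError signals an unmapped category).
    """
    cat_var = domain + "CAT"
    split = defaultdict(list)
    for rec in records:
        category = rec.get(cat_var)
        if not category:
            # --CAT must not be null when a domain is split on --CAT.
            raise ValueError(f"{cat_var} is null; cannot split on it")
        split[domain + suffix_for_cat[category]].append(rec)
    return dict(split)


records = [
    {"DOMAIN": "LB", "USUBJID": "S1", "LBSEQ": 1, "LBCAT": "HEMATOLOGY"},
    {"DOMAIN": "LB", "USUBJID": "S1", "LBSEQ": 2, "LBCAT": "CHEMISTRY"},
]
suffixes = {"HEMATOLOGY": "HM", "CHEMISTRY": "CH"}  # "CH" is illustrative only
print(sorted(split_by_cat("LB", records, suffixes)))
```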
