This example shows findings from an assessment of genes of interest in an oncology study, performed to determine the variation in short sequences of nucleotides in those genes. Short variations are generally defined as insertions or deletions of fifty base pairs or fewer when compared to a reference sequence. In this example, an insertion was found for the BAP1 gene and a deletion was found for the CYP2D6 gene. The identifier for the genome reference used to generate the reported result is shown in GFGENREF. The designation (name or number) of the chromosome or contig on which the variant appears is shown in GFCHROM. The published symbol for the gene of interest is shown in GFSYM. The type of genomic entity represented by the published symbol in GFSYM is shown in GFSYMTYP as GENE WITH PROTEIN PRODUCT. The location within a sequence for the observed value in GFORRES is shown in GFGENLOC. The filename and/or path to external data that are not stored in the same format, and possibly not in the same location, as the other data for a study is shown in GFFXN.
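The fifty-base-pair definition of a short variation given above can be sketched in code. This is a minimal illustration only, not part of the example dataset: the function name `classify_indel`, the constant `SHORT_VARIATION_MAX_BP`, and the allele strings used to call it are hypothetical, and the sketch assumes the variant is available as a reference allele and an alternate allele.

```python
from typing import Tuple

# Threshold taken from the text: fifty base pairs or fewer.
SHORT_VARIATION_MAX_BP = 50


def classify_indel(ref: str, alt: str) -> Tuple[str, int, bool]:
    """Return (kind, length_bp, is_short) for a reference/alternate allele pair.

    Hypothetical helper: compares allele lengths only and does not
    inspect the sequence content.
    """
    delta = len(alt) - len(ref)
    if delta > 0:
        kind = "INSERTION"
    elif delta < 0:
        kind = "DELETION"
    else:
        kind = "NO LENGTH CHANGE"
    length_bp = abs(delta)
    # Only insertions and deletions within the threshold count as short variations.
    is_short = kind != "NO LENGTH CHANGE" and length_bp <= SHORT_VARIATION_MAX_BP
    return kind, length_bp, is_short
```

Under these assumptions, an alternate allele that is longer than the reference by up to fifty bases is classified as a short insertion, and one that is shorter by up to fifty bases as a short deletion.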
Feedback:
Variants can be classified based on different criteria (e.g., ClinVar, CiViC), and different vendors may base their classification on different criteria. Is it possible to include this in the example modeling? Or include some language in the leading text on how best to capture this?
ClinVar is a clinical variant database.
Questions:
Are we talking about an identifier for the variant itself, as recorded in a public database → GFPVRID?
Are we talking about the criteria behind the reportable VARIANT IMPACT CLASSIFICATION, i.e., the criteria used to determine the VARIANT IMPACT CLASSIFICATION result (rows 1 and 6)?
Look at the use of ANMETH in the transcription example.
Next steps:
Are these data collected for submission? If so, how are these data used as supporting data (i.e., in a listing) and/or subsequent statistical analyses? Or is this an exploratory and/or hypothetical question?
Rows 1, 6:
Show the variant impact classification, a categorization of the phenotypic or clinical significance of the genomic variation, of the BAP1 and CYP2D6 genes respectively.
Rows 2, 7:
Show the predicted amino acid change based on sequence analysis of the genomic variation in the BAP1 and CYP2D6 genes respectively.
Rows 3, 8:
Show the predicted coding sequence change based on sequence analysis of the genomic variation in the BAP1 and CYP2D6 genes respectively.
Rows 4, 9:
Show the read depth, the total number of times the gene loci were sequenced, for BAP1 and CYP2D6 respectively.
Rows 5, 10:
Show the variant read depth/read depth, the ratio of the variant read depth to the total read depth at the BAP1 and CYP2D6 gene loci respectively, expressed as percentages.
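The percentage reported in rows 5 and 10 is a simple ratio of two depths. The sketch below shows one way it could be derived; the function name `variant_read_fraction_pct` and its parameter names are hypothetical, and any input values are placeholders rather than values taken from this example.

```python
def variant_read_fraction_pct(variant_read_depth: int, read_depth: int) -> float:
    """Return variant read depth as a percentage of total read depth.

    Hypothetical helper: assumes both depths are read counts at the same locus.
    """
    if read_depth <= 0:
        raise ValueError("read_depth must be a positive count")
    if not 0 <= variant_read_depth <= read_depth:
        raise ValueError("variant_read_depth must lie between 0 and read_depth")
    return 100.0 * variant_read_depth / read_depth
```

For instance, a locus with 25 variant reads out of 100 total reads would yield 25.0 under this calculation.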