...

Status

title	DRAFT

...

Sam Hume

...

Jozef Aerts Sam Hume

...

Introduction

JSON representations for exchange standards are widely used in today’s architectures. In RESTful web services, JSON is often the preferred format for the service response, due to its compactness and ease of use in mobile applications. Other standards used in healthcare, such as HL7-FHIR, support JSON as well as XML, together with other formats such as RDF.

JSON and XML are, however, not 1:1 interoperable, as they are based on different principles. For example, JSON does not have a native mechanism for namespaces (as it wants to remain "lightweight"). Also, JSON does not have an equivalent for XML "text content". In JSON, "text content" is treated in the same way as the "attribute pairs" of XML.

In order to make XML and JSON representations of the same standard interconvertible without any information loss, one needs to develop a set of principles, usually named "conversion conventions". A number of them have been developed and are publicly available. See, e.g., http://wiki.open311.org/JSON_and_XML_Conversion/. Starting from ODM version 2 (ODMv2), a JSON representation for ODM is available.

This document explains the principles of the JSON representation, and the conventions used. These are based on the "Flickr conventions" for JSON (https://www.flickr.com/services/api/response.json.html).

Note: The Data Exchange Standards team is still considering the development of a native JSON schema so that JSON files can be checked by a validating JSON parser to test that the file is syntactically valid.

Main principles

For the JSON implementation of ODM, the following main principles apply:

Like XML, JSON is case-sensitive.
JSON is based on sets of name-value pairs. Names and string values are embedded in double quotes, and name and value are separated by a colon. Example: "OID": "MyStudy"
XML elements are represented as JSON objects.
A JSON object is an unordered set of name-value pairs. An object begins with a left brace ("{") and ends with a right brace ("}"), preceded by the object name (in double quotes). Each name is followed by a colon, and the name-value pairs are separated by commas.
Arrays of objects or values are embedded in square brackets. For example, ["a","b","c"] represents an array of the three values "a", "b", and "c".

Protocol element in JSON example:

Code Block

language	js

"Protocol": {
    "StudyEventRef": [{
        "Mandatory": "Yes",
        "OrderNumber": 1,
        "StudyEventOID": "BASELINE"
    }]
}

This example shows the JSON serialization of the XML element "Protocol" with an array of child "StudyEventRef" elements. Each StudyEventRef element is a JSON object (shown in curly brackets) with values for the Mandatory, OrderNumber, and StudyEventOID attributes. Attributes are represented as name-value pairs, such as "Mandatory": "Yes". StudyEventRef is defined as an array (shown in square brackets), so as many StudyEventRef elements as needed can be included.

Very complex or large JSON or XML files may consist of a single line. Tools for reviewing or hand-authoring JSON or XML may use line breaks and indentation so that the text is easier to read. Indentation and line breaks outside of quoted value strings are arbitrary and, just as in XML, have no meaning. Line breaks within value strings (content surrounded by double quotes) do have meaning: literal line breaks are not allowed there and must be represented with \n.
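
The points above can be illustrated with a minimal sketch in Python, using only the standard-library json module. The variable names are illustrative assumptions and not part of ODM; the Protocol fragment is wrapped in braces so that it forms a complete JSON document.

```python
import json

# The Protocol fragment, indented for readability.
indented = """
{
    "Protocol": {
        "StudyEventRef": [{
            "Mandatory": "Yes",
            "OrderNumber": 1,
            "StudyEventOID": "BASELINE"
        }]
    }
}
"""

# The same content on a single line: whitespace outside strings has no meaning.
single_line = (
    '{"Protocol":{"StudyEventRef":[{"Mandatory":"Yes",'
    '"OrderNumber":1,"StudyEventOID":"BASELINE"}]}}'
)

doc_a = json.loads(indented)
doc_b = json.loads(single_line)
same_content = doc_a == doc_b

# StudyEventRef is an array, so each child element is one object in the list.
refs = doc_a["Protocol"]["StudyEventRef"]
first_oid = refs[0]["StudyEventOID"]

# A line break inside a value string is serialized as the escape sequence \n.
serialized = json.dumps({"_content": "line 1\nline 2"})

# A literal (unescaped) line break inside a string is rejected by the parser.
try:
    json.loads('{"_content": "line 1\nline 2"}')
    raw_newline_accepted = True
except json.JSONDecodeError:
    raw_newline_accepted = False
```

Both forms parse to the same data structure, while the literal line break inside the quoted value makes the last document invalid.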

XML text content is treated as a name-value pair with the name being "_content".
Example:

Code Block

language	js

"ReasonForChange": {"_content": "Investigator typing error"}

and combined with the parent element "AuditRecord":

Code Block

language	js

"AuditRecord": {
    "UserRef": {"UserOID": "USR.001"},
    "LocationRef": {"LocationOID": "LOC.007"},
    "DateTimeStamp": {"_content": "2022-07-03"},
    "ReasonForChange": {"_content": "Investigator typing error"}
}
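
As a rough sketch of how the "_content" convention could be applied when converting XML to JSON, the Python snippet below uses the standard-library xml.etree.ElementTree module. The function name element_to_dict and the XML source are assumptions written for this illustration; they are not part of the ODM specification, and repeating child elements (arrays) are deliberately not handled.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(elem):
    """Convert an element to a dict (hypothetical helper).

    Attributes become name-value pairs, text content is stored under
    "_content", and child elements are converted recursively.
    Simplification: each child is stored as a single object, so repeating
    children would overwrite each other.
    """
    result = dict(elem.attrib)
    for child in elem:
        result[child.tag] = element_to_dict(child)
    text = (elem.text or "").strip()
    if text:
        result["_content"] = text
    return result


# Illustrative XML source (an assumption, not taken from a real study).
xml_source = """
<AuditRecord>
    <UserRef UserOID="USR.001"/>
    <LocationRef LocationOID="LOC.007"/>
    <DateTimeStamp>2022-07-03</DateTimeStamp>
    <ReasonForChange>Investigator typing error</ReasonForChange>
</AuditRecord>
"""

root = ET.fromstring(xml_source.strip())
converted = {root.tag: element_to_dict(root)}
as_json = json.dumps(converted, indent=4)
```

Elements that carry only attributes (UserRef, LocationRef) end up without a "_content" entry, while elements with text content get exactly one.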

Namespaces are ignored.
For ODM, this essentially means that the attribute "xml:lang" translates into "lang".
Example representing the ODM-XML Description element with child element TranslatedText, having the xml:lang attribute with the value "en" and the text content "Unique identifier for a study.":

Code Block

language	js

"Description": {
    "TranslatedText": [{
        "lang": "en",
        "_content": "Unique identifier for a study."
    }]
}
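
A minimal sketch of the namespace rule, assuming Python's standard-library xml.etree.ElementTree is used to read the XML: ElementTree reports namespaced names in "{uri}local" notation, so dropping the namespace amounts to keeping only the part after the closing brace. The helper name local_name and the XML snippet are illustrative assumptions.

```python
import xml.etree.ElementTree as ET


def local_name(qualified_name):
    """Drop a namespace given in ElementTree's "{uri}local" notation."""
    return qualified_name.rsplit("}", 1)[-1]


# Illustrative XML source; the xml prefix is predefined and needs no declaration.
xml_source = (
    "<Description>"
    '<TranslatedText xml:lang="en">Unique identifier for a study.</TranslatedText>'
    "</Description>"
)

description = ET.fromstring(xml_source)

body = {}
for child in description:
    # Attribute names lose their namespace: xml:lang becomes lang.
    entry = {local_name(name): value for name, value in child.attrib.items()}
    entry["_content"] = child.text
    # Children are collected in an array, as in the TranslatedText example.
    body.setdefault(local_name(child.tag), []).append(entry)

converted = {local_name(description.tag): body}
```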

...

As usual in JSON, the root element is not explicitly named.
Example representing the ODM element with attributes CreationDateTime, Description, FileOID, FileType, Granularity, ODMVersion, and Originator:

Code Block

language	js

{
    "CreationDateTime": "2011-10-24T10:05:00",
    "Description": "JSON test",
    "FileOID": "JSON_Test_2020",
    "FileType": "Snapshot",
    "Granularity": "Metadata",
    "ODMVersion": "2.0",
    "Originator": "MySystem",
    ...
    ...
}

Dataset-JSON

Dataset-JSON is based on the Dataset-XML specification, but represents a different approach from the one described above. It utilizes JSON format specifics to efficiently store data. Each Dataset-JSON file is connected with a Define-XML file, containing detailed information about the metadata. One aim of Dataset-JSON is to address as many of the relevant requirements in the PHUSE 2017 Transport for the Next Generation paper as possible, including the efficient use of storage space.

At the top level of the Dataset-JSON object, there are two optional attributes, clinicalData and referenceData, corresponding to Dataset-XML elements.

Code Block

language	js

{
    "clinicalData": { ... },
    "referenceData": { ... }
}

Each of these attributes contains study and metadata OIDs as well as an object describing one or more item groups (datasets). Values of the studyOID and metaDataVersionOID must match corresponding values in the Define-XML file.

Code Block

language	js

{
    "clinicalData": {
        "studyOID": "xxx",
        "metaDataVersionOID": "xxx",
        "itemGroupData": { ... }
    }
}

itemGroupData is an object with attributes corresponding to individual datasets. The attribute name is the OID of the described dataset, which must be the same as the OID of the corresponding itemGroup in the Define-XML file.

Code Block

language	js

"itemGroupData": { 
    "IG.DM": { ... }
}
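
The structure described so far can be walked with a few lines of Python. This is only a sketch: the function name dataset_oids is hypothetical, and define_oids stands in for the set of item group OIDs that would have to be read from the Define-XML file by other means.

```python
import json


def dataset_oids(dataset_json):
    """Return (section, studyOID, metaDataVersionOID, dataset OID) tuples."""
    found = []
    for section in ("clinicalData", "referenceData"):
        data = dataset_json.get(section)
        if data is None:  # both top-level attributes are optional
            continue
        for oid in data.get("itemGroupData", {}):
            found.append(
                (section, data["studyOID"], data["metaDataVersionOID"], oid)
            )
    return found


example = json.loads("""
{
    "clinicalData": {
        "studyOID": "xxx",
        "metaDataVersionOID": "xxx",
        "itemGroupData": {"IG.DM": {}}
    }
}
""")

listing = dataset_oids(example)

# Assumed to come from the Define-XML file; hard-coded here for illustration.
define_oids = {"IG.DM"}
unknown = [entry for entry in listing if entry[3] not in define_oids]
```

An empty unknown list means every dataset OID in the file has a counterpart in the assumed Define-XML set.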

The dataset description contains basic information about the dataset itself and its items.

records - the total number of records in a dataset
name - dataset name
label - dataset description
items - basic information about variables
itemData - dataset data

Code Block

language	js

"IG.DM": {
    "records": 100,
    "name": "DM",
    "label": "Demographics",
    "items": [ ... ],
    "itemData": [ ... ]
}

items is an array of basic information about dataset variables. The order of elements in the array must be the same as the order of variables in the described dataset.

OID - OID of a variable (must correspond to the variable OID in the Define-XML file)
name - variable name
label - variable description
type - type of the variable. One of 'string', 'integer', 'float', 'double', 'boolean'. See ODM types for details.

Code Block

language	js

"items": [
    {
        "OID": 100,
        "name": "DM",
        "label": "Demographics",
        "type": "float"
    },
    ...
]

itemData is an array of records with variable values. Each record itself is also represented as an array of variable values.

Code Block

language	js

"itemData": [
   ["MyStudy", "001", "DM", 56],
   ["MyStudy", "002", "DM", 26]
]
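
Because the order of items matches the order of values in each record, a consumer can pair them up by position. The following Python sketch shows one way to do this; the function name rows_to_records and the item OIDs, names, and labels (STUDYID, USUBJID, DOMAIN, AGE) are assumptions made up for this illustration.

```python
def rows_to_records(dataset):
    """Turn the positional itemData rows into dicts keyed by variable name."""
    names = [item["name"] for item in dataset["items"]]
    records = []
    for row in dataset["itemData"]:
        if len(row) != len(names):
            raise ValueError("record length does not match number of items")
        records.append(dict(zip(names, row)))
    if dataset["records"] != len(records):
        raise ValueError("'records' does not match the number of rows")
    return records


# Hypothetical dataset description; the item definitions are illustrative only.
dm = {
    "records": 2,
    "name": "DM",
    "label": "Demographics",
    "items": [
        {"OID": "IT.STUDYID", "name": "STUDYID", "label": "Study Identifier", "type": "string"},
        {"OID": "IT.USUBJID", "name": "USUBJID", "label": "Subject Identifier", "type": "string"},
        {"OID": "IT.DOMAIN", "name": "DOMAIN", "label": "Domain", "type": "string"},
        {"OID": "IT.AGE", "name": "AGE", "label": "Age", "type": "integer"},
    ],
    "itemData": [
        ["MyStudy", "001", "DM", 56],
        ["MyStudy", "002", "DM", 26],
    ],
}

records = rows_to_records(dm)
```

Storing the variable names once in items, rather than repeating them in every record, is what keeps the itemData array compact.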

Missing values are represented by empty elements of an array: ["MyStudy", , "DM",]

The full example of a Dataset-JSON file:

...

language	js

...
