Purpose

Large trials and studies generate many analysis results in the form of tables, figures, and written reports. Historically, a typical workflow for producing analysis results has involved the end user generating the display in a static format such as RTF or PDF from the Analysis Data Model (ADaM) dataset (Figure 1). The Analysis Results Metadata (ARM) for Define-XML (add reference) is then created retrospectively to provide high-level documentation about metadata relating to the analysis displays and results. However, there is no formal model or structure to describe analysis results and associated metadata, leaving a gap in standardization. The current process is expensive and time-consuming, and it lacks automation and traceability, which leads to unnecessary variation in analysis results reporting.

historical process

Figure 1: Example of Current Workflow

The goal for the future state of analysis results is that they are machine-readable, easily navigable, and highly reusable. Our aim was to create a logical model that fully described analysis results and associated metadata to support the following objectives:

Automated generation of machine-readable results data
Improved navigation and reusability of analysis and results data
Support for the storage, access, processing, and reproducibility of results data
Traceability to the study protocol, statistical analysis plan (SAP), and to the input ADaM dataset

The Analysis Results Standard (ARS) Model has several possible implementations, including leveraging analysis results metadata to aid automation and representing analysis results as data in a dataset structure. An ARS technical specification could be used to support automation, traceability, and the creation of data displays. An analysis results dataset could support reuse and reproducibility of results data. The following is an example of how the ARS Model could be used in a modernized workflow that shifts the focus from retrospective reporting to prospective planning (Figure 2).

Future Process

First, an end user could use the ARS Model to guide the generation of a technical specification before generating a display, rather than after the display has been created. This approach would allow for better planning and standardization of the analysis process, resulting in more consistent and traceable reporting. The technical specification could include metadata about the statistical methods, data sources, and displays to be generated. Once the technical specification has been developed, the end user can use it to generate an analysis results dataset, which contains the results data needed to generate the display. The analysis results dataset could be designed to support reuse and reproducibility of the results data, enabling more efficient and effective analysis reporting.
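The specification-first step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `AnalysisSpec` class, its field names, the `run_analysis` helper, and the toy records are all hypothetical and are not taken from the ARS Model.

```python
from dataclasses import dataclass
from statistics import mean

# Hypothetical specification entry; the field names are illustrative
# and are not taken from the ARS Model.
@dataclass(frozen=True)
class AnalysisSpec:
    analysis_id: str
    dataset: str       # name of the input ADaM dataset
    variable: str      # analysis variable to summarize
    group_by: str      # grouping variable
    operations: tuple  # names of the summary operations to compute

# Summary operations that a specification may refer to by name.
OPERATIONS = {"n": len, "mean": mean, "min": min, "max": max}

def run_analysis(spec, records):
    """Return one result row per (group, operation), tagged with the spec ID."""
    groups = {}
    for rec in records:
        groups.setdefault(rec[spec.group_by], []).append(rec[spec.variable])
    results = []
    for group in sorted(groups):
        for op in spec.operations:
            results.append({
                "analysis_id": spec.analysis_id,
                "dataset": spec.dataset,
                "group": group,
                "operation": op,
                "value": OPERATIONS[op](groups[group]),
            })
    return results

# Toy records standing in for an ADaM dataset (made-up values).
adsl = [
    {"TRT01A": "Placebo", "AGE": 60},
    {"TRT01A": "Placebo", "AGE": 70},
    {"TRT01A": "Active", "AGE": 50},
]

# The specification is written first; the results are derived from it.
spec = AnalysisSpec("AN01", "ADSL", "AGE", "TRT01A", ("n", "mean"))
results = run_analysis(spec, adsl)
for row in results:
    print(row)
```

Because every result row carries the identifier of the specification entry and the name of the input dataset, each value can be traced back to the planned analysis that produced it.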

Finally, the machine-readable analysis results dataset serves as the ‘single source of truth’ capturing the analysis results metadata and results data in a standardized format. This dataset can then be used to generate displays for multiple reporting purposes, such as traditional analysis reporting for the clinical study report (CSR), in-text tables for the CSR, safety reporting, meta-analyses, dynamic applications, ClinicalTrials.gov, publications, and presentations. This streamlined approach ensures consistency and accuracy in the generation of displays across various deliverables, making it more efficient and reliable for reporting and communication of analysis results.
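As a rough illustration of the 'single source of truth' idea, the sketch below renders the same set of result rows in two ways: a fixed-width text table and a JSON export. The row layout, the values, and the function names are hypothetical and chosen only for this example.

```python
import json

# Hypothetical result rows; the layout and values are made up for illustration.
results = [
    {"analysis_id": "AN01", "group": "Active", "operation": "n", "value": 1},
    {"analysis_id": "AN01", "group": "Active", "operation": "mean", "value": 50},
    {"analysis_id": "AN01", "group": "Placebo", "operation": "n", "value": 2},
    {"analysis_id": "AN01", "group": "Placebo", "operation": "mean", "value": 65},
]

def to_text_table(rows):
    """Pivot rows into a table: one line per operation, one column per group."""
    groups = sorted({r["group"] for r in rows})
    operations = []
    for r in rows:
        if r["operation"] not in operations:
            operations.append(r["operation"])
    cell = {(r["operation"], r["group"]): r["value"] for r in rows}
    header = f"{'Statistic':<12}" + "".join(f"{g:>10}" for g in groups)
    lines = [header]
    for op in operations:
        values = "".join(f"{cell.get((op, g), ''):>10}" for g in groups)
        lines.append(f"{op:<12}" + values)
    return "\n".join(lines)

def to_json(rows):
    """Serialize the same rows for machine consumption."""
    return json.dumps(rows, indent=2)

table = to_text_table(results)
print(table)
print(to_json(results))
```

Both outputs are derived from the same rows, so a change to the results dataset flows into every display without retyping any value.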

Overall, this new workflow would enable end-users to generate analysis results metadata prospectively, with greater standardization, consistency, and traceability of analysis results reporting, enabling better decision-making and regulatory submissions. Shifting the focus from retrospective reporting to prospective planning would help to address many of the current limitations of analysis results reporting and support the development of more efficient and effective analysis standards.

Figure 2: Workflow with Future Extensions and Use Cases

Use of LinkML for the development of the ARS Logical Model

Many of the same metadata components are needed both to create a prospective technical specification of analyses to be performed and to give context to reported results. Therefore, a single logical model is being developed that comprises the components needed to specify analyses, to represent contextualized results, and to indicate how the results are displayed. The logical model is being developed using LinkML.

LinkML is an open-source schema development language and framework for generating machine-readable models [2]. The LinkML Generator framework generates downstream artifacts, including JSON Schema, ShEx, RDF, OWL, GraphQL, and SQL DDL.
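The idea of deriving a downstream artifact from a single schema can be illustrated without the LinkML toolchain. The sketch below is not the LinkML generator. It holds a small schema in a Python dictionary whose layout loosely mirrors a LinkML YAML schema (classes with attributes, each with a `range` and an optional `required` flag) and hand-derives a JSON Schema fragment from it. The class name `ResultRow`, its attributes, and the `to_json_schema` helper are hypothetical.

```python
import json

# Hypothetical schema; the layout loosely mirrors a LinkML YAML schema,
# but the class and attribute names are invented for illustration.
schema = {
    "classes": {
        "ResultRow": {
            "attributes": {
                "analysis_id": {"range": "string", "required": True},
                "operation": {"range": "string", "required": True},
                "value": {"range": "float"},
            }
        }
    }
}

# Mapping from the range names used above to JSON Schema types.
JSON_TYPES = {"string": "string", "integer": "integer", "float": "number"}

def to_json_schema(schema, class_name):
    """Hand-derive a JSON Schema object definition for one class."""
    attributes = schema["classes"][class_name]["attributes"]
    return {
        "title": class_name,
        "type": "object",
        "properties": {
            name: {"type": JSON_TYPES[spec["range"]]}
            for name, spec in attributes.items()
        },
        "required": [
            name for name, spec in attributes.items() if spec.get("required")
        ],
    }

json_schema = to_json_schema(schema, "ResultRow")
print(json.dumps(json_schema, indent=2))
```

In practice the LinkML generators would be used for this step; the point here is only that one schema definition can drive several derived artifacts.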

The logical model acts as a blueprint: a set of rules that describes how things should be organized and structured. In our case, it describes how analysis results data should be organized and structured. LinkML has allowed the creation of a standardized and consistent approach to describing analysis results data, enabling greater interoperability and collaboration across different stakeholders. It has the added benefit that the model can be converted into various machine-readable formats such as JSON, YAML, OWL, or XML.

Furthermore, LinkML is a flexible and extensible tool, meaning that the model can be easily modified as needed to incorporate new data elements or requirements. By creating a machine-readable model, analysis results data can be more easily understood, shared, and reused by both humans and machines. LinkML will also support the development of validation rules that can be used to ensure the integrity and quality of the data, which is essential in our highly regulated pharmaceutical industry.
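To make the idea of validation rules concrete, here is a minimal hand-rolled sketch in plain Python. It does not use the LinkML library, and the rule table, attribute names, and `validate` function are hypothetical. It checks only three things: that required attributes are present, that values have the expected type, and that no unknown attributes appear.

```python
# Hypothetical rules: attribute name -> (expected Python type(s), required?)
RULES = {
    "analysis_id": (str, True),
    "operation": (str, True),
    "value": ((int, float), False),
}

def validate(row, rules=RULES):
    """Return a list of problems; an empty list means the row passed."""
    problems = []
    for name, (expected, required) in rules.items():
        if name not in row or row[name] is None:
            if required:
                problems.append(f"missing required attribute '{name}'")
            continue
        if not isinstance(row[name], expected):
            actual = type(row[name]).__name__
            problems.append(f"attribute '{name}' has unexpected type {actual}")
    for name in row:
        if name not in rules:
            problems.append(f"unknown attribute '{name}'")
    return problems

good = {"analysis_id": "AN01", "operation": "mean", "value": 65.0}
bad = {"operation": "mean", "value": "sixty-five", "note": "draft"}

print(validate(good))
for problem in validate(bad):
    print(problem)
```

Returning a list of problems instead of raising on the first failure lets a reviewer see every issue with a record in one pass, which tends to suit batch checks of results data.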
