Express Pharma

Clinical Data Standards in the era of AI, ML and digital transformation

Shrishaila Patil, VP, Statistical Programming, Navitas Data Sciences, a part of Navitas Life Sciences (a TAKE Solutions Enterprise), highlights how Clinical Trial Data Standards have evolved so far, and how they are continuing to evolve in the era of artificial intelligence (AI), machine learning (ML) and digital transformation to meet future demands


The need of the hour is to reach patients faster. We can achieve that by reducing both the drug development timeline and the overall cost of pharma research and development. COVID-19 has shown what is possible: vaccines were developed within a year, which was unheard of before the pandemic, when vaccines typically took many years to develop.

For biometrics teams, this means that we need to be quick and accurate in data collection, processing and analysis. To shorten timelines and improve efficiency, we need to automate many steps in the Clinical Data Life Cycle (planning, data collection, tabulation, statistical analysis, and exchange/sharing of data). To automate, we must have consistent metadata, standards, and technology.

This article highlights how Clinical Trial Data Standards have evolved so far, and how they are continuing to evolve in the era of artificial intelligence (AI), machine learning (ML) and digital transformation to meet future demands.

COVID-19 and the wave of digital transformation

We can all agree that COVID-19 has acted as something of a catalyst, encouraging increased innovation, technology adoption, and a willingness to embrace digital transformation. As a result, we are witnessing a “new normal”, in which a growing number of virtual trials are being successfully designed in place of more conventional clinical trials, both to manage the global pandemic situation and to ensure the continuation of clinical trials for many other, potentially life-saving, drugs. Virtual trials have helped with patient recruitment and retention, real-time access to data, and data quality.

Digital data collection methodologies (mobile technology, wearables, electronic patient-reported outcomes (ePRO), electronic clinical outcome assessment (eCOA) etc.) have been instrumental, acting as game changers to enable robust data capture in the era of COVID-19.

Why Data Standards?

As clinical research becomes increasingly complex, bringing clarity to the data is more important than ever. A true measure of the data is the impact it has.

Science comes to life through data. Data doesn’t mean anything if you have to struggle to understand where it is located, how it is organised, or how to analyse it. One cannot harmonise or combine data without standards.

Standardisation helps in data aggregation, accessibility, interoperability, re-usability and traceability. Ultimately, standardisation helps regulators to focus on their scientific review and make patient-centric decisions.

We need an end-to-end data standardisation and integration strategy that considers all the dimensions of clinical data.

Clinical Data Interchange Standards Consortium (CDISC): There are many standards development organisations. When it comes to Clinical Trial Data Standards, CDISC has contributed significantly over the last two decades.

CDISC has played a significant role in achieving data quality (so that we can trust the data to make credible, scientifically valid decisions) and in gaining efficiency across the clinical trial data life cycle.

Originally formed in 1997 as a volunteer organisation, CDISC has brought together experts in the industry to align on common data structures and data content spanning both non-clinical (animal) and clinical (human) studies.

CDISC standards are widely used across the biopharma industry and have become a requirement for data submissions to many health authorities. Some of the key CDISC standards are:

Foundational Standards – CDISC Foundational Standards are the basis of a complete suite of data standards, enhancing the quality, efficiency, and cost effectiveness of clinical research processes from beginning to end. They include the Protocol Representation Model (PRM), Standard for Exchange of Nonclinical Data (SEND), Clinical Data Acquisition Standards Harmonization (CDASH), Study Data Tabulation Model (SDTM), Analysis Data Model (ADaM) and Questionnaires, Ratings and Scales (QRS).

Data Exchange Standards facilitate the sharing of structured data across different information systems. They include Clinical Trial Registry (CTR)-XML, Operational Data Model (ODM)-XML, Study/Trial Design Model in XML (SDM-XML), Define-XML, Dataset-XML and Resource Description Framework (RDF), which provides executable, machine-readable CDISC standards from the CDISC Library.

CDISC Controlled Terminology (CT) is the set of CDISC-developed or CDISC-adopted standard expressions (values) used with data items within the Foundational Standards and Therapeutic Area User Guides. CDISC Terminology provides context, content, and meaning to clinical research data and provides a consistent semantic layer across all operational contexts, enabling interoperability of the CDISC Standards.
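As a rough illustration of how controlled terminology gives collected values a consistent meaning, the sketch below normalises free-text entries against a codelist. It is a minimal sketch under stated assumptions: the codelist shown is only an illustrative subset, and the raw-to-term mapping and the function name `to_controlled_term` are hypothetical rather than part of any CDISC tooling.

```python
# Illustrative subset of a controlled-terminology codelist. The real,
# versioned codelists are published by CDISC and should be loaded from there.
SEX_CODELIST = {"M", "F", "U", "UNDIFFERENTIATED"}

# Hypothetical mapping from raw collected values to submission values.
RAW_TO_CT = {"male": "M", "female": "F", "unknown": "U"}


def to_controlled_term(raw_value, mapping, codelist):
    """Return the controlled term for a raw value, or None if it cannot be mapped."""
    candidate = raw_value.strip()
    # Accept values that already match the codelist, ignoring case.
    if candidate.upper() in codelist:
        return candidate.upper()
    # Otherwise try the sponsor-defined mapping of known synonyms.
    mapped = mapping.get(candidate.lower())
    return mapped if mapped in codelist else None


raw_values = ["Male", "F", "female ", "n/a"]
coded = [to_controlled_term(v, RAW_TO_CT, SEX_CODELIST) for v in raw_values]
print(coded)  # ['M', 'F', 'F', None]
```

Values that cannot be mapped come back as `None` rather than being guessed, so they can be queried or reviewed instead of silently entering the dataset.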

Therapeutic Area User Guides (TAUGs) extend the Foundational Standards to represent data that pertains to specific disease areas. TAUGs include disease-specific metadata, examples, and guidance on implementing CDISC standards for a variety of uses, including global regulatory submissions. Therapeutic Area (TA) expertise becomes more and more important for clinical data scientists working in the pharma industry as it is crucial for the understanding of patients’ needs and the interpretation of analysed data.

Originally focused on common data domains in clinical trials (e.g. demographic information, adverse events, routine lab results, and subject status), CDISC has grown from PDFs to machine-readable standards, and from a few safety domains to more therapeutic area-specific standards (a great example is the COVID-19 guidelines).

CDISC was designed to build in quality from much earlier in the process. CDISC continues to evolve and is now evaluating a new source of data, referred to as ‘Real World Data’ (RWD), which includes electronic medical records, insurance claims data, and data from wearable devices.

The road ahead

Since the formation of CDISC, significant progress has been made in standardising the format of data collected, analysed and submitted to the health authorities. However, one of the challenges faced was that CDISC foundational standards were built in a two-dimensional model, with no significant CDISC support for the automation of foundational standards in the research enterprise.

One of the strategic goals for CDISC to address in the coming years is to “Develop multidimensional standards in an open, transparent manner that allows community members to transition with as little disruption to their research as possible while unlocking greater benefits of standardisation. Engage in concrete steps to achieve end-to-end standardisation.” Some of the initiatives in place to achieve this are:

  • CDISC 360 (to demonstrate the feasibility of standards-based metadata-driven automation across the end-to-end clinical research data life cycle)
  • Evolve the expression of foundational conformance rules to an electronic format to increase consistency and instantiate multidimensional model artifacts in the CDISC Library
  • Initiate a process to build the model for machines first, people second.
  • Commit to develop only end-to-end TAUGs
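To make the idea of conformance rules in an electronic format more concrete, here is a minimal sketch of rules expressed as data and executed by a small engine. The rule format, the rule ids and the helper names are assumptions made for illustration; they are not actual CDISC conformance rules or the CDISC Library's representation.

```python
# Hypothetical rule format: each rule is plain data, so a machine can read
# and execute it. Rule ids and wording are invented for illustration.
RULES = [
    {"id": "R001", "variable": "USUBJID", "check": "required",
     "message": "USUBJID must be populated"},
    {"id": "R002", "variable": "SEX", "check": "in_codelist",
     "codelist": ["M", "F", "U"], "message": "SEX must be a codelist value"},
]


def check_required(record, rule):
    """Pass when the variable is present and not blank."""
    value = record.get(rule["variable"])
    return value is not None and str(value).strip() != ""


def check_in_codelist(record, rule):
    """Pass when the variable's value is one of the rule's allowed values."""
    return record.get(rule["variable"]) in rule["codelist"]


# Dispatch table linking a rule's declared check type to executable logic.
CHECKS = {"required": check_required, "in_codelist": check_in_codelist}


def validate(records, rules):
    """Run every rule against every record and collect the failures."""
    findings = []
    for row_number, record in enumerate(records, start=1):
        for rule in rules:
            if not CHECKS[rule["check"]](record, rule):
                findings.append((row_number, rule["id"], rule["message"]))
    return findings


records = [
    {"USUBJID": "STUDY1-001", "SEX": "F"},
    {"USUBJID": "", "SEX": "Female"},
]
findings = validate(records, RULES)
for finding in findings:
    print(finding)
```

Because the rules are data rather than prose, the same definitions could in principle drive validation in any tool that understands the format, which is the consistency benefit the initiative describes.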

Another key strategic goal of CDISC is to identify and expand into adjacent research areas that can benefit from data standardisation; that is, as the model evolves, to selectively extend CDISC standards to support new data types and/or new technologies.

Expertise in RWD/Real World Evidence (RWE)

  • Consumer wearables
  • Medical devices
  • Augment/replace patient-reported outcomes data with data from consumer wearables and/or medical devices
  • Device registry, likely via collaboration, that uniquely identifies devices and enables automated mappings to CDISC standards
  • CDISC-compliant registry toolkit that is built on the CDISC Library API
  • ‘Mapping registry’, which standardises the conversion of proprietary device data to CDISC standards
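The ‘mapping registry’ idea can be sketched as a lookup that converts proprietary device fields into standardised records. This is a simplified illustration only: the device model `AcmeBand-1`, its field names and the function `map_reading` are hypothetical, and the target variables merely follow the SDTM Vital Signs naming pattern rather than a complete, conformant domain.

```python
# Hypothetical mapping registry, keyed by (device model, proprietary field).
# Device and field names are invented; target variable names follow the
# SDTM Vital Signs naming pattern in simplified form.
MAPPING_REGISTRY = {
    ("AcmeBand-1", "hr_bpm"): {
        "VSTESTCD": "HR", "VSTEST": "Heart Rate", "VSORRESU": "beats/min"},
    ("AcmeBand-1", "skin_temp_c"): {
        "VSTESTCD": "TEMP", "VSTEST": "Temperature", "VSORRESU": "C"},
}


def map_reading(device_model, subject_id, reading):
    """Convert one proprietary reading into a list of standardised records."""
    records = []
    for field, value in reading.items():
        target = MAPPING_REGISTRY.get((device_model, field))
        if target is None:
            continue  # field is not registered (e.g. timestamp, step count)
        records.append({
            "DOMAIN": "VS",
            "USUBJID": subject_id,
            **target,
            "VSORRES": str(value),
            "VSDTC": reading["timestamp"],
        })
    return records


reading = {"timestamp": "2021-03-01T08:30:00", "hr_bpm": 72, "steps": 4210}
mapped = map_reading("AcmeBand-1", "STUDY1-001", reading)
for record in mapped:
    print(record)
```

In this sketch, unregistered fields such as `steps` are skipped rather than mapped by guesswork; a shared registry would let every sponsor apply the same conversion for a given device.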