Harnessing Foundational Data Management Platforms to Accelerate Drug Development

March 16, 2016 John Poonnen

The accelerated drug development cycles that are emerging in the Pharmaceutical/Clinical Research and Development arena mandate a reduced-latency, streamlined and harmonized collection of clinical, pharmaceutical, and patient reported data for operational decision making. The Food and Drug Administration (FDA) and its sister agencies in other countries are also increasingly insisting on the use of extensive, comprehensive, well-managed data (as per current FDA and country-specific agency guidelines) to determine: 

  1. Whether the drug is safe and effective in its proposed use(s), and whether the benefits of the drug outweigh the risks.

  2. Whether the drug's proposed labeling (package insert) is appropriate, and what it should contain.

  3. Whether the methods used in manufacturing the drug and the controls used to maintain the drug's quality are adequate to preserve the drug's identity, strength, quality, and purity.

The arduous drug approval process usually begins in labs and research institutions. This phase is called the Academic and Laboratory Research Phase, or the Preclinical phase. It is followed by Clinical Trials Phases I through IV, with the FDA's decision to approve the drug or not coming after Phase III (a rejection usually ends further interest in the drug's development).


In this early stage, a promising chemical molecule or biotech device is tested on lab animals such as rodents and primates. The data generated in this stage is typically low in volume and variability, and needs only limited reporting. Standard databases and spreadsheets are sufficient to manage it.

After this stage, the pharmaceutical company decides whether to put the drug into human testing (clinical trials) based on the drug's innovation, medical promise, market feasibility, pharmaceutical competition, societal need for the product, and the company’s financial situation. Until now, these decisions have relied primarily on the experience, intuition, and educated guesses of research leadership and business executives.

Clinical Trials

A successful candidate molecule (prospective drug) goes through a well-established, data-intensive vetting process to prove to the FDA and other country-specific governing bodies that the drug is suitable for release into a diseased subpopulation without safety concerns. A brief overview of the standard clinical trials process is provided below:


Phase 1 Safety Testing: Drug given to a small number of healthy volunteers to test its safety.

  • Data Generated: Relatively small in size
  • Manageability: Standard databases/spreadsheets suffice

Phase 2 Efficacy Testing: Drug administered to 100 or more people with the disease that it was intended to treat.

  • Data Generated: Medium sized
  • Manageability: Standard databases/spreadsheets suffice

Phase 3 Randomized Clinical Trials: Rigorous testing done on larger groups of patients with the targeted disease.

  • Data Generated: Large Volumes, Diverse, Time-bound
  • Manageability: Standard databases/spreadsheets are inadequate. This phase offers the best opportunity for a managed data lake to streamline data management and allow for adaptive trial design

FDA Approval

Upon desirable results from Phase III, a New Drug Application (NDA) must be submitted to the FDA, and to other country-specific agencies if the drug is to be marketed worldwide. The NDA contains data supporting the efficacy and safety of the drug. FDA approval can take anywhere from 2 months to several years, but on average it takes around 18 to 24 months. After the FDA’s approval, the drug can be marketed and distributed. Drugs remain subject to ongoing review to make sure no adverse side effects appear.

Phases 1 through 3 culminate in FDA Review – Approval/Disapproval

Phase 4 Late Phase Monitoring: Monitoring of the drug’s actual use, efficacy, and safety in the broader population is mandatory for 2 years or more, with serious adverse events reported and dealt with to the satisfaction of the FDA.

  • Data Generated: Large Volumes, including social media and public health agency reported data
  • Manageability: Standard databases/spreadsheets are totally inadequate. A managed data lake is needed to derive insights and analyze consumer sentiment towards a new drug

Key success factors for a pharma company:

  • One or more successful products on the market that have patent protection (typically 10 years)

  • A large pipeline of candidate drugs with some in late-stage Phase III status for ongoing sustenance

  • The ability to find and recruit patients for the clinical trials in a timely and competitive manner

  • The ability to manage the vast amounts of data generated in Phases III and IV and remain in compliance with multinational regulations for patient safety and good clinical practice

  • Cash to fund the development of their new drug candidates

Key points

The data used in healthcare and drug development is becoming overwhelming to handle, not only because of its volume but also because of the diversity of data types (image, clinical, lab, sensor data) and the low ingest latency at which it must be managed. Data is involved at every stage of the process. Process streamlining, automated data management, managed data lake infrastructure, and advanced analytics will all need to be brought to bear on this data tsunami. The “data management” table stakes for Pharma R&D are clearly being raised to new levels in the immediate future.

To learn more about managed data lakes and foundational data management platforms for health and life sciences, watch our on-demand webinar on Health Informatics and Managed Data Lakes.


About the Author

John Poonnen

Managing Principal, Nexor Technologies Inc.
