How Can Statisticians and Data Scientists Influence Clinical Trials to be More Representative?
Ning Leng (Roche-Genentech), Dooti Roy (Boehringer Ingelheim), Shemra Rizzo (Roche-Genentech), Godwin Yung (Roche-Genentech)
The authors extend their gratitude to Atasi Poddar (FDA) for her insightful contributions to the roundtable discussion contents and for sharing the list of FDA resources. The contents of this publication reflect the thoughts of the authors and do not represent the official views of, nor the endorsement by, the FDA, Department of Health and Human Services, or US government.
During RISW 2024, we had a lively roundtable discussion on this topic among industry and FDA professionals.
Currently, stringent inclusion/exclusion (I/E) criteria are often applied in first-in-human (FIH) Phase 1 trials to minimize the risk of serious side effects in patients. When an experimental therapy progresses from Phase 1 into later phases, the same set of I/E criteria is often carried over because safety and efficacy information in a wider population is lacking. Efficacy and safety in the excluded populations are often not evaluated until Phase 4 trials. Thus, these patients have limited access to the medical product until years after its first approval.
But is it sufficient to start considering inclusiveness only at the pivotal trial stage? If your Phase 2 trial adopts the narrow I/E criteria from your Phase 1 FIH trial, you may encounter a population shift when transitioning from Phase 2 to Phase 3. Would it be more advantageous to include a broader, more representative population during dose optimization? In the following sections, we explore how statisticians can help guide the broader team in deciding when to consider a more inclusive population, as well as strategies for evidence generation and risk management.
Evidence Generation for Decision-Making
When expanding the trial population beyond the restricted FIH population, three critical questions often arise:
Prevalence: What percentage of eligible patients would be impacted by this expansion?
Efficacy: How would expanding the population affect control arm performance? How might it influence the population treatment effect?
Safety: Does a broader population increase the risk of more severe safety events?
The prevalence question is typically the easiest to address. Data can often be sourced from literature, claims data, real-world data (RWD), or previous trial data. For well-defined criteria, such as age, gender, or lab values, the answers are relatively straightforward. However, more ambiguous criteria, like a history of "moderate" or "severe" disease, may require additional data processing and derivation. This adds complexity, but addressing it is essential for making informed decisions about population expansion.
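For well-defined criteria, the prevalence calculation can be sketched in a few lines of Python. The example below is only an illustration under stated assumptions: the six patient records, the field names (age, egfr, ecog), and the thresholds are hypothetical and are not drawn from any data source discussed here.

```python
# Hypothetical patient-level records; field names and values are illustrative only.
patients = [
    {"age": 72, "egfr": 55, "ecog": 1},
    {"age": 64, "egfr": 80, "ecog": 0},
    {"age": 81, "egfr": 42, "ecog": 2},
    {"age": 58, "egfr": 95, "ecog": 1},
    {"age": 77, "egfr": 61, "ecog": 0},
    {"age": 69, "egfr": 38, "ecog": 1},
]

# Each hypothetical exclusion rule returns True when a patient would be excluded.
exclusion_rules = {
    "age > 75": lambda p: p["age"] > 75,
    "eGFR < 60": lambda p: p["egfr"] < 60,
    "ECOG >= 2": lambda p: p["ecog"] >= 2,
}


def exclusion_summary(records, rules):
    """Return the fraction excluded by each rule and by at least one rule."""
    n = len(records)
    per_rule = {
        name: sum(rule(p) for p in records) / n for name, rule in rules.items()
    }
    any_rule = sum(any(rule(p) for rule in rules.values()) for p in records) / n
    return per_rule, any_rule


per_rule, any_rule = exclusion_summary(patients, exclusion_rules)
for name, fraction in per_rule.items():
    print(f"{name}: {fraction:.0%} of patients excluded")
print(f"Excluded by at least one rule: {any_rule:.0%}")
```

Reporting the overlap ("at least one rule") alongside the per-criterion fractions matters because criteria are often correlated, so relaxing a single criterion may recover fewer patients than its individual prevalence suggests.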
When assessing efficacy in an expanded patient population, two critical questions need to be addressed:
1) Is the patient group expected to have a prognostic effect on both the treatment and control arms? If earlier evidence indicates that the previously excluded patient population experiences poor disease progression regardless of the treatment received, then it is essential to understand how this impacts within-arm performance. Study teams should determine whether these prognostic differences might affect the study's readout timeline and overall outcomes.
2) Do we expect the estimated treatment effect in this previously excluded patient population to differ from that observed in the initially considered population in the FIH trial? If the treatment effect is anticipated to vary, it's necessary to consider the implications for sample size calculations. A different treatment effect may require adjusting the sample size to maintain adequate power and ensure the study can detect meaningful differences.
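The sample size implication of the second question can be illustrated with a simple worked calculation. The sketch below assumes a continuous endpoint analyzed with a two-sided, two-sample z-test, 1:1 allocation, a common standard deviation in both populations, and a population-level effect equal to the prevalence-weighted average of the subgroup effects. All numerical values (effect sizes of 0.50 and 0.30, a 25% share of newly included patients) are hypothetical, and a time-to-event endpoint would call for a different formula.

```python
import math
from statistics import NormalDist


def n_per_arm(delta, sd=1.0, alpha=0.05, power=0.80):
    """Per-arm sample size for a two-sided, two-sample z-test with 1:1 allocation."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) * sd / delta) ** 2)


def blended_effect(delta_original, delta_new, fraction_new):
    """Prevalence-weighted average effect across the two subpopulations."""
    return (1 - fraction_new) * delta_original + fraction_new * delta_new


# Hypothetical inputs, chosen only for illustration.
delta_original = 0.50  # assumed effect in the original (FIH-like) population
delta_new = 0.30       # assumed effect in the previously excluded population
fraction_new = 0.25    # assumed share of newly included patients

delta_blended = blended_effect(delta_original, delta_new, fraction_new)
n_original = n_per_arm(delta_original)
n_blended = n_per_arm(delta_blended)

print(f"Blended effect: {delta_blended:.2f}")
print(f"Per-arm n, original population: {n_original}")
print(f"Per-arm n, expanded population: {n_blended}")
```

If the effect in the newly included patients is assumed equal to the effect in the original population, the blended effect and the required sample size are unchanged under these assumptions.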
The choice of data sources hinges on whether there are existing drugs on the market with a similar mechanism of action (MOA). If such drugs are available, RWD or information from previous clinical trials, particularly Phase 4 studies, can be highly informative. When utilizing RWD, it's crucial to account for potential differences in data collection and derivation between real-world settings and clinical trials. For instance, progression-free survival (PFS) measured in the real world might differ from PFS defined by RECIST criteria in clinical trials. A recent study [1] analyzed data from a nationwide database of electronic health records comprising 61,094 patients with advanced non-small-cell lung cancer. The analyses revealed that many common criteria, including exclusions based on several laboratory values, had a minimal effect on the trial hazard ratios. In some cases, broadening the criteria even improved the trial hazard ratio, indicating that excluded patients can also benefit from treatment.
If the investigational drug has a novel MOA, relying on existing data may not suffice. In this scenario, generating new data becomes necessary. This might involve designing new studies or expanding current trials to include a broader, more representative patient population.
Evaluating safety implications in an expanded population poses additional challenges. Real-world data often contain limited safety information, making them less useful for comprehensive safety assessments. However, if there are other drugs with a similar MOA, legacy clinical trial data can offer valuable insights into potential safety risks and help predict adverse events. In the absence of comparable drugs, new safety data must be generated, possibly through dedicated cohorts or by incorporating sufficient safety monitoring into ongoing trials.
Design and analysis considerations to allow for closer monitoring of populations with limited prior evidence
As previously discussed, there are many situations where existing RWD or trial data are insufficient to address critical efficacy and safety questions. In these cases, it becomes necessary to enroll the new patient population in clinical studies, either as part of Phase 1b or Phase 2 trials or through standalone observational studies.
The FDA's Project Optimus presents a unique opportunity to integrate patient population broadening with dose optimization efforts. It's essential to compare different dose options within the target population to determine the most effective therapeutic strategy. Utilizing adaptive trial designs can provide a dynamic means of monitoring the safety and efficacy of the expanded population at various dosages, allowing for real-time adjustments based on emerging evidence.
Although interpreting results from Phase 1b and Phase 2 trials is generally challenging because of small sample sizes, shrinkage methods, which borrow information across subgroups and pull noisy subgroup estimates toward the overall estimate, may help assess the consistency of the treatment effect across patient populations. FDA has recently applied shrinkage methods to pivotal trials supporting several approved drugs, with the aim of providing patients and clinicians with “the most reliable treatment outcomes” (see the Impact Story [13]).
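To make the idea of shrinkage concrete, the sketch below applies a simple empirical Bayes version of a normal-normal hierarchical model to subgroup estimates. This is not the method used by FDA, which is not described here. It is a simplified stand-in that estimates the between-subgroup variance with a method-of-moments (DerSimonian-Laird-type) formula. The subgroup estimates and standard errors are hypothetical.

```python
def shrink_subgroups(estimates, std_errors):
    """Shrink subgroup estimates toward an overall mean (empirical Bayes sketch).

    Returns (overall_mean, tau2, shrunk_estimates), where tau2 is the estimated
    between-subgroup variance of the true treatment effects.
    """
    variances = [se ** 2 for se in std_errors]
    weights = [1 / v for v in variances]
    total_w = sum(weights)

    # Inverse-variance weighted mean, used to measure heterogeneity (Q statistic).
    fixed_mean = sum(w * y for w, y in zip(weights, estimates)) / total_w
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, estimates))

    # Method-of-moments estimate of between-subgroup variance, truncated at zero.
    k = len(estimates)
    c = total_w - sum(w ** 2 for w in weights) / total_w
    tau2 = max(0.0, (q - (k - 1)) / c)

    # Overall mean under the hierarchical model.
    re_weights = [1 / (v + tau2) for v in variances]
    overall = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)

    # Noisier subgroups (larger variance) are pulled more strongly to the overall mean.
    shrunk = []
    for y, v in zip(estimates, variances):
        pull = v / (v + tau2)  # 1 means full pooling, 0 means no shrinkage
        shrunk.append(pull * overall + (1 - pull) * y)
    return overall, tau2, shrunk


# Hypothetical subgroup treatment-effect estimates and standard errors.
estimates = [0.10, 0.45, 0.30, 0.80]
std_errors = [0.15, 0.10, 0.12, 0.30]

overall, tau2, shrunk = shrink_subgroups(estimates, std_errors)
for raw, se, adj in zip(estimates, std_errors, shrunk):
    print(f"raw={raw:.2f} (SE {se:.2f}) -> shrunk={adj:.2f}")
```

In this sketch, the last subgroup has the largest standard error, so its weight on the overall mean is the highest. This is the behavior that makes shrinkage attractive when some subgroups are small.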
Covariate adjustment may be another useful tool in the context of increased patient heterogeneity. Specifically, it may help to tease out safety and efficacy signals by accounting for potential differences in baseline characteristics between arms. During Parallel Session 30, “Influencing regulatory policy and guidance development to drive statistical innovation”, Gregory Levin, associate director for statistical science and policy in the Office of Biostatistics at the FDA, called covariate adjustment a “free lunch” and considered FDA’s guidance on this subject [14] to be his “favorite guidance”. Although shrinkage methods and covariate adjustment are currently considered mostly in the context of large Phase 3 trials, their utility in Phase 1 and 2 trials should be explored if we are to expand I/E criteria earlier.
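The precision gain from covariate adjustment can be checked with a small simulation. The sketch below assumes a continuous outcome, a single baseline covariate that is strongly prognostic, and a linear model with a common slope in both arms (ANCOVA). The effect size, slope, sample size, and number of simulated trials are hypothetical choices made only for illustration.

```python
import random
import statistics


def simulate_trial(rng, n_per_arm=50, effect=0.3, prognostic_slope=1.0):
    """Simulate (arm, baseline covariate, outcome) rows for a two-arm trial."""
    rows = []
    for arm in (0, 1):
        for _ in range(n_per_arm):
            x = rng.gauss(0, 1)
            y = effect * arm + prognostic_slope * x + rng.gauss(0, 1)
            rows.append((arm, x, y))
    return rows


def unadjusted_effect(rows):
    """Difference in mean outcome between treated and control arms."""
    y1 = [y for a, _, y in rows if a == 1]
    y0 = [y for a, _, y in rows if a == 0]
    return statistics.fmean(y1) - statistics.fmean(y0)


def adjusted_effect(rows):
    """ANCOVA estimate: treatment coefficient from y ~ arm + x (common slope)."""
    means = {}
    for arm in (0, 1):
        xs = [x for a, x, _ in rows if a == arm]
        ys = [y for a, _, y in rows if a == arm]
        means[arm] = (statistics.fmean(xs), statistics.fmean(ys))
    # Pooled within-arm slope of the outcome on the covariate.
    sxy = sum((x - means[a][0]) * (y - means[a][1]) for a, x, y in rows)
    sxx = sum((x - means[a][0]) ** 2 for a, x, _ in rows)
    slope = sxy / sxx
    # Remove the part of the arm difference explained by covariate imbalance.
    return (means[1][1] - means[0][1]) - slope * (means[1][0] - means[0][0])


rng = random.Random(2024)
unadj, adj = [], []
for _ in range(500):
    trial = simulate_trial(rng)
    unadj.append(unadjusted_effect(trial))
    adj.append(adjusted_effect(trial))

sd_unadjusted = statistics.stdev(unadj)
sd_adjusted = statistics.stdev(adj)
print(f"Empirical SD of unadjusted estimates: {sd_unadjusted:.3f}")
print(f"Empirical SD of adjusted estimates:   {sd_adjusted:.3f}")
```

Both estimators target the same treatment effect in this setup. The adjusted estimator varies less across simulated trials because the prognostic covariate explains part of the outcome variability. The gain shrinks as the covariate becomes less prognostic.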
Statistician’s role to demystify misconceptions
In light of FDA’s final guidance on enhancing the diversity of clinical trial populations [15], one relevant topic that we discussed was how misinformation and a lack of understanding can sometimes stand in the way of creating and implementing best practices that drive diversity in clinical trials.
A few such commonly encountered opinions are:
Outreach and a more representative screening pipeline of patients may introduce statistical bias into the estimated treatment effect of the drug.
If a clinical trial is more diverse, it will need a dramatically larger sample size to support hypothesis tests for each subgroup of patients.
There is no need to focus special attention on specific groups of patients. Just recruiting in specific countries/regions is enough to address the need for diversity in clinical trials and to conduct representative science.
In all of the above scenarios, the conversation typically turns to “practical considerations”, which usually leads to maintaining the status quo.
In such cases, a statistician or a well-informed data scientist can create awareness and resources to address such knowledge gaps internally and externally in their respective communities.
Because planning and executing a clinical trial is a multidisciplinary exercise, changing the status quo and improving diverse patient participation and inclusion in future trials requires tools and resources that are easily accessible to, and digestible by, a multitude of stakeholders.
One such powerful tool is a simulator. A well-crafted simulator (e.g., an R Shiny app) can serve as a readily accessible way to test some of these beliefs and to provide data-driven evidence when those beliefs do not hold.
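As one illustration of what the core of such a simulator might compute, the Python sketch below estimates power by Monte Carlo for a two-arm trial under different enrollment scenarios. It is a minimal example under stated assumptions, not a recommended design tool: the endpoint is continuous with unit variance, the test is a two-sided z-test with estimated variance, and every input (63 patients per arm, effect sizes, the share of newly included patients) is hypothetical. The same logic could sit behind an interactive front end.

```python
import random
from statistics import NormalDist


def mean_and_variance(values):
    """Sample mean and unbiased sample variance."""
    n = len(values)
    m = sum(values) / n
    return m, sum((v - m) ** 2 for v in values) / (n - 1)


def simulate_power(n_per_arm, effect_original, effect_new, fraction_new,
                   n_sims=2000, alpha=0.05, seed=1):
    """Monte Carlo power when a fraction of patients comes from a newly included group."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rejections = 0
    for _sim in range(n_sims):
        control = [rng.gauss(0, 1) for _ in range(n_per_arm)]
        treated = []
        for _patient in range(n_per_arm):
            # Each treated patient belongs to the new group with probability fraction_new.
            in_new_group = rng.random() < fraction_new
            effect = effect_new if in_new_group else effect_original
            treated.append(rng.gauss(effect, 1))
        mean_t, var_t = mean_and_variance(treated)
        mean_c, var_c = mean_and_variance(control)
        z = (mean_t - mean_c) / ((var_t + var_c) / n_per_arm) ** 0.5
        rejections += abs(z) > z_crit
    return rejections / n_sims


# The same seed is reused across scenarios (common random numbers), so that
# differences reflect the scenario rather than simulation noise.
power_narrow = simulate_power(63, 0.5, 0.5, fraction_new=0.0)
power_broad_same = simulate_power(63, 0.5, 0.5, fraction_new=0.25)
power_broad_smaller = simulate_power(63, 0.5, 0.1, fraction_new=0.25)

print(f"Narrow population:                  {power_narrow:.2f}")
print(f"Broader, same effect:               {power_broad_same:.2f}")
print(f"Broader, smaller effect in new 25%: {power_broad_smaller:.2f}")
```

Under these assumptions, broadening enrollment leaves power unchanged when the treatment effect is the same in the newly included patients, and power drops only when the effect in that group is assumed to be smaller. A fuller simulator would also let users vary baseline prognosis and outcome variability by subgroup, which this sketch holds fixed.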
It is important to distinguish and communicate the reasons for diversifying a planned clinical trial. One reason is to ensure that the trial answers the critical questions of interest: Is the drug safe, and does the drug work? Are the results generalizable to the entire population of interest? Ensuring that the trial participants represent the targeted demographic for the test drug has an impact that is both tangible (generalizability) and intangible (building trust in the community by ensuring patient voices are heard). Another reason could be a belief that certain subpopulations may experience a different benefit-risk profile from the drug. This could be due to specific comorbidities or to socio-economic factors that influence the health of a population. In such a case, a properly pre-planned subgroup analysis is recommended, which may require a larger sample size for the subgroup and, in turn, a larger overall sample size for the trial. In the absence of such beliefs, however, there is no reason to assume that simply diversifying the trial participants to improve representativeness will inflate the sample size.
If there is underlying unknown heterogeneity in the effect sizes across different subgroups, it is beneficial to identify this as early as possible in the drug development process, rather than waiting until the confirmatory stage.
References and list of FDA resources
1. Liu, R., Rizzo, S., Whipple, S. et al. Evaluating eligibility criteria of oncology trials using real-world data and AI. Nature 592, 629–633 (2021). https://doi.org/10.1038/s41586-021-03430-5
2. U.S. Food and Drug Administration. (2024, June). Diversity action plans to improve enrollment of participants from underrepresented populations in clinical studies (Draft guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
3. U.S. Food and Drug Administration. (2024, January). Collection of race and ethnicity data in clinical trials and clinical studies for FDA-regulated medical products (Draft guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
4. U.S. Food and Drug Administration. (2023, December). Digital health technologies for remote data acquisition in clinical investigations (Final guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
5. U.S. Food and Drug Administration. (2023, May). Decentralized clinical trials for drugs, biological products, and devices (Draft guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
6. U.S. Food and Drug Administration. (2022, March). Inclusion of older adults in cancer clinical trials (Final guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
7. U.S. Food and Drug Administration. (2020, November). Enhancing the diversity of clinical trial populations—Eligibility criteria, enrollment practices, and trial designs (Final guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
8. U.S. Food and Drug Administration. (2019, May). Clinical lactation studies: Considerations for study design (Draft guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
9. U.S. Food and Drug Administration. (2019, May). Postapproval pregnancy safety studies (Draft guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
10. U.S. Food and Drug Administration. (2018, April). Pregnant women: Scientific and ethical considerations for inclusion in clinical trials (Draft guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
11. U.S. Food and Drug Administration. (2017, September). Evaluation and reporting of age-, race-, and ethnicity-specific data in medical device clinical studies (Final guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
12. U.S. Food and Drug Administration. (1994, August). E7 studies in support of special populations: Geriatrics (Final guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
13. U.S. Food and Drug Administration. (2019, October). Impact Story: Using innovative statistical approaches to provide the most reliable treatment outcomes information to patients and clinicians. U.S. Department of Health and Human Services.
https://www.fda.gov/
14. U.S. Food and Drug Administration. (2023, May). Adjusting for Covariates in Randomized Clinical Trials for Drugs and Biological Products (Final guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/
15. U.S. Food and Drug Administration. (2020, November). Enhancing the Diversity of Clinical Trial Populations — Eligibility Criteria, Enrollment Practices, and Trial Designs Guidance for Industry (Final guidance). U.S. Department of Health and Human Services.
https://www.fda.gov/