Exploring the Role of Covariate Adjustment in Modern Clinical Trials: Insights and Implications – discussion with Dr. Ting Ye
Erik Bloomquist (Merck), Maria Kudela (Pfizer)
Highlights
Discover how Dr. Ting Ye's innovative research in covariate adjustment is transforming the efficiency and precision of randomized clinical trials.
Learn about the RobinCar R package, a comprehensive toolkit developed by Dr. Ye's team, which is widely used in ongoing registrational trials.
Explore Dr. Ye's journey and the impactful contributions her work has made to regulatory thinking and modern clinical trials.
Ting Ye is an Assistant Professor in Biostatistics at the University of Washington. Her research aims to accelerate human health advances through data-driven discovery, development, and delivery of clinical, medical, and scientific breakthroughs, spanning the design and analysis of complex innovative clinical trials, causal inference in biomedical big data, and quantitative medical research. Ting is a recipient of the School of Public Health's Genentech Endowed Professorship and the NIH Maximizing Investigators' Research Award (MIRA). She is a leader in covariate adjustment for randomized clinical trials and has published over ten papers, including four in top-tier journals such as JASA, JRSSB, and Biometrika. Prior to joining UW, Dr. Ye completed a PhD in Statistics at the University of Wisconsin-Madison in 2019 and a postdoctoral fellowship in Statistics at the University of Pennsylvania in 2021.
We recently spoke with Dr. Ye about advancements in covariate adjustment analysis and her journey in that field.
Everyone remembers covariates as part of that basic regression course we all took in school. Yet when we joined pharma and industry, no one ever used covariates for clinical trials. “Randomization will take care of it,” our bosses would say. Now, with the FDA guidance coming out, your advances, and the start of the covariate adjustment working group, it seems we were wrong. What’s going on here?
Reply: That's a great question! It is true that randomization balances covariates across arms on average, so an unadjusted analysis (such as the difference in outcome means) correctly estimates the unconditional treatment effect.
In traditional regression courses, we’re taught that a coefficient’s interpretation is conditional on the other variables in the model and relies on correct model specification. This has understandably led to some hesitation around regression-based approaches, as the results can be sensitive to the choice of covariates and the correctness of the model.
However, advances in causal inference and the establishment of the estimand framework in ICH E9(R1) have begun to clearly disentangle the concept of the estimand from the choice of statistical analysis methods. In clinical trials, thanks to randomization, many researchers have proposed robust, model-assisted approaches to improve efficiency. These developments address the two concerns associated with classic regression. Specifically, one can now define the estimand based on the clinical question of interest first, then apply pre-specified covariate adjustment methods that target that same estimand under the same assumptions used in unadjusted analyses. Methods along this line are transparent, robust, and have real, practical value in improving power and precision.
How did you get involved in this area? Was it related to the FDA guidance coming out?
Reply: In grad school, I really liked the 2010 Biometrika paper by Prof. Jun Shao on covariate-adaptive randomization, so I started working on theory for survival analysis under covariate-adaptive randomization (e.g., stratified permuted block) as part of my dissertation in 2018. That work culminated in a paper (Ye and Shao, 2020; JRSSB) where we examined the behavior of common tests under covariate-adaptive randomization. We found that the log-rank test and Lin and Wei’s score test tended to be conservative under these designs. To address this, we proposed adjustments to correct the conservativeness. But what really surprised us was an elegant twist: the stratified log-rank test, contrary to what one might expect, turned out to be valid—not conservative—under covariate-adaptive randomization. This insight was exciting, as it suggested a way to handle the complexities of Pocock and Simon’s minimization without directly confronting its intricate theoretical properties. Instead, we could leverage statistics whose asymptotic behavior remains invariant to the randomization schemes.
Building on this idea, we extended our analysis to non-censored outcomes and demonstrated that the post-stratification estimator remains valid under covariate-adaptive randomization. While post-stratification is a form of covariate adjustment, it isn’t fully general. As we dug deeper, I became intrigued by the broader field of covariate adjustment, inspired in particular by the influential papers of Freedman (2008) and Lin (2013). Motivated by their ideas, we explored model-assisted estimation, where models are used to improve efficiency, but validity holds even if the models are misspecified. This led to our second paper (Ye et al., 2022; Biometrika), which we developed at full speed during December 2019 and quickly submitted.
After completing that project, a new idea took shape. We realized that post-stratification estimators could be viewed as special cases within a broader class of linearly-adjusted estimators. What began as an attempt at a brief follow-up evolved into a much deeper study. We ended up characterizing a complete class of linearly-adjusted estimators, deriving their joint asymptotic distribution for multiple treatment arms, and identifying the most efficient estimator of this class—the ANHECOVA estimator (Analysis of Heterogeneous Covariance). We also uncovered the interplay between covariate adjustment at the design and analysis stages. This project was a turning point for me. I learned so much while writing this paper with Qingyuan Zhao, whose deep insights and style in writing truly elevated the work. We posted the preprint on arXiv in September 2020, and it was later published in JASA in 2023. Alongside it, we released the RobinCar package on GitHub to make these tools accessible to practitioners.
In May 2021, the FDA released its draft guidance on covariate adjustment. We submitted a public comment (https://www.regulations.gov/comment/FDA-2019-D-0934-0034), and two years later, the final guidance cited two of our papers. It was a meaningful moment—seeing our statistical contributions influence regulatory thinking.
Despite the progress we had made in linear adjustment, we weren’t satisfied. Nonlinear methods like G-computation and covariate adjustment for time-to-event data lacked some of the desirable properties we had come to appreciate—guaranteed efficiency gains, robustness, and universal applicability. Realizing that ANHECOVA could serve as a unifying core, we extended its use to calibrate predictions from nonlinear models (Bannick et al., 2025; Biometrika) and to develop covariate adjustment methods for survival outcomes (Ye et al., 2024; Biometrika). Together, this body of work provides a comprehensive toolkit for improving precision in treatment effect testing and estimation.
For those new to covariate adjustment, is it the same thing we learned in graduate school? For example, for continuous endpoints, can I just use lm(y ~ x1 + x2 + x3) in R? Or is it more advanced?
Reply: Covariate adjustment is generally not the same as simply obtaining coefficient estimates from a fitted regression model. For continuous or discrete outcomes, covariate adjustment typically involves two steps: first, fitting a linear or nonlinear regression model; and second, using the fitted model to construct an estimator of the treatment effect (such as the g-computation estimator described in the FDA’s final guidance).
For example, when using an ANCOVA model of the form lm(Y ~ A + X), where A is the treatment indicator and X represents covariates, the covariate-adjusted estimator coincides with the coefficient on A. However, when fitting an ANHECOVA model with treatment-by-covariate interactions, such as lm(Y ~ A + X + A:X), the covariate-adjusted estimator is no longer simply one of the coefficient estimates. (A fun fact: it can be retrieved as the coefficient on A if the covariates X are centered.)
That said, even in these models, variance estimates for the covariate-adjusted treatment effect cannot be directly obtained from standard linear model output in R.
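To make the two steps concrete, here is a minimal, self-contained sketch in base R on simulated data (the variable names are our own, not from the papers). It computes the g-computation estimator from an ANHECOVA-type fit with treatment-by-covariate interactions, and checks the fun fact that centering the covariates recovers the same number as the coefficient on the treatment indicator:

```r
# Illustrative sketch on simulated data (our own variable names).
set.seed(123)
n <- 200
A <- rep(0:1, each = n / 2)            # randomized treatment indicator
X <- rnorm(n)                          # baseline covariate
Y <- 1 + 0.5 * A + 0.8 * X + 0.3 * A * X + rnorm(n)
dat <- data.frame(Y, A, X)

# Step 1: fit the working model with a treatment-by-covariate interaction
fit <- lm(Y ~ A * X, data = dat)

# Step 2: g-computation -- predict everyone under A = 1 and under A = 0,
# then average and take the difference
mu1 <- mean(predict(fit, transform(dat, A = 1)))
mu0 <- mean(predict(fit, transform(dat, A = 0)))
est_gcomp <- mu1 - mu0

# Fun fact: refit with X centered; the coefficient on A equals est_gcomp
fit_c <- lm(Y ~ A * I(X - mean(X)), data = dat)
est_coef <- coef(fit_c)[["A"]]

all.equal(est_gcomp, est_coef)
```

The two numbers agree exactly because centering merely reparameterizes the same fitted model; the usual lm standard error attached to that coefficient, however, is not a valid variance estimate for the adjusted treatment effect.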
What types of packages in R and SAS are available for this adjustment? Do I need to start from scratch or can I rely on existing software?
Reply: To make these methods widely accessible, our group initiated the development of RobinCar, an open-source R package that consolidates tools for linear and nonlinear covariate adjustment, as well as methods for time-to-event outcomes (available on CRAN and GitHub, >400 downloads/month on CRAN). The R package is rigorously validated to comply with good clinical and software practices, actively maintained based on user feedback, and includes comprehensive documentation and vignettes. As far as I know, RobinCar has been used in numerous ongoing registrational trials.
More recently, the ASA biopharmaceutical section (ASA-BIOP) covariate adjustment scientific working group is developing a lite version of RobinCar, called RobinCar2. Together, these form the RobinCar family: RobinCar is designed to be the most comprehensive, keeping pace with the latest methods in the literature and tested actively by users, while RobinCar2 will include a curated subset of well-validated methods.
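For a flavor of the interface, a minimal call might look like the sketch below. The function and argument names reflect our reading of the RobinCar documentation at the time of writing and may differ across versions, so the current CRAN vignette should be treated as authoritative; the data frame is simulated.

```r
# Hypothetical RobinCar sketch -- verify names against the current vignette.
library(RobinCar)

set.seed(1)
dat <- data.frame(
  y = rnorm(100),              # continuous outcome
  a = factor(rep(0:1, 50)),    # treatment arm
  x = rnorm(100)               # baseline covariate
)

fit <- robincar_linear(
  df             = dat,
  response_col   = "y",
  treat_col      = "a",
  covariate_cols = c("x"),
  adj_method     = "ANHECOVA"  # heterogeneous working model
)
print(fit)                     # adjusted estimates with robust variances
```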
I work in the area of oncology and hematology, where survival analysis is still the primary methodology (e.g., log-rank tests and Cox proportional hazards models). Is there anything different here?
Reply: You're absolutely right—covariate adjustment in survival analysis is different from that in standard settings. The first step is to clearly define the estimand of interest: for example, the unconditional survival functions for each treatment arm, the unconditional hazard ratio, or the stratified hazard ratio. Once the estimand is identified, covariate adjustment can be applied to improve efficiency—reducing variability without changing the estimand or requiring additional assumptions.
To give a concrete example: the log-rank test is commonly used to compare unconditional hazard functions between two arms. Covariate adjustment methods proposed by Lu and Tsiatis (2008, Biometrika) and Ye et al. (2024, Biometrika) work by linearizing the score equation (i.e., the numerator of the log-rank test statistic) to generate "derived outcomes," to which ANHECOVA adjustment is then applied. As another example, if the estimand of interest is the unconditional hazard ratio defined by a Cox model with only a treatment indicator, the unadjusted estimator solves the corresponding score equation. In this case, one can linearize the score equation, apply ANHECOVA adjustment to the resulting estimating equations, and solve the adjusted score equation to obtain the covariate-adjusted estimator of the unconditional hazard ratio. Similar approaches can be extended to stratified log-rank tests and stratified Cox models, as shown in Ye et al. (2024, Biometrika). A key feature of these methods is their ability to disentangle the estimand from the analysis method. These methods are also implemented in the RobinCar R package.
If I bring this up to my team, and they ask, “what are the advantages of adjustment?” what should I say? For example, can I say we’ll see a 15% reduction in sample size? Quicker milestones?
Reply: Adjusting for pre-specified baseline covariates that are prognostic of the outcome is generally a robust approach and often leads to efficiency gains. The extent of the gain depends on how strongly the covariates are associated with the outcome. If historical data on outcomes and covariates are available, one can estimate the potential relative efficiency gain or sample size reduction—and even construct confidence intervals for these estimates (Li et al. 2023; JRSSB).
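As a rough illustration of how historical data can inform planning, the sketch below (simulated stand-in data with our own variable names; a back-of-the-envelope heuristic, not the formal procedure of Li et al., 2023) uses the R-squared from regressing the outcome on the candidate covariates: under 1:1 randomization with a homogeneous covariate effect, the adjusted estimator's asymptotic variance is roughly (1 - R^2) times the unadjusted one, which translates into a comparable fractional reduction in required sample size.

```r
# Back-of-the-envelope sketch (not the formal method of Li et al., 2023).
set.seed(42)
n <- 500
hist_dat <- data.frame(          # stand-in for historical data
  x1 = rnorm(n),
  x2 = rnorm(n)
)
hist_dat$y <- 2 + 1.2 * hist_dat$x1 + 0.5 * hist_dat$x2 + rnorm(n)

# How prognostic are the candidate covariates?
r2 <- summary(lm(y ~ x1 + x2, data = hist_dat))$r.squared

variance_ratio   <- 1 - r2       # approx. adjusted vs. unadjusted variance
sample_size_frac <- 1 - r2       # approx. fraction of the unadjusted n needed

round(c(r_squared = r2, variance_ratio = variance_ratio), 2)
```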
Ultimately, it is up to the sponsor to decide whether to base the sample size calculation on the unadjusted analysis and treat covariate adjustment as a source of additional efficiency, or to base it on the adjusted analysis, or take an approach somewhere in between.
Even if the sample size is calculated based on the unadjusted analysis, covariate adjustment can still lead to higher power for detecting a treatment effect. In trials with planned interim analyses, this gain in power can translate into a higher probability of rejecting the null hypothesis early, potentially leading to faster conclusions about efficacy.
Ok let’s say I want to incorporate this into my protocol, what things should I consider when choosing these covariates? Any downsides or risks to doing this?
Reply: Covariates used for adjustment should be measured at baseline and pre-specified. Ideally, a small number of covariates that are most prognostic of the outcome should be selected, along with any stratification factors used in stratified or covariate-adaptive randomization, especially if they are believed to be related to the outcome. The selection of covariates can be informed by prior knowledge of the disease or condition, or by analysis of historical data.
With a small number of carefully chosen covariates, and if one uses methods with a guaranteed efficiency gain over the unadjusted estimator (e.g., the ANHECOVA estimator), the risk is minimal.
However, if a large number of covariates are considered and it’s not possible to narrow down the list, including too many covariates may lead to type I error inflation or a loss of power. In such cases, data-adaptive variable selection methods can be applied when fitting the regression models. To maintain valid inference, cross-fitting should be used to ensure consistency of the estimator (Bannick et al., 2025; Biometrika). How to best use data-adaptive procedures in covariate adjustment is still an active research topic; see Van Lancker et al. (2024) for a recent preprint on this topic.
Any additional resources or written material available for those who want to learn more?
Reply: Our ASA-BIOP covariate adjustment scientific working group has posted very nice blogs and tutorials, which are available at: https://carswg.github.io/blog.html. There are also great review papers on the topic, e.g. Van Lancker et al. (2024).
Where do you see this area moving in the next 5 years? Will covariate adjustment become the standard for clinical trials?
Reply: I believe there is still a need for further research on covariate adjustment in more complex settings, such as interim analyses, data-adaptive variable selection, and more complex trial designs. It would also be valuable to see more papers that address the practical implementation of covariate adjustment, including empirical evaluations using realistic simulations or real-world data.
As covariate adjustment continues to gain traction, I expect it will become the standard approach in clinical trials.
As we conclude our discussion, it's evident that Dr. Ye's innovative contributions have significantly impacted the field of randomized clinical trials. Her work on covariate adjustment and the development of the RobinCar R package are not only advancing the efficiency and precision of trials but also shaping the future of clinical research. Dr. Ye's dedication and pioneering spirit continue to inspire progress, promising a more robust and reliable landscape for clinical trials. Thank you for sharing your invaluable insights and experiences with us.
References:
Shao, J., Yu, X., & Zhong, B. (2010). A theory for testing hypotheses under covariate-adaptive randomization. Biometrika, 97(2), 347-360.
Ye, T., & Shao, J. (2020). Robust tests for treatment effect in survival analysis under covariate-adaptive randomization. Journal of the Royal Statistical Society: Series B, 82(5), 1301-1323.
Lin, W. (2013). Agnostic notes on regression adjustments to experimental data: Reexamining Freedman's critique. Annals of Applied Statistics, 7(1), 295-318.
Freedman, D. A. (2008). On regression adjustments to experimental data. Advances in Applied Mathematics, 40, 180-193.
Freedman, D. A. (2008). On regression adjustments in experiments with several treatments. Annals of Applied Statistics, 2(1), 176-196.
Ye, T., Yi, Y., & Shao, J. (2022). Inference on average treatment effect under minimization and other covariate-adaptive randomization methods. Biometrika, 109(1), 33-47.
Ye, T., Shao, J., Yi, Y., & Zhao, Q. (2023). Toward better practice of covariate adjustment in analyzing randomized clinical trials. Journal of the American Statistical Association, 118(544), 2370-2382.
Bannick, M. S., Shao, J., Liu, J., Du, Y., Yi, Y., & Ye, T. (2025). A general form of covariate adjustment in randomized clinical trials. Biometrika (accepted).
Lu, X., & Tsiatis, A. A. (2008). Improving the efficiency of the log-rank test using auxiliary covariates. Biometrika, 95(3), 679-694.
Ye, T., Shao, J., & Yi, Y. (2024). Covariate-adjusted log-rank test: Guaranteed efficiency gain and universal applicability. Biometrika, 111(2), 691-705.
Li, X., Li, S., & Luedtke, A. (2023). Estimating the efficiency gain of covariate-adjusted analyses in future clinical trials using external data. Journal of the Royal Statistical Society Series B: Statistical Methodology, 85(2), 356-377.
Van Lancker, K., Díaz, I., & Vansteelandt, S. (2024). Automated, efficient and model-free inference for randomized clinical trials via data-driven covariate adjustment. arXiv preprint arXiv:2404.11150.
Van Lancker, K., Bretz, F., & Dukes, O. (2024). Covariate adjustment in randomized controlled trials: General concepts and practical considerations. Clinical Trials, 21(4), 399-411.