Asymptotically Unbiased: Baseline Imbalance in RCTs: To test or not to test?

In the biomedical research world, stakes don't get any higher than randomized trials, where no stone goes unturned. The notion of confounding and baseline imbalance is a frequent source of heartburn, especially for statisticians, as they're often asked about testing for "baseline balance" and "to see if randomization worked."
When we perform hypothesis tests, we are using a sample to infer about a population, for example:

Is the mean of one population equal to a given value? (One sample t-test)
Are the means of two independent populations equal? (Two sample t-test)

In a randomized trial, our sample is from one population, which is divided at random into two groups. If we're testing the equality of means between two populations, a natural question is: which two populations are we talking about? Even if we could come up with a sensible answer, the usual purpose of these hypothesis tests is to assess the comparability of the allocated groups, not of populations, and that is beyond the scope of hypothesis testing. If we ever reject the null hypothesis, it's by definition a Type I Error, since both groups are from the exact same population.
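To make that last point concrete, here is a minimal simulation sketch, not anything taken from the references below. It assumes a normally distributed baseline covariate (the mean of 120 and SD of 15 are arbitrary), equal allocation, and a large-sample z approximation in place of the t-test; the function names are made up for illustration. Every trial is properly randomized, so each "significant" baseline difference it finds is a false positive, and they should turn up at roughly the nominal rate.

```python
import random
from statistics import NormalDist, fmean, variance


def two_sample_p_value(a, b):
    """Two-sided p-value for a difference in means (large-sample z approximation)."""
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = (fmean(a) - fmean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def baseline_rejection_rate(n_trials=2000, n_subjects=200, alpha=0.05, seed=1):
    """Fraction of properly randomized trials whose baseline test gives p < alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # One population: every subject's covariate comes from the same distribution.
        covariate = [rng.gauss(120, 15) for _ in range(n_subjects)]
        # Random allocation: shuffle, then split into two equal arms.
        rng.shuffle(covariate)
        half = n_subjects // 2
        treatment, control = covariate[:half], covariate[half:]
        if two_sample_p_value(treatment, control) < alpha:
            rejections += 1
    return rejections / n_trials


if __name__ == "__main__":
    print(f"Share of trials with p < 0.05 at baseline: {baseline_rejection_rate():.3f}")
```

Under these assumptions the printed share should sit near alpha, whatever the covariate is. That is the point: the test tells us about the randomizer, not about confounding.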

Let's set aside the muddled logic behind the tests and see whether they offer any practical value. Suppose the p-value from the resulting test is 0.06: does this mean that the variable is not a confounder? Remember, if the covariate is not related to both treatment and response, it cannot be a confounder, and it can be safely ignored. If it is a known confounder from previous studies, we should be adjusting for it anyway, regardless of the degree of balance. Additionally, in a large study, we can have a statistically significant result with absolutely no practical significance. The hypothesis test is not only nonsensical, but its results can't even inform us about confounding.

In summary, a statistically significant difference doesn't mean that a variable is a confounder, the lack of a statistically significant difference doesn't preclude confounding, and any rejection of the null hypothesis is either inherently a false positive or indicative of deep problems with the allocation mechanism. I can't really see the utility of hypothesis testing for baseline balance in randomized trials, even if somehow there were a coherent justification for it.

If a variable is known to be associated with the outcome of interest, it should be included as a covariate in analyses regardless of balance between groups. This is especially important if the response variable (e.g. blood pressure) is measured at baseline, as past history is usually the best predictor of future events.
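As a rough sketch of why adjusting for a prognostic baseline measure pays off, the simulation below compares the unadjusted difference in means with a baseline-adjusted (ANCOVA-style) estimate across many simulated trials. Everything numeric here is an assumption chosen for illustration: a baseline blood pressure with mean 140 and SD 15, a follow-up value that tracks baseline with slope 0.7 plus noise with SD 8, and a true treatment effect of -5. The function names are hypothetical.

```python
import random
from statistics import fmean, stdev


def _sxy(x, y):
    """Sum of cross-products of deviations from the means."""
    mx, my = fmean(x), fmean(y)
    return sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))


def simulate_trial(rng, n_per_arm=50, true_effect=-5.0):
    """Return (unadjusted, baseline-adjusted) effect estimates for one trial."""
    arms = {}
    for arm, effect in (("control", 0.0), ("treatment", true_effect)):
        # Both arms draw baseline values from the same population,
        # which is what random allocation delivers.
        baseline = [rng.gauss(140, 15) for _ in range(n_per_arm)]
        followup = [140 + 0.7 * (b - 140) + effect + rng.gauss(0, 8)
                    for b in baseline]
        arms[arm] = (baseline, followup)
    (xc, yc), (xt, yt) = arms["control"], arms["treatment"]

    unadjusted = fmean(yt) - fmean(yc)
    # Pooled within-arm slope of follow-up on baseline (the ANCOVA slope).
    slope = (_sxy(xc, yc) + _sxy(xt, yt)) / (_sxy(xc, xc) + _sxy(xt, xt))
    # Correct the raw difference for whatever chance imbalance occurred.
    adjusted = unadjusted - slope * (fmean(xt) - fmean(xc))
    return unadjusted, adjusted


def compare_estimators(n_trials=1000, seed=2):
    """Mean and SD of each estimator across simulated trials."""
    rng = random.Random(seed)
    unadjusted, adjusted = zip(*(simulate_trial(rng) for _ in range(n_trials)))
    return {
        "unadjusted": (fmean(unadjusted), stdev(unadjusted)),
        "adjusted": (fmean(adjusted), stdev(adjusted)),
    }


if __name__ == "__main__":
    for name, (mean, sd) in compare_estimators().items():
        print(f"{name:>10}: mean estimate {mean:6.2f}, SD across trials {sd:5.2f}")
```

Under these assumptions both estimators center on the true effect, but the adjusted one varies less from trial to trial. No balance test was consulted at any point: the covariate is adjusted for because it predicts the outcome.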

Don't take my word for it; read it from the people who know far more than I do:

Senn, S.J. Testing for Baseline Balance in Clinical Trials. Statistics in Medicine 13, 1715-26 (1994).
Altman, D.G. Adjustment for Covariate Imbalance. Encyclopedia of Biostatistics 1000-1005 (1998).
Müllner, M., Matthews, H. & Altman, D.G. Reporting on Statistical Methods to Adjust for Confounding: A Cross-Sectional Survey. Annals of Internal Medicine 136, 122-126 (2002).
Senn, S.J. Baseline Adjustment in Longitudinal Studies. Encyclopedia of Biostatistics 253-257 (1998).
Schulz, K.F., Altman, D.G. & Moher, D. CONSORT 2010 Statement: Updated Guidelines for Reporting Parallel Group Randomized Trials. Annals of Internal Medicine 152, 726-32 (2010).
Senn, S.J. Base Logic: Tests of Baseline Balance in Randomized Clinical Trials. Clinical Research and Regulatory Affairs 12, 171-182 (1995).

Monday, March 4, 2013
