When we perform hypothesis tests, we are using a sample to draw inferences about a population, for example:
- Is the mean of one population equal to a given value? (One sample t-test)
- Are the means of two independent populations equal? (Two sample t-test)
Let's set aside the muddled logic behind the tests, and see if they give us some practical value. Suppose the p-value from the resulting test is 0.06: does this mean that the variable is not a confounder? Remember, if the covariate is not related to both treatment and response, it cannot be a confounder, and it can be safely ignored. If it is a known confounder from previous studies, we should be adjusting for it anyway, regardless of the degree of balance. Additionally, if we have a large study, we can have a statistically significant result with absolutely no practical significance. The hypothesis test is not only nonsensical, but its results can't even inform us about confounding.
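To make the point about large studies concrete, here is a rough sketch of how the p-value for one fixed, tiny difference changes with sample size. The numbers are hypothetical (a 0.1 mmHg difference in mean blood pressure with a standard deviation of 10 mmHg), and I am assuming a simple z-test with a known standard deviation rather than a proper t-test:

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(difference, sd, n_per_arm):
    """Approximate two-sided p-value for a difference in means (z-test, known SD)."""
    standard_error = sd * sqrt(2.0 / n_per_arm)
    z = abs(difference) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Hypothetical numbers: a 0.1 mmHg difference in means, SD of 10 mmHg in each arm.
for n in (100, 10_000, 1_000_000):
    print(f"n per arm = {n:>9,}: p = {two_sided_p(0.1, 10.0, n):.3g}")
```

The difference is identical in every row; only the sample size changes. Whether a 0.1 mmHg difference matters is a clinical question that the p-value cannot answer.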
In summary, a statistically significant difference doesn't mean that a variable is a confounder, lack of a statistically significant difference doesn't preclude confounding, and any rejection of the null hypothesis is inherently a false positive or indicative of deep problems with the allocation mechanism. I can't really see the utility of hypothesis tests of baseline balance in randomized trials, even if somehow there were a coherent justification for them.
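The "inherently a false positive" claim can be checked with a small simulation. This is only a sketch under assumptions of my own: a single normally distributed baseline covariate, 100 patients per arm, and a normal approximation to the two-sample test. Because every patient is drawn from the same population and then allocated at random, the null hypothesis is true by construction, so every "significant" imbalance is a false positive:

```python
import random
from statistics import NormalDist, fmean


def sample_variance(values):
    """Unbiased sample variance."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values) / (len(values) - 1)


def baseline_p_value(rng, n_per_arm=100):
    """Randomise 2 * n_per_arm patients and test one baseline covariate for imbalance."""
    # All patients come from one population, so the null hypothesis holds.
    covariate = [rng.gauss(50.0, 10.0) for _ in range(2 * n_per_arm)]
    rng.shuffle(covariate)  # random allocation to the two arms
    arm_a, arm_b = covariate[:n_per_arm], covariate[n_per_arm:]
    se = ((sample_variance(arm_a) + sample_variance(arm_b)) / n_per_arm) ** 0.5
    z = (fmean(arm_a) - fmean(arm_b)) / se
    # Normal approximation to the two-sample test; reasonable at this sample size.
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


rng = random.Random(2024)
n_trials = 2000
p_values = [baseline_p_value(rng) for _ in range(n_trials)]
rejection_rate = sum(p < 0.05 for p in p_values) / n_trials
print(f"Share of trials with a 'significant' imbalance: {rejection_rate:.3f}")
```

The share should land close to the nominal 5%, which is just the type I error rate of the test. It tells us nothing about the covariate or about confounding.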
If a variable is known to be associated with the outcome of interest, it should be included as a covariate in analyses regardless of balance between groups. This is especially important if the response variable (e.g. blood pressure) is measured at baseline, as past history is usually the best predictor of future events.
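Here is a rough simulation of why adjusting for a baseline measurement helps. Everything in it is an assumption made for illustration: the blood pressure distribution, the 0.8 slope linking baseline to follow-up, the true effect of -5 mmHg, and 50 patients per arm. The adjusted estimate is the usual ANCOVA one, the difference in follow-up means minus the pooled within-arm slope times the difference in baseline means:

```python
import random
from statistics import fmean, stdev


def simulate_trial(rng, n_per_arm=50, effect=-5.0):
    """Return (unadjusted, baseline-adjusted) treatment effect estimates for one trial."""

    def simulate_arm(shift):
        baseline = [rng.gauss(140.0, 15.0) for _ in range(n_per_arm)]
        follow_up = [28.0 + 0.8 * x + shift + rng.gauss(0.0, 8.0) for x in baseline]
        return baseline, follow_up

    x0, y0 = simulate_arm(0.0)     # control arm
    x1, y1 = simulate_arm(effect)  # treatment arm

    # Pooled within-arm slope of follow-up on baseline (the ANCOVA common slope).
    sxy = sxx = 0.0
    for xs, ys in ((x0, y0), (x1, y1)):
        mx, my = fmean(xs), fmean(ys)
        sxy += sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        sxx += sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx

    unadjusted = fmean(y1) - fmean(y0)
    adjusted = unadjusted - slope * (fmean(x1) - fmean(x0))
    return unadjusted, adjusted


rng = random.Random(7)
results = [simulate_trial(rng) for _ in range(1000)]
unadjusted, adjusted = zip(*results)
print(f"Unadjusted: mean {fmean(unadjusted):.2f}, SD {stdev(unadjusted):.2f}")
print(f"Adjusted:   mean {fmean(adjusted):.2f}, SD {stdev(adjusted):.2f}")
```

Both estimators centre on the true effect, but the adjusted one varies less from trial to trial. That gain in precision is there whether or not the arms happened to look "balanced" at baseline.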
Don't take my word for it; read the people who know far more than I do:
- Senn, S.J. Testing for Baseline Balance in Clinical Trials. Statistics in Medicine 13, 1715-26 (1994).
- Altman, D.G. Adjustment for Covariate Imbalance. Encyclopedia of Biostatistics 1000-1005 (1998).
- Müllner, M., Matthews, H. & Altman, D.G. Reporting on Statistical Methods to Adjust for Confounding: A Cross-Sectional Survey. Annals of Internal Medicine 136, 122-126 (2002).
- Senn, S.J. Baseline Adjustment in Longitudinal Studies. Encyclopedia of Biostatistics 253-257 (1998).
- Schulz, K.F., Altman, D.G. & Moher, D. CONSORT 2010 Statement: Updated Guidelines for Reporting Parallel Group Randomized Trials. Annals of Internal Medicine 152, 726-32 (2010).
- Senn, S.J. Base Logic: Tests of Baseline Balance in Randomized Clinical Trials. Clinical Research and Regulatory Affairs 12, 171-182 (1995).