When we perform hypothesis tests, we are using a sample to infer about a *population*, for example:

- Is the mean of one population equal to a given value? (One sample t-test)
- Are the means of two independent populations equal? (Two sample t-test)

Because these tests concern *populations*, a natural question to ask is *what two populations are we talking about*? Even if we could come up with a sensible answer to this question, the usual purpose of these hypothesis tests is to assess the comparability of *the allocated groups*, not *populations*, and that is beyond the scope of hypothesis testing. If we ever reject the null hypothesis, it's by definition a Type I Error, since randomization draws both groups from the exact same population.
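To make the Type I Error point concrete, here is a minimal simulation sketch. It assumes a hypothetical baseline covariate (normally distributed, mean 120, SD 15), 100 participants per arm, and a large-sample cutoff of 1.96 in place of an exact t critical value. None of these numbers come from a real trial. Every simulated trial allocates people at random from a single population, so the null hypothesis is true every time.

```python
import random
import statistics


def baseline_rejection_rate(n_trials=2000, n_per_arm=100, crit=1.96, seed=0):
    """Share of simulated randomized trials whose baseline test 'rejects'."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_trials):
        # One population: every baseline value comes from the same
        # (hypothetical) distribution, whatever arm it ends up in.
        baseline = [rng.gauss(120, 15) for _ in range(2 * n_per_arm)]
        rng.shuffle(baseline)  # random allocation to the two arms
        arm_a, arm_b = baseline[:n_per_arm], baseline[n_per_arm:]

        # Two-sample test statistic with unpooled variances.
        se = (statistics.variance(arm_a) / n_per_arm
              + statistics.variance(arm_b) / n_per_arm) ** 0.5
        z = (statistics.fmean(arm_a) - statistics.fmean(arm_b)) / se
        if abs(z) > crit:
            rejections += 1
    return rejections / n_trials


rate = baseline_rejection_rate()
print(f"Share of trials with a 'significant' baseline difference: {rate:.3f}")
```

Under these assumptions the rejection rate should land near the nominal 5% level, and every one of those rejections is a false positive by construction.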

Let's set aside the muddled logic behind the tests, and see if they give us some practical value. Suppose the *p*-value from the resulting test is 0.06: does this mean that the variable is not a confounder? Remember, if the covariate is not related to both treatment *and* response, it cannot be a confounder, and it can be safely ignored. If it is a known confounder from previous studies, we should be adjusting for it anyway, *regardless* of the degree of balance. Additionally, if we have a large study, we can have a statistically significant result with absolutely no practical significance. The hypothesis test is not only nonsensical, but its results can't even inform us about confounding.

In summary, a statistically significant difference doesn't mean that a variable is a confounder, lack of a statistically significant difference doesn't preclude confounding, and any rejection of the null hypothesis is either inherently a false positive or indicative of deep problems with the allocation mechanism. I can't really see the utility of hypothesis testing of baseline balance in randomized trials, even if somehow there were a coherent justification for it.

If a variable is known to be associated with the outcome of interest, it should be included as a covariate in analyses regardless of balance between groups. This is especially important if the response variable (e.g. blood pressure) is measured at baseline, as past history is usually the best predictor of future events.
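As a rough illustration of why adjusting for a prognostic baseline measurement helps, the sketch below simulates many randomized trials and compares the unadjusted difference in means with a baseline-adjusted (ANCOVA-style, common-slope) estimate. The data-generating numbers (baseline blood pressure around 140, a follow-up that tracks baseline with slope 0.8, a true effect of -5) are made-up assumptions for illustration only, and `simulate_trial` is a hypothetical helper.

```python
import random
import statistics


def simulate_trial(rng, n_per_arm=50, true_effect=-5.0):
    """Return (unadjusted, baseline-adjusted) effect estimates for one trial."""
    arms = {}
    for arm, effect in (("control", 0.0), ("treated", true_effect)):
        # Hypothetical baseline blood pressure, same distribution in both arms.
        base = [rng.gauss(140, 15) for _ in range(n_per_arm)]
        # Assumed model: follow-up tracks baseline, plus noise and any effect.
        follow = [0.8 * x + 28 + effect + rng.gauss(0, 8) for x in base]
        arms[arm] = (base, follow)

    def centered_sums(x, y):
        mx, my = statistics.fmean(x), statistics.fmean(y)
        sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
        sxx = sum((xi - mx) ** 2 for xi in x)
        return mx, my, sxy, sxx

    mx_c, my_c, sxy_c, sxx_c = centered_sums(*arms["control"])
    mx_t, my_t, sxy_t, sxx_t = centered_sums(*arms["treated"])

    # Pooled within-arm slope of follow-up on baseline (common-slope ANCOVA).
    slope = (sxy_c + sxy_t) / (sxx_c + sxx_t)

    unadjusted = my_t - my_c
    # Remove the part of the difference explained by chance baseline imbalance.
    adjusted = unadjusted - slope * (mx_t - mx_c)
    return unadjusted, adjusted


rng = random.Random(1)
results = [simulate_trial(rng) for _ in range(2000)]
sd_unadjusted = statistics.stdev([u for u, _ in results])
sd_adjusted = statistics.stdev([a for _, a in results])
mean_adjusted = statistics.fmean([a for _, a in results])

print(f"SD of unadjusted estimate: {sd_unadjusted:.2f}")
print(f"SD of adjusted estimate:   {sd_adjusted:.2f}")
```

In this setup both estimators target the same effect, but the adjusted one varies less from trial to trial. No balance test is consulted at any point: the adjustment is applied in every trial because the covariate predicts the outcome.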

Don't take my word for it; read the people who know far more than I do:

- Senn, S.J. Testing for Baseline Balance in Clinical Trials. *Statistics in Medicine* **13**, 1715-26 (1994).
- Altman, D.G. Adjustment for Covariate Imbalance. *Encyclopedia of Biostatistics* 1000-1005 (1998).
- Müllner, M., Matthews, H. & Altman, D.G. Reporting on Statistical Methods to Adjust for Confounding: A Cross-Sectional Survey. *Annals of Internal Medicine* **136**, 122-126 (2002).
- Senn, S.J. Baseline Adjustment in Longitudinal Studies. *Encyclopedia of Biostatistics* 253-257 (1998).
- Schulz, K.F., Altman, D.G. & Moher, D. CONSORT 2010 statement: updated guidelines for reporting parallel group randomized trials. *Annals of Internal Medicine* **152**, 726-32 (2010).
- Senn, S.J. Base Logic: Tests of Baseline Balance in Randomized Clinical Trials. *Clinical Research and Regulatory Affairs* **12**, 171-182 (1995).
