I agree with Steve Simon that sometimes reliable adjustments for covariates
can be made. I don't mean to be a statistical pessimist in all cases.
However, essentially all small RCTs present problems, notwithstanding their
RCT status. For small controlled trials, the whole concept of doing
statistical tests to "prove" that the observed imbalances in patient
characteristics are unimportant is flawed from the outset. The smaller
(and thus less reliable) the trial, the larger the imbalance that can
escape being found "statistically significant", simply because of the
lack of power to detect it. Such "proof" of a lack of imbalance is a
statistical howler and reassures the reader of nothing, yet it appears
with great regularity, and with a completely straight face, in many
controlled trial reports.
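To put a number on that power problem, here is a quick simulation sketch (my own invented figures, not from any trial discussed here): a binary prognostic factor whose true prevalence differs 40% vs. 20% between arms, tested for "significant" imbalance at p < 0.05.

```python
import math
import random

random.seed(1)

def two_proportion_p(x1, n1, x2, n2):
    """Two-sided p-value for H0: equal proportions (pooled z-test,
    normal approximation)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0
    z = (x1 / n1 - x2 / n2) / se
    return math.erfc(abs(z) / math.sqrt(2))

def detection_rate(n_per_arm, p1=0.40, p2=0.20, sims=3_000):
    """How often a 40%-vs-20% true imbalance is declared significant."""
    hits = 0
    for _ in range(sims):
        x1 = sum(random.random() < p1 for _ in range(n_per_arm))
        x2 = sum(random.random() < p2 for _ in range(n_per_arm))
        hits += two_proportion_p(x1, n_per_arm, x2, n_per_arm) < 0.05
    return hits / sims

for n in (25, 100, 400):
    print(f"n = {n:3d} per arm: imbalance detected in "
          f"{detection_rate(n):.0%} of trials")
```

At 25 patients per arm, even this doubled prevalence escapes "significance" most of the time; only at several hundred per arm does the test reliably flag it.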
The more subtle issue is that the degree of imbalance (regardless of
statistical significance) does not translate directly into knowledge of the
strength of influence of the prognostic factor on the outcome of interest.
A factor with little influence on the outcome can be wildly unbalanced with
no appreciable effect on the outcome. A factor with strong influence can
seriously skew the outcome with only a very slight imbalance. If the
strength of the influence can be determined from multivariate analysis
(rarely possible in small trials) or from other information, it can be
adjusted for, as Steve Simon indicates.
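For concreteness, a sketch of such an adjustment (all numbers invented; ordinary least squares solved by hand so it runs on the standard library alone): a strong prognostic factor, deliberately imbalanced between arms, biases the crude treatment effect, and including it in the regression largely removes the bias.

```python
import random

random.seed(7)

n = 60                                   # a small trial: 30 per arm
treat = [i % 2 for i in range(n)]
# Strong prognostic factor, imbalanced between arms (hypothetical numbers):
factor = [random.gauss(0, 1) + 0.5 * t for t in treat]
outcome = [1.0 * t + 2.0 * f + random.gauss(0, 0.5)   # true effect = 1.0
           for t, f in zip(treat, factor)]

def ols(y, predictors):
    """Least-squares coefficients [b0, b1, ...] for y = b0 + b1*x1 + ...
    via the normal equations and Gauss-Jordan elimination (fine at this
    tiny scale; use a real stats package for anything serious)."""
    rows = [[1.0, *xs] for xs in zip(*predictors)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    for i in range(k):
        piv = xtx[i][i]
        xtx[i] = [v / piv for v in xtx[i]]
        xty[i] /= piv
        for j in range(k):
            if j != i:
                m = xtx[j][i]
                xtx[j] = [a - m * b for a, b in zip(xtx[j], xtx[i])]
                xty[j] -= m * xty[i]
    return xty

crude = ols(outcome, [treat])[1]             # ignores the imbalance
adjusted = ols(outcome, [treat, factor])[1]  # controls for the factor
print(f"crude treatment effect:    {crude:.2f}")
print(f"adjusted treatment effect: {adjusted:.2f} (truth: 1.00)")
```

The crude estimate absorbs the factor's influence through the imbalance; the adjusted one lands near the true value, but only because the factor was measured and its strong effect estimable.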
The lesson is that RCTs, especially small ones, cannot be given blind trust.
I hope that is healthy skepticism rather than nihilism.
David L. Doggett, Ph.D.
Medical Research Analyst
Technology Assessment Group
ECRI, a non-profit health services research organization
5200 Butler Pike
Plymouth Meeting, PA 19462-1298, USA
Phone: +1 (610) 825-6000 ext.5509
Fax: +1(610) 834-1275
E-mail: [log in to unmask]
> -----Original Message-----
> From: Guthrie, Dr Bruce [SMTP:[log in to unmask]]
> Sent: Tuesday, December 14, 1999 5:29 AM
> To: [log in to unmask]
> Cc: [log in to unmask]
> Subject: Re: Significance in differences in patient characteristics
>
> I was under the impression that it makes no sense to talk about
> statistically significant differences in groups generated by
> randomisation. If they are truly randomised, then by chance 1 in 20
> of the characteristics compared will be "statistically significant" (the
> more characteristics you measure to demonstrate how comparable your
> groups are, the more likely you are to find an imbalance that people
> like me can criticise?). I now wonder if this is correct?
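Bruce's 1-in-20 arithmetic is easy to check by simulation. A quick sketch (my own invented setup: 20 independent, truly identical baseline characteristics, 50 patients per arm, unit variance treated as known so the z-test is exact):

```python
import math
import random

random.seed(11)

def baseline_p_value(n=50):
    """Two-sided p-value comparing arm means of one standard-normal
    characteristic that is truly identical in both arms (sd known = 1,
    so the z statistic is exactly standard normal under H0)."""
    diff = (sum(random.gauss(0, 1) for _ in range(n)) -
            sum(random.gauss(0, 1) for _ in range(n))) / n
    z = diff / math.sqrt(2 / n)
    return math.erfc(abs(z) / math.sqrt(2))

sims, trials_with_a_hit, total_hits = 2_000, 0, 0
for _ in range(sims):
    hits = sum(baseline_p_value() < 0.05 for _ in range(20))
    total_hits += hits
    trials_with_a_hit += hits > 0

print("false-positive rate per characteristic:",
      round(total_hits / (sims * 20), 3))
print("fraction of trials with >= 1 'significant' imbalance:",
      round(trials_with_a_hit / sims, 2))
```

The per-comparison rate stays near 5%, but well over half of perfectly randomised trials show at least one "significant" baseline difference (1 - 0.95^20 is about 0.64), exactly the multiplicity point above.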
>
> If it is, then the only thing that matters is whether you think the
> observed differences are likely to make a clinical difference to the
> outcome. If you do, then I've certainly seen multivariate analysis
> used to try to correct any unfortunate imbalances that randomisation
> has produced, but it is a post hoc attempt to get around the problem.
> I personally find it difficult to judge how 'good' post hoc
> multivariate adjustments are and tend to view trials with big
> imbalances as less reliable/useful no matter what adjustments have
> been made, but am I being unfair?
>
> Bruce
>
> > Brent Beasley has raised a valid concern about a very widespread abuse
> > of statistics in reporting controlled trials. Differences in the
> > frequencies of occurrence of certain patient characteristics
> > (prognostic factors) between the arms post-randomization, whether
> > statistically significant or not, do not tell us the strength of the
> > influence of the factors on the final primary effect being measured by
> > the trial. One need only imagine that some factor with negligible
> > effect on the outcome might be very unbalanced in the arms with great
> > statistical significance, but still have little or no effect on the
> > outcome; conversely, another factor may be present in only a small
> > number of patients (with no statistically significant frequency
> > difference between the arms) but may have such a strong influence on
> > the outcome that even a small imbalance will confound the results. In
> > other words, a few extra patient-characteristic outliers in one arm
> > may completely skew the average outcome measured by the trial. Even if
> > the statistical significance of the frequency difference were
> > informative, failure to find statistical significance may be
> > meaningless if there is low statistical power for that particular
> > patient characteristic (few patients in either arm have the
> > characteristic).
> >
> > This is especially a problem with small trials, where patient
> > characteristic imbalances are common. Unfortunately, all of the
> > remedies require more patients. There are three common remedies. The
> > simplest is to run a very large trial, so that the randomization has a
> > chance to balance out all of the known and unknown prognostic factors.
> > Any noticeable and worrisome imbalance, whether statistically
> > significant or not, is evidence that the trial is too small.
> > Randomization is not magic. It requires large numbers to balance
> > things out, just as flipping a coin or rolling dice requires many
> > throws for the results to approach the expected long-run
> > probabilities.
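The coin-flip analogy is easy to quantify. A quick sketch (my own invented prevalence): the average chance imbalance in a 50%-prevalent binary factor, by trial size.

```python
import random
import statistics

random.seed(3)

def mean_imbalance(n_per_arm, prevalence=0.5, sims=2_000):
    """Average absolute between-arm difference in the prevalence of a
    binary factor that randomisation distributes purely by chance."""
    gaps = []
    for _ in range(sims):
        a = sum(random.random() < prevalence for _ in range(n_per_arm))
        b = sum(random.random() < prevalence for _ in range(n_per_arm))
        gaps.append(abs(a - b) / n_per_arm)
    return statistics.mean(gaps)

for n in (25, 100, 1000):
    print(f"n = {n:4d} per arm: typical imbalance {mean_imbalance(n):.1%}")
```

Roughly an 11-point average gap at 25 per arm, shrinking only with the square root of the sample size to around 2 points at 1,000 per arm.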
> >
> > Another remedy is to stratify results according to the known major
> > prognostic factors. Unfortunately, if there are more than one or two
> > such factors, one will be stratifying the stratifications until the
> > individual cells contain very few patients and have inadequate power.
> > Thus, again, this will require a much larger trial.
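The cell-size arithmetic behind that warning is stark even for a hypothetical 60-patient trial (my own invented numbers): each additional binary stratification factor halves the expected stratum size.

```python
# Hypothetical small trial: 60 patients, 30 per arm.
n_per_arm = 30
for k in range(5):
    strata = 2 ** k
    print(f"{k} binary factors -> {strata:2d} strata, "
          f"~{n_per_arm / strata:.1f} patients per stratum per arm")
```

Three binary factors already leave fewer than four patients per stratum per arm, far too few for any within-stratum comparison.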
> >
> > The real way to deal with the problem is by multivariate analysis.
> > After all, most problems in biology and medicine are decidedly
> > multivariate. Acknowledge this at the outset and use multivariate
> > methods that will compare the effects of any number of input variables
> > on the outcome of interest. However, as above, spreading the analysis
> > across several variables will require a fairly large study to provide
> > adequate statistical power for all the variables.
> >
> > A final remedy can be used on those rare occasions for which the
> > strength of the influence of the factor(s) is known. For example, one
> > may know that for every 1% increase in the proportion of patients with
> > characteristic A there is a 3% increase in the outcome being measured.
> > In this situation, one can use this knowledge to correct for any
> > post-randomization imbalance in characteristic A.
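The arithmetic of that correction, using the quoted 1%-to-3% relationship and otherwise invented numbers:

```python
slope = 3.0                 # outcome points per point of prevalence of A
                            # (the known relationship in the example above)
prev_a_treated = 0.27       # hypothetical: 27% of the treated arm has A
prev_a_control = 0.24       # hypothetical: 24% of the control arm has A
outcome_treated = 0.40      # hypothetical observed outcome rates
outcome_control = 0.28

raw_difference = outcome_treated - outcome_control
# Part of that gap attributable to the chance imbalance in A:
imbalance_effect = slope * (prev_a_treated - prev_a_control)
adjusted_difference = raw_difference - imbalance_effect

print(f"raw difference:            {raw_difference:+.1%}")
print(f"attributable to imbalance: {imbalance_effect:+.1%}")
print(f"adjusted treatment effect: {adjusted_difference:+.1%}")
```

Here a 12-point raw difference shrinks to 3 points once the 3-point imbalance in A (worth 9 outcome points at the known slope) is subtracted out; the correction is only as good as the external knowledge of the slope.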
> >
> > The whole point is that this is a situation in which statistical
> > significance is meaningless; everything hinges on clinical
> > significance. Randomization is intended to balance the unknown
> > variables (given a trial of adequate size), but there is always the
> > obligation to report and examine closely the known variables. Any
> > observed imbalances, whether statistically significant or not, must be
> > carefully judged for clinical significance. Small RCTs are especially
> > suspect.
> >
> > I have a few references concerning this problem. I have not searched
> > systematically; there may be more. If anyone knows of further
> > references, please send me the citations. This is one of the most
> > widely misunderstood problems in the literature.
> >
> > Simon R. Patient heterogeneity in clinical trials. Cancer Treatment
> > Reports 1980;64:405-10.
> >
> > Altman DG. Comparability of randomised groups. The Statistician
> > 1985;34:125-36.
> >
> > Sylvester R. Design and analysis of prostate cancer trials. Acta
> > Urologica Belgica 1994;62:23-9.
> >
> >
> >
> > David L. Doggett, Ph.D.
> > Medical Research Analyst
> > Technology Assessment Group
> > ECRI, a non-profit health services research organization
> > 5200 Butler Pike
> > Plymouth Meeting, PA 19462-1298, USA
> > Phone: +1 (610) 825-6000 ext.5509
> > Fax: +1(610) 834-1275
> > E-mail: [log in to unmask]
> >
> > Original message:
> >
> > In reading the NEJM article by Poldermans et al. (The Effect of
> > Bisoprolol on Perioperative Mortality and Myocardial Infarction in
> > High-Risk Patients Undergoing Vascular Surgery, December 9, 1999,
> > Vol. 341, No. 24), I noticed something that has struck me before in
> > randomized trials.
> >
> > There were 50-60 patients in each arm of the placebo-controlled
> > trial. This was enough patients to show a statistically significant
> > difference in their endpoint (cardiac death and nonfatal MI). BUT, in
> > the characteristics of patients who began the study, more patients in
> > the standard-care group had "limited exercise capacity" (43% vs 27%).
> > Although to me this difference appears "clinically" significant, it
> > did not reach statistical significance because of the relatively small
> > number in each group.
> >
> > It would seem that sample size calculations should be done with
> > enough forethought to avoid a "clinically" important difference
> > between groups.
> >
> > Has this been discussed somewhere before?
> >
> > Brent
> >
> > Brent W. Beasley, M.D.
> > Assistant Professor
> > Department of Internal Medicine
> > University of Kansas School of Medicine--Wichita
> > 1010 N. Kansas
> > Wichita, KS 67214
> >
> > [log in to unmask]
> > pho: 316-293-2650
> > fax: 316-293-1878
> >
> >
>
> Bruce Guthrie,
> MRC Training Fellow in Health Services Research,
> Department of General Practice,
> University of Edinburgh,
> 20 West Richmond Street,
> Edinburgh EH8 9DX
> Tel 0131 650 9237
> e-mail [log in to unmask]