***********************************
Statistical analysis at BRI Inquiry
***********************************
Dear Dr Poloniecki,
As those who worked on data analysis for the BRI Public Inquiry we
would like to comment on your open letter to the Inquiry team (attached
below). We are sorry for the delay in replying but some coordination
was necessary.
We should first point out that for the Inquiry there was a team who
worked together on the joint analyses and agreed the detailed approach
to be taken, after discussion with a wider group of statisticians who
acted as expert advisors. This letter comes from the analysts, who
take full responsibility for the content of the published reports.
We emphasise that we are not replying on behalf of the Inquiry, and so
cannot comment on either its brief or its possible future
recommendations. It is also important to note that the purposes of the
Inquiry were very different to those of the GMC consideration of
serious professional misconduct, and the Inquiry is not seeking to decide
"on a case to be accepted or rejected". The consequence is that the
statistical approach should match the objectives of the Inquiry. The
analysis of the GMC data needs to be considered entirely separately -
one of us (DJS) was responsible for this analysis.
Your main concerns appear to be the many sources of multiplicity, and
the resulting potential for reaching false-positive conclusions. There
are three sources of multiplicity: 1) multiple centres, 2) multiple
operations, and 3) possible multiple looks at accumulating data. We
share these concerns: there could be dangers in a prospective
system for monitoring performance that repeatedly examined accumulating
data, and then identified a centre or surgeon on the basis of
apparently above-average mortality on a single class of operations.
However, we believe that examination of the statistical evidence to the
BRI Inquiry will show that these valid concerns were, where
appropriate, fully taken into account. Examining each source of
multiplicity in turn:
1. Multiplicity of centres: The basis of our analysis was to examine
if the performance of Bristol or any other centre was compatible with
`standard' between-centre variation. All interval estimates of `excess
mortality' (essentially leave-one-out residuals), and all assessed
`probabilities that excess mortality is greater than zero', were based
on a random effects model that explicitly allowed for inevitable
between-centre variability. You are right to say that such variability
must be taken into account: we did so. Furthermore, the entire
analysis was repeated for each centre symmetrically and the results
reported. The conclusions are therefore not based on any selection
procedure.
2. Multiplicity of operations: The analyses considered all operations,
both individually and in combination. Emphasis was placed on the
consistency of results across operation types, and with the
`significance' of overall totals.
3. Multiplicity of looks at the data: We were carrying out a
retrospective analysis of the data, and not making any statement
concerning what might have been monitored at the time. We were not
concerned with a trial-like decision, nor were we dealing with what
might be done in the future. All this is made clear in the reports.
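To give a sense of the scale of the multiplicity concern discussed above, the following is a minimal sketch only. It is not a calculation taken from the Inquiry reports: it assumes independent tests, each carried out at the same nominal level, and the function name, the 0.05 level and the numbers of tests are hypothetical values chosen for illustration.

```python
def familywise_false_positive_rate(alpha: float, n_tests: int) -> float:
    """Chance of at least one false positive among n_tests independent
    tests, each at nominal level alpha, when no true difference exists.

    Assumption (not from the reports): the tests are independent.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie between 0 and 1")
    if n_tests < 0:
        raise ValueError("n_tests must be non-negative")
    # P(no false positive in any test) = (1 - alpha) ** n_tests
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Hypothetical numbers of centres, operation types or repeated looks.
    for n in (1, 5, 10, 20):
        rate = familywise_false_positive_rate(0.05, n)
        print(f"{n:2d} tests at the 0.05 level: "
              f"P(at least one false positive) = {rate:.3f}")
```

Under these assumptions the chance of a spurious `significant' finding grows quickly with the number of comparisons, which is why unadjusted repeated testing would be a concern in a prospective monitoring system.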
We also made it very clear that our analyses did not provide reasons
for the differences which we found, and the evidence from the clinical
case note review suggests that performance of individual surgeons was
only one factor, and possibly not the most important one, which might
explain the findings.
We again emphasise that we agree with your concerns about how to fairly
identify and act on apparently `divergent' performance in the future.
This is a complex issue, even without considering the vital problems of
case-mix and those of ascertaining non-lethal but negative outcomes
through routine data collection systems. For example, if the Inquiry
is going to make recommendations for future monitoring procedures then
there will be a need for careful consideration of repeated significance
testing. Also, it is right that `acceptable' performance is a
clinical judgement and this might also be addressed by the Inquiry.
In conclusion, even casual scrutiny of the reports should show that the
results concerning Bristol's divergent performance are robust enough
to withstand a wide range of analyses and assumptions. We believe the
analysis was carried out in a fair, open and reasonable way. All our
reports are available from
http://www.bristol-inquiry.org.uk/brisDSAnalysis.htm .
Finally, summaries of our reports are being submitted to medical and
statistical journals, which will provide a forum for further
discussion. We therefore regret that we will not contribute further
correspondence to allstat on this issue.
Yours sincerely
Paul Aylin
Nicky Best
Stephen Evans
Gordon Murray
David Spiegelhalter
(Authors of statistical reports for the BRI Inquiry).
---------- Forwarded message ----------
Date: Wed, 10 Nov 1999 13:05:07 +0000 (GMT)
From: Jan Poloniecki
Cc: Maria Shortis - Constructive Dialogue for Clinical Accountability
Subject: Open letter to public inquiry on Bristol Royal Infirmary
Ms Sue Kingswood
The Bristol Royal Infirmary Inquiry
2-10 Temple Way
Bristol BS2 0BY
Dear Sue,
Thank you for your letter of 8th October regarding my comments on Phase
Two of the Inquiry. I would be happy for our correspondence including this
letter to be put on the Inquiry's website on the understanding that it
represents my personal views and not those of any institution.
The statistical conclusions that have been drawn first by the GMC and now
at the BRI Inquiry are fatally flawed by reason of inadequate allowance
for repeated significance testing, and not taking into account the method
by which Bristol was selected for scrutiny [See Reference for more detail
regarding these flaws].
If the Inquiry is to be constructive, it must examine the control
processes, and specifically the GMC. It must consider the process by which
the case was referred to the GMC, and distinguish this from a random
sampling procedure. It must consider the implications of the precedent set
by the GMC's findings that Mr. Wisheart should have stopped operating
after the 12th case. In doing so, it must consider that no numerical
argument for this conclusion was presented during the trial or in the
judgement.
It must explicitly consider the frequency of testing to be allowed for in
relation to multiple significance testing.
It must explicitly acknowledge that real differences in death rates exist
between operators and between institutions. It must acknowledge that such
differences cannot be removed from the NHS.
It should consider whether the question of what is an acceptable
difference in death rates is capable of a single answer, and that some
differences might be acceptable to some surgeons and some patients but not
necessarily to all patients or all purchasers.
It should consider what is a suitable forum for discussion of this topic,
and who is a competent authority to specify the size of difference that
makes it an offence for a surgeon to continue operating.
Yours sincerely,
Jan
Reference: Half of all doctors are below average. BMJ 1998;316(Jun6):1734-6.
http://www.bmj.com/cgi/content/full/316/7146/1734
tel: +44 (0)181 725 2795 a.m. / 2652 p.m. (0)181 672 4122 p.m. answering machine