By strange coincidence, the exchange in this discussion group comes at the time when SIGN (Scottish Intercollegiate Guidelines Network) published "A new system for grading recommendations in evidence-based guidelines". In the paper (see http://www.bmj.com/cgi/content/full/323/7308/334) the authors state that they developed validated checklists for the assessment of the quality of evidence. If so, this will indeed represent a major accomplishment. I am CCing this message to the lead author of the SIGN paper, who, I hope, will find this exchange stimulating enough to get involved in this crucial debate within the EBM movement.

Looking forward to an interesting discussion,
Ben

Benjamin Djulbegovic, MD, PhD
Associate Professor of Oncology and Medicine
H. Lee Moffitt Cancer Center & Research Institute at the University of South Florida
Interdisciplinary Oncology Program
12902 Magnolia Drive
Tampa, FL 33612
Editor: Evidence-based Oncology
http://www.harcourt-international.com/journals/ebon/
e-mail: [log in to unmask]
http://www.hsc.usf.edu/~bdjulbeg/
phone: (813) 979-7202
fax: (813) 979-3071

-----Original Message-----
From: Doggett, David [mailto:[log in to unmask]]
Sent: Monday, August 13, 2001 5:16 PM
To: [log in to unmask]
Subject: Re: validated instruments for critical appraisal

May I interject a word of caution concerning "validated" evidence hierarchies. Over the years we have from time to time looked into the literature on the validity of evidence hierarchies. A related question, for which there is more literature, and upon which the concept of evidence hierarchies depends, is the effect of study design on research outcomes; i.e., whether double-blind RCTs are always necessary, or whether in some situations more convenient study designs are adequate. We have consistently found that the literature shows the effect of study design on research outcomes to be topic specific. Because of this, the search for a universally valid quality rating system appears to be futile.

When study design does not correlate with outcome differences, it may be for one of two reasons. In some areas there is so much subjectivity, bias (particularly publication bias), and fraud that the apparently best study designs give results just as flawed as those of worse study designs. This may be the case in some areas of pseudoscience where research is carried out by proponents. On the other hand, in some research areas there are hard outcomes, and conscientious researchers are sophisticated in research design and data analysis, so the better study designs may not improve reliability over simpler designs. Some areas of cardiology come to mind here. It is not uncommon in technology assessment to find RCTs that are fatally flawed in terms of internal or external validity, and, on the other hand, less rigorously controlled studies that are well done and reliable.

If study design does not invariably affect research outcomes, then it follows that there can be no universal validation of evidence hierarchies based on study design. In particular, whereas double-blind RCTs are in general more reliable than less rigorous designs, the precise points assigned to various study design aspects by a quality rating system are not universally appropriate, and adjusting or weighting outcomes according to such quality rating scores cannot be justified. Blind belief in these rating scales applied to uncharted areas of research is simply not appropriate.

A more reasonable approach is to use heterogeneity analysis to empirically assess whether study design substantially affects outcomes in the particular set of studies at hand. Heterogeneity analysis should not be merely an inspection of heterogeneity-test p values, because small sets of studies may not have sufficient statistical power to detect clinically significant differences in results. Regardless of p values, study designs whose results appear to differ by clinically significant amounts may best be grouped separately. This is not a simple subject. Unfortunately, going into our files and putting together a comprehensive bibliography on this subject is beyond my time constraints at the moment.

David L. Doggett, Ph.D.
Senior Medical Research Analyst
Health Technology Assessment and Information Services
ECRI, a non-profit health services research organization
5200 Butler Pike
Plymouth Meeting, Pennsylvania 19462, U.S.A.
Phone: (610) 825-6000 x5509
FAX: (610) 834-1275
http://www.ecri.org
e-mail: [log in to unmask]
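To make the subgroup approach described above concrete, here is a minimal sketch in Python of one common form of such a check: fixed-effect inverse-variance pooling within each study design, with Cochran's Q and I-squared reported alongside the pooled estimates. The data and the helper function are purely illustrative (hypothetical log odds ratios and standard errors), not anything from the messages in this thread, and this is one possible version of the analysis rather than the method Doggett has in mind.

from collections import defaultdict
from math import exp

def pooled(effects):
    """Fixed-effect (inverse-variance) pooled estimate, Cochran's Q, and I^2.

    effects: list of (log_odds_ratio, standard_error) pairs.
    """
    weights = [1.0 / se ** 2 for _, se in effects]
    total_w = sum(weights)
    mean = sum(w * y for w, (y, _) in zip(weights, effects)) / total_w
    q = sum(w * (y - mean) ** 2 for w, (y, _) in zip(weights, effects))
    df = len(effects) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return mean, q, i2

# Hypothetical studies: (design, log odds ratio, SE of log odds ratio).
studies = [
    ("RCT", -0.35, 0.12), ("RCT", -0.28, 0.15), ("RCT", -0.40, 0.10),
    ("cohort", -0.70, 0.14), ("cohort", -0.62, 0.18),
]

by_design = defaultdict(list)
for design, y, se in studies:
    by_design[design].append((y, se))

overall_mean, q_total, i2_total = pooled([(y, se) for _, y, se in studies])

# Pool within each design; Q partitions into within- and between-design parts.
q_within = 0.0
for design, effects in sorted(by_design.items()):
    mean, q, i2 = pooled(effects)
    q_within += q
    print(f"{design}: pooled OR = {exp(mean):.2f} "
          f"(k = {len(effects)}, Q = {q:.2f}, I2 = {i2:.0f}%)")

q_between = q_total - q_within
print(f"all designs: pooled OR = {exp(overall_mean):.2f}, I2 = {i2_total:.0f}%")
print(f"between-design Q = {q_between:.2f} on {len(by_design) - 1} df")

With only a handful of studies, the between-design Q will rarely reach statistical significance, which is exactly Doggett's point: the pooled estimates themselves should be compared for clinically meaningful differences, not just the heterogeneity p value.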
-----Original Message-----
From: Gero Langer [mailto:[log in to unmask]]
Sent: Monday, August 13, 2001 4:32 AM
To: [log in to unmask]
Subject: validated instruments for critical appraisal

Hello,

I am looking for validated instruments for critically appraising studies. It is important to find the 'best' studies, and a (validated) rating system for all kinds of questions (intervention, diagnosis, qualitative, etc.) should be used. Currently I am using the JAMA users' guides, but they are not validated (or are they?), and comparisons between studies are difficult and subjective. For RCTs I am working with the Jadad score. With all of these I can get a 'result', but not a comparative (i.e., rated) solution. We are developing a database for nurses in Germany and trying to offer the best available evidence for some nursing problems, but which evidence is, without bias, the best? A scoring system would be very useful. Does anyone know of anything in this field?

Thanks in advance,
Gero Langer

--
Martin Luther University Halle-Wittenberg
Institute for Nursing and Health Sciences
German Center for Evidence-based Nursing
Website: www.EBN-Zentrum.de
E-Mail: [log in to unmask]
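For readers unfamiliar with the Jadad score mentioned above: it is a five-item, 0-to-5 scale for RCT reports (Jadad et al., 1996), covering randomization, blinding, and withdrawals. A minimal sketch follows; the function and its inputs are illustrative, not an existing library, and the published instrument itself should be consulted before use.

def jadad_score(randomized, rand_method, double_blind, blind_method,
                withdrawals_described):
    """Jadad score, 0-5.

    rand_method / blind_method take one of: "appropriate",
    "inappropriate", or "not described".
    """
    score = 0
    if randomized:                          # described as randomized: +1
        score += 1
        if rand_method == "appropriate":    # method described and sound: +1
            score += 1
        elif rand_method == "inappropriate":  # method described but flawed: -1
            score -= 1
    if double_blind:                        # described as double-blind: +1
        score += 1
        if blind_method == "appropriate":
            score += 1
        elif blind_method == "inappropriate":
            score -= 1
    if withdrawals_described:               # withdrawals/dropouts reported: +1
        score += 1
    return max(score, 0)

# Randomized with an appropriate method, described as double-blind but with
# the blinding method not described, withdrawals reported: 4 points.
print(jadad_score(True, "appropriate", True, "not described", True))

Note that the scale grades the reporting of randomization, blinding, and withdrawals only; it does not, by itself, solve the cross-study comparability problem raised in the question, nor the topic-specificity problem raised in the reply above.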