The CEBM criteria are the most prominently used among the core group of EBM practitioners, and they are very detailed. The SORT criteria are more recently introduced (February 2004) and have the advantages (and disadvantages) of simplicity. For practicing clinicians, the SORT criteria seem easier to interpret. To facilitate interpretation of level-of-evidence grading by practicing clinicians who may not take the time to read about the underlying rules, the DynaMed editors chose to use the SORT criteria and add brief phrasing:

level 1 (likely reliable) evidence
level 2 (mid-level) evidence
level 3 (lacking direct) evidence
grade A recommendation (consistent high-quality evidence)
grade B recommendation (inconsistent or limited evidence)
grade C recommendation (lacking direct evidence)

We have not formally studied the result, but it appears to be going well. From the perspective of processing information for a clinical reference, distinguishing evidence as level 1 (likely reliable) vs. level 2 (mid-level) appears more useful than distinguishing evidence purely by study type. In this model, a level 1 label requires the best study type within a category (e.g., randomized trial for treatment, inception cohort study for prognosis) PLUS meeting a set of quality criteria for that study type. Other rating systems typically require randomized trials to meet quality criteria to earn the level 1 rating, but the SORT criteria provide more detail for this than some other systems. The Delfini system is an excellent system as well. We chose SORT in part because of its potential for wide acceptance, as it was created by multiple leading family medicine journals in the US working together and agreeing to use it. There are other approaches (such as the efforts of the GRADE working group) that aim to collaboratively develop the "standard" level-of-evidence system for many to use, but these efforts have to deal with the tensions between using a small number of levels vs. 
a large number of levels, and between exactness/detail vs. simplicity. In addition, different frames for what is being measured (studies vs. collections of studies vs. recommendations) and different target audiences (practicing clinicians vs. researchers vs. guideline developers) complicate the decision-making for choosing an ideal level-of-evidence rating system.

Getting back to the original question for this post, regarding how to rate a systematic review with one randomized trial and many non-randomized studies, here are some additional considerations:

The rules may vary with different labeling systems. A system could allow a systematic review that includes a randomized trial to get a level 1 rating (or whatever the highest rating is in that system), but this could be misleading if applied indiscriminately.

The level of evidence is most accurately applied when based on the outcome and the data for that outcome; this could result in different levels of evidence being reported for different outcomes mentioned in the same systematic review.

A systematic review that covers high-quality and low-quality evidence, and finds consistent high-quality evidence to support an outcome, could appropriately give that outcome the highest level of evidence rating.

If the support for an outcome is based entirely on low-quality evidence, then the level of evidence should not be the highest rating, regardless of the quantity of studies involved.

Likewise, if the support for an outcome is based entirely on low-quality evidence, the level of evidence should not be the highest rating regardless of the quality of the systematic review. A systematic review can be very high quality yet find only limited evidence; the quality of the systematic review cannot change the quality of the underlying evidence.

Brian S. 
Alper MD, MSPH
Editor-in-Chief, DynaMed (http://www.DynamicMedical.com)
Founder and Medical Director, Dynamic Medical Information Systems, LLC
3610 Buttonwood Drive, Suite 200
Columbia, MO 65201
(573) 886-8907
fax (573) 886-8901
home (573) 447-0705

"It only takes a pebble to start an avalanche."
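P.S. The per-outcome rating considerations above can be sketched as a small decision function. This is a hypothetical illustration only: the function name, the data structure, and the boundary between level 2 and level 3 are my assumptions for the sketch, not DynaMed's actual editorial process.

```python
# Hypothetical sketch of per-outcome level-of-evidence rating.
# Labels follow the SORT-derived phrasing used in this post; the
# structure of a "study" here is an illustrative assumption.

def rate_outcome(studies):
    """Assign a level of evidence to one outcome, given the studies
    supporting that outcome within a systematic review.

    Each study is a dict with:
      'best_type':   True if it is the best study type for the question
                     (e.g. randomized trial for treatment)
      'quality_met': True if it meets the quality criteria for its type
    """
    if not studies:
        return "level 3 (lacking direct) evidence"
    # Level 1 requires the best study type PLUS meeting quality criteria.
    if any(s["best_type"] and s["quality_met"] for s in studies):
        return "level 1 (likely reliable) evidence"
    # Support based entirely on lower-quality evidence cannot earn the
    # highest rating, regardless of how many studies there are or how
    # good the systematic review itself is.
    return "level 2 (mid-level) evidence"

# A single review can report different levels for different outcomes:
review = {
    "mortality": [{"best_type": True, "quality_met": True},
                  {"best_type": False, "quality_met": False}],
    "pain relief": [{"best_type": False, "quality_met": True}],
}
for outcome, studies in review.items():
    print(outcome, "->", rate_outcome(studies))
```

Note that the rating is computed per outcome, not once for the review as a whole, which is the key point of the considerations above.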