Checklist to Use in Evaluating Whether an Intervention Is Backed by Rigorous Evidence
From Appendix B of the U.S. Department of Education's "Identifying and Implementing Educational Practices Supported by Rigorous Evidence: A User Friendly Guide."
Step 1. Is the intervention supported by "strong" evidence of effectiveness?
A. The quality of evidence needed to establish "strong" evidence: randomized controlled trials that are well-designed and implemented. The following are key items to look for in assessing whether a trial is well-designed and implemented.
Key items to look for in the study's description of the intervention and the random assignment process:
- The study should clearly describe the intervention, including: (i) who administered it, who received it, and what it cost; (ii) how the intervention differed from what the control group received; and (iii) the logic of how the intervention is supposed to affect outcomes.
- Be alert to any indication that the random assignment process may have been compromised.
- The study should provide data showing that there are no systematic differences between the intervention and control groups prior to the intervention.
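One common way to look for systematic pre-intervention differences is to express the gap between the two groups' baseline means in standard-deviation units. The sketch below is only an illustration under stated assumptions: the pretest scores are made up, the function name is hypothetical, and the guide itself does not prescribe this statistic or any cutoff for it.

```python
from math import sqrt
from statistics import mean, stdev


def standardized_mean_difference(intervention, control):
    """Difference in group means divided by the pooled standard deviation."""
    n1, n2 = len(intervention), len(control)
    s1, s2 = stdev(intervention), stdev(control)
    pooled_sd = sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))
    return (mean(intervention) - mean(control)) / pooled_sd


# Hypothetical pretest scores collected before the intervention began.
pretest_intervention = [52, 48, 50, 55, 47, 51, 49, 53]
pretest_control = [51, 49, 50, 54, 46, 52, 48, 53]

baseline_smd = standardized_mean_difference(pretest_intervention, pretest_control)
print(f"Baseline standardized mean difference: {baseline_smd:.2f}")
```

A value close to zero suggests the groups started out similar on that measure. How close is close enough remains a judgment call, and the same check would be repeated for each baseline characteristic the study reports.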
Key items to look for in the study's collection of outcome data:
- The study should use outcome measures that are "valid" — i.e., that accurately measure the true outcomes that the intervention is designed to affect.
- The percentage of study participants that the study has lost track of when collecting outcome data should be small, and should not differ between the intervention and control groups.
- The study should collect and report outcome data even for those members of the intervention group who do not participate in or complete the intervention.
- The study should preferably obtain data on long-term outcomes of the intervention, so that you can judge whether the intervention's effects were sustained over time.
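The items on lost participants and on non-completers can be made concrete with a little arithmetic. The following is a minimal sketch, assuming a made-up set of participant records and hypothetical function names. It computes overall and differential attrition, then compares the groups with everyone assigned to the intervention included, and again with completers only for contrast.

```python
from statistics import mean

# Hypothetical records: (group, completed_intervention, outcome_score).
# outcome_score is None when the study lost track of the participant.
records = [
    ("intervention", True, 78), ("intervention", True, 82),
    ("intervention", False, 61), ("intervention", False, None),
    ("intervention", True, 75),
    ("control", False, 70), ("control", False, 68),
    ("control", False, None), ("control", False, 72),
    ("control", False, 66),
]


def attrition_rate(group):
    """Share of a group's participants with no outcome data."""
    rows = [r for r in records if r[0] == group]
    return sum(1 for r in rows if r[2] is None) / len(rows)


def mean_outcome(group, completers_only=False):
    """Mean outcome for a group, optionally restricted to completers."""
    scores = [
        score for g, completed, score in records
        if g == group and score is not None
        and (completed or not completers_only)
    ]
    return mean(scores)


overall_attrition = sum(1 for r in records if r[2] is None) / len(records)
differential_attrition = abs(
    attrition_rate("intervention") - attrition_rate("control")
)

# Everyone assigned to the intervention counts, whether or not they completed it.
effect_all_assigned = mean_outcome("intervention") - mean_outcome("control")
# Completers-only comparison, shown for contrast with the checklist's advice.
effect_completers_only = (
    mean_outcome("intervention", completers_only=True) - mean_outcome("control")
)

print(f"Overall attrition: {overall_attrition:.0%}")
print(f"Differential attrition: {differential_attrition:.0%}")
print(f"Effect, all assigned: {effect_all_assigned:.1f}")
print(f"Effect, completers only: {effect_completers_only:.1f}")
```

In this invented data, dropping the non-completers makes the intervention look more effective than it does when all assigned participants are counted, which is the distortion the checklist item guards against.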
Key items to look for in the study's reporting of results:
- If the study makes a claim that the intervention is effective, it should report (i) the size of the effect, and (ii) statistical tests showing the effect is unlikely to be the result of chance.
- A study's claim that the intervention's effect on a subgroup (e.g., Hispanic students) is different from its effect on the overall population in the study should be treated with caution.
- The study should report the intervention's effects on all the outcomes that the study measured, not just those for which there is a positive effect.
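To show what "the size of the effect" and "unlikely to be the result of chance" can look like in practice, here is a rough sketch using two common choices: Cohen's d for effect size and an exact permutation test on the difference in means. Both are assumptions made for illustration, since the checklist does not name a particular statistic or test, and the scores are invented.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, stdev


def cohens_d(treated, control):
    """Effect size: difference in means in pooled standard-deviation units."""
    n1, n2 = len(treated), len(control)
    pooled_sd = sqrt(
        ((n1 - 1) * stdev(treated) ** 2 + (n2 - 1) * stdev(control) ** 2)
        / (n1 + n2 - 2)
    )
    return (mean(treated) - mean(control)) / pooled_sd


def permutation_p_value(treated, control):
    """Two-sided exact permutation test on the difference in means.

    Enumerates every way of relabeling participants as treated or control
    and counts how often the gap is at least as large as the observed one.
    Only practical for small samples.
    """
    pooled = list(treated) + list(control)
    n_treated, n_control = len(treated), len(control)
    total = sum(pooled)
    observed = abs(mean(treated) - mean(control))
    extreme = 0
    count = 0
    for idx in combinations(range(len(pooled)), n_treated):
        group_sum = sum(pooled[i] for i in idx)
        diff = abs(group_sum / n_treated - (total - group_sum) / n_control)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        count += 1
    return extreme / count


# Hypothetical outcome scores.
treated_scores = [78, 82, 75, 80, 85, 79]
control_scores = [70, 68, 72, 66, 74, 71]

effect_size = cohens_d(treated_scores, control_scores)
p_value = permutation_p_value(treated_scores, control_scores)
print(f"Cohen's d: {effect_size:.2f}, permutation p-value: {p_value:.4f}")
```

A study that reports both numbers lets the reader judge whether an effect is large enough to matter as well as whether it is distinguishable from chance.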
B. Quantity of evidence needed to establish "strong" evidence of effectiveness.
- The intervention should be demonstrated effective, through well-designed randomized controlled trials, in more than one site of implementation;
- These sites should be typical school or community settings, such as public school classrooms taught by regular teachers; and
- The trials should demonstrate the intervention's effectiveness in school settings similar to yours, before you can be confident it will work in your schools/classrooms.
Step 2. If the intervention is not supported by "strong" evidence, is it nevertheless supported by "possible" evidence of effectiveness?
This is a judgment call that depends, for example, on the extent of the flaws in the randomized trials of the intervention and the quality of any nonrandomized studies that have been done. The following are a few factors to consider in making these judgments.
A. Circumstances in which a comparison-group study can constitute "possible" evidence:
- The study's intervention and comparison groups should be very closely matched in academic achievement levels, demographics, and other characteristics prior to the intervention.
- The comparison group should not be composed of individuals who had the option to participate in the intervention but declined.
- The study should preferably choose the intervention/comparison groups and outcome measures "prospectively" — i.e., before the intervention is administered.
- The study should meet the checklist items listed above for a well-designed randomized controlled trial (other than the item concerning the random assignment process). That is, the study should use valid outcome measures, report tests for statistical significance, and so on.
B. Studies that do not meet the threshold for "possible" evidence of effectiveness include: (i) pre-post studies; (ii) comparison-group studies in which the intervention and comparison groups are not well-matched; and (iii) "meta-analyses" that combine the results of individual studies that do not themselves meet the threshold for "possible" evidence.
Step 3. If the intervention is backed by neither "strong" nor "possible" evidence, one may conclude that it is not supported by meaningful evidence of effectiveness.
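The three steps can be read as a simple decision procedure. The sketch below is one loose encoding of it, not part of the guide: the class, field names, and return labels are hypothetical, each field stands in for a reviewer's judgment on the checklist items above, and the rule for "possible" evidence is a simplifying assumption, since the checklist calls that step a judgment call.

```python
from dataclasses import dataclass


@dataclass
class EvidenceSummary:
    # Hypothetical fields; each one summarizes a reviewer's judgment.
    # Number of typical school or community sites where well-designed,
    # well-implemented randomized controlled trials showed effectiveness.
    effective_rct_sites: int
    # True if randomized trials exist but have flaws a reviewer considers limited.
    has_informative_flawed_rcts: bool
    # True if closely matched comparison-group studies meet the other checklist items.
    has_well_matched_comparison_studies: bool


def classify_evidence(summary):
    """Return 'strong', 'possible', or 'none' for an intervention."""
    # Step 1: well-designed trials in more than one site of implementation.
    if summary.effective_rct_sites > 1:
        return "strong"
    # Step 2 (assumed rule): any evidence that falls short of "strong"
    # but that a reviewer still judges credible.
    if (
        summary.effective_rct_sites == 1
        or summary.has_informative_flawed_rcts
        or summary.has_well_matched_comparison_studies
    ):
        return "possible"
    # Step 3: neither strong nor possible evidence.
    return "none"


print(classify_evidence(EvidenceSummary(2, False, False)))
print(classify_evidence(EvidenceSummary(0, False, True)))
print(classify_evidence(EvidenceSummary(0, False, False)))
```

A "strong" result here still leaves open whether the trial sites resemble your own schools or classrooms, which the checklist asks you to weigh separately.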
This article originally appeared in the 04/01/2004 issue of THE Journal.