Determining 'What Works' - An Interview With Dr. Grover 'Russ' Whitehurst
Much of the discussion in the news, schools and statehouses about the No Child Left Behind Act has focused on testing and accountability, and whether there is enough money to fund the act sufficiently. However, bubbling underneath this publicity is a term that has the potential to cause a minor ripple on the surface or to erupt like an underwater volcano and create enormous waves: Scientifically Based Research. This term appears more than 160 times in NCLB, but the meaning and application of SBR may vary due to the context in which it appears in NCLB, the program (e.g., Title I, Title II D) within the bill, or the type of money (formula or competitive grants) being used to fund the program.
For the next six months, T.H.E. Journal will be running a series, co-edited by Therese Mageau, on scientifically based research. This first article is an interview with Dr. Grover J. "Russ" Whitehurst, the director of the U.S. Department of Education's Institute of Education Sciences, which oversees a key component in SBR, the What Works Clearinghouse. Future articles will address what SBR really means, whether SBR is a guideline or a mandate, a checklist for educators to help in evaluating research, a review of the research agenda for educational technology, and an analysis of the challenges that the technology industry faces in addressing SBR.
We hope this new series, titled "A Closer Look at Scientifically Based Research," helps you as you confront guidelines from your state department of education and listen to vendors describe their research programs in support of their products.
The What Works Clearinghouse (WWC) was set up by the U.S. Department of Education's Institute of Education Sciences (IES) to provide educators, policy-makers and the public with reviews of research on educational interventions and to make a determination about "the scientific evidence of what works in education." Some mistakenly think of the Clearinghouse as the final federal arbiter on what qualifies as scientifically based research (SBR). In fact, it has a highly defined scope of work, focusing on about a half-dozen topics in curriculum and instruction each year where there is research on the effectiveness of various educational interventions. The Clearinghouse's job is to determine and report back to the educational community on the soundness of that research, as well as to give educators guidance in making decisions about programs, practices, products and policies that improve student outcomes. Dr. Grover J. "Russ" Whitehurst is the director of IES and oversees the Clearinghouse. He spoke with T.H.E. Journal about the mission and function of the What Works Clearinghouse, as well as the role of SBR in education.
T.H.E. Journal: Describe the work of the What Works Clearinghouse.
WHITEHURST: The work of the Clearinghouse is to provide an instrument that can be used by people, such as readers of T.H.E. Journal, which will provide them with such information, as is available, that's relevant to the decisions they have to make when they purchase technology, or a curriculum or a professional development model. It's to provide a well-respected source of information with regards to what the science says, and what evaluation says, about which programs work for whom.
T.H.E.: How does the Clearinghouse's work relate to the definition of scientifically based research as defined in the No Child Left Behind Act?
WHITEHURST: Certainly a [research] study that meets the standards of the What Works Clearinghouse would be consistent with [NCLB's] definition of scientifically based research. However, the question addressed by the What Works Clearinghouse is a particular question, and that's 'what works'? There are many scientific questions that are not 'what works' questions. There are, for example, questions of what correlates with what: are poor people less likely to get appropriate funding for their schools than people from affluent neighborhoods? There are questions having to do with the validity of a test or assessment instrument. There are questions having to do with the way the brain works as it processes what an individual reads. All of those are questions that are addressed by science and fit the [NCLB's] broad definition of scientifically based research.
But the What Works Clearinghouse is only addressing one of those questions, and that's the question of effectiveness. And the effectiveness question requires a particular subset of standards that are much narrower than the general standards for scientifically based research. So a standard of the What Works Clearinghouse is that it privileges, or gives preference to, randomized trials because randomized trials are the gold standard for determining effectiveness.
T.H.E.: The Clearinghouse looks at research in a given topic area on the effectiveness of educational interventions, which are defined by the WWC as a product, practice, policy or program, what your site refers to as the 'four Ps.' Would you consider just one research study on a particular P or does there have to be a body of studies?
WHITEHURST: We would consider one study. In most cases, there will not be a body of research for a single P. If you look at the U.S. Food and Drug Administration as a model, what's required to get FDA approval to market a product are two randomized trials. So, we will privilege randomized trials. We will provide a registry of those trials as related to particular Ps. If there is only one such trial, there will be appropriate warnings and caveats that this is just one study and every study has limitations. Those limitations will be described in the evidence reports (see Pages 36-37 for more on evidence reports). But we think it is very important to provide the data that exists with respect to particular Ps. The body of evidence that you refer to is one that exists with respect to a body of related Ps.
T.H.E.: A body of evidence would be especially true for practices and policies, but not so much for products?
WHITEHURST: Well, it will be [true] in products as well. That is, there will be a variety of products that focus on middle school math. One can imagine an evidence report that focuses on what might have been common to the products that seemed to be more successful versus less successful. That's what the body of evidence would generate. At the same time, a local school administrator would want to know whether there is any evidence with regard to the effectiveness of, say, 'Everyday Math,' and we intend to provide that evidence, which is likely to [come from only] one or two studies.
T.H.E.: OK, let's say that the body of evidence finds that one of the common successful elements in a group of elementary math products was the use of manipulatives. I am in a district and use manipulatives in my elementary math instruction. Does that mean I'm doing 'what works' or that I'm using a product that has a manipulatives component? What does that mean from the viewpoint of the WWC?
WHITEHURST: That's precisely why I said we are going to have a focus on individual studies that examine the effectiveness of particular products. Because the issue you raise is a very important issue that will arise if all we have are these broad evidence reports. [The fact] that there were particular products that involve manipulatives that work does not mean that manipulatives work as the local district is using them. We'd like to provide, to the extent possible, more detailed information at the level of the particular practice that a district might be using so the district can determine if there's evidence on whether that practice works.
T.H.E.: In other words, the body of evidence says that the use of math manipulatives seems to be effective based on this group of studies, and a particular product study evaluation would say, for example ...
WHITEHURST: ... that 'Everyday Math' works. And one of the characteristics of 'Everyday Math,' as described by the developers, is that it involves manipulatives. At that level, you are not saying that manipulatives work. You are saying there's evidence that when 'Everyday Math' was employed in real schools there was a sign of positive impact on student scores.
T.H.E.: If I am using a product that does not show up in your evidence report, but it uses manipulatives, how can I know for sure that it works?
WHITEHURST: You can't.
T.H.E.: You can't?
WHITEHURST: [Not] unless you collect the evidence yourself.
T.H.E.: How do you do that?
WHITEHURST: We think it's not only possible but highly desirable for school districts to collect performance data on any P they choose, whether evidence of its effectiveness is validated by the What Works Clearinghouse or not. Because an educational intervention is not the same thing as a pharmaceutical, something may work in one location, and circumstances may be so different in another school system that it might not work.
T.H.E.: I am assuming that the districts would be held to the same rigorous standards for SBR that everyone else is held to?
WHITEHURST: No, I am suggesting that there are two prongs to evidence-based practice*. One is relying on well-designed, scientific studies of effectiveness of the sort that will be vetted on the What Works Clearinghouse. The other is locally collected performance data that indicates whether changes are occurring in the desired direction when a particular program or practice is implemented. That's something that schools should do; it's not that hard to do. You've got a system in place for collecting performance data and you should do that.
T.H.E.: Well, the sticky wicket here is what gets funded. If I am a district looking specifically at underperforming students and am using Title I monies to address their learning needs, can I do my own local study and say, 'Look I've proved that this works so I can use our Title I funds for this particular program,' or am I going to be limited to those Ps that the WWC has vetted?
WHITEHURST: I can't speak definitively to the regulations that the Office of Elementary and Secondary Education will deploy with respect to Title I funds. That's a separate office of the department. I can say conceptually what I believe the department should be doing as it makes discretionary grants to school districts, [which] is, if there is a list of products or practices that have strong evidence of effectiveness based on the What Works Clearinghouse, that there would be an expectation that grantees would choose from the list. Or, if they are using a product or practice that hasn't been evaluated — that's innovative, that's new, that may be homegrown — they produce along with their application for funding a plan for locally evaluating the effectiveness of that product.
Given that there's nothing about the proposed innovation that seems bizarre or unlikely to work, that it seems promising but there's just no evidence of its effectiveness, then either of those strategies should be successful. So, either you choose something that's been shown to work, or you do something new but promise to collect evidence to demonstrate whether it's effective or not.
T.H.E.: This is something you're saying you think conceptually should or might happen, not something that is happening?
WHITEHURST: Yes, because we don't have a What Works Clearinghouse that has products yet. The department's funding for Title I, Reading First and other programs occurs largely in the absence of any evidence of the sort that the What Works Clearinghouse will produce. So, the judgments to date have been largely based on the requirements of the legislation itself, and whether the state or the local education agency is proposing to do something that is generally aligned with whatever is required by the legislation. So it's a very different process that's more so based on alignment. The best example of that is Reading First, where basically to get your plan approved, you've got to demonstrate at the state level that you have a process in place by which school districts that are going to get Reading First funds will be vetted to determine if the programs they are going to be using will focus on fluency, phonics and the [other] required components of Reading First.
T.H.E.: So a federal program, like Reading First, right now relies on districts aligning their programs with the components of reading instruction that it has identified as effective, rather than districts having access to a 'what works' analysis of the research on specific reading interventions they might want to use?
WHITEHURST: Yes, that's right. By and large, alignment is all we've had, because in most cases [there's been an] absence of any definitive research on what's effective, and in the other cases it's just the pure need to cover the wide variety of practices that are out there.
T.H.E.: You said that your expectation is that 'grantees would choose from the list.' But the WWC says very clearly on its Web site it doesn't have an approved list.
WHITEHURST: Well, 'list' is not the right word. You will at some point be able to go into the Clearinghouse Web site and click on something like 'Everyday Math' or the Houghton Mifflin basal reading series, and you'll be able to find whether there is any strong evidence to support positive effects of the use of that program. In some sense, one could construct a list by identifying all the Ps that have any evidence of effectiveness associated with them. But you will not find that list generated by the What Works Clearinghouse.
What the Clearinghouse is doing is providing information, and the information can be used either voluntarily by school districts or education decision-makers to increase the likelihood that they will make the right decision about [an instructional program]. It may also be used by other components of the U.S. Department of Education in regulating what people have to do to get funds. But that's a decision that certainly will be made down the road by [the Office of] Elementary and Secondary Education, or the Office of Innovation and Improvement, or whichever office is doing this work. They will make their own decisions about how to use the evidence from the What Works Clearinghouse, just as states and localities will.
T.H.E.: Do you think that school districts are going to be in a position where they are going to have to either turn to the WWC database or do their own research on any product or program that they use?
WHITEHURST: No. I think that should that day come, it is centuries from now. I think that the level of evidence, the requirements in terms of the amount of evidence that's necessary, will be directly proportional to the potential impact of the decision. So, when you're making a decision on the reading curriculum of every kid in New York City, that's a decision that you'd want to be embedded in as much evidence as possible, where you set a relatively high bar for the evidence of effectiveness that will be required.
The only situation, I think, in which evidence is going to be required for that sort of decision [i.e., smaller, supplemental product purchases] is if the marketplace drives decisions in that direction. One could imagine that a vendor of a reading supplement might go out and do a well-designed study demonstrating that, in fact, reading comprehension goes up when this supplement is used, and that would give that vendor some marketing advantage in going to a school district.
T.H.E.: In fact, that is one of the concerns among vendors about SBR: That even when funding is not connected to the WWC evidence reports, the market will start favoring the larger publishers who can afford to do those studies.
WHITEHURST: I'm not sure. Two things: First, it will not be the U.S. Department of Education that drives decisions at that level. That will be, again, the marketplace and decision-making at the local level. And I don't actually find anything objectionable about that at all. If I'm the local principal and I've got to make a decision about whether to go with this reading supplement versus that reading supplement, if there's one that's got some evidence that when deployed in a real-life situation it actually improves comprehension scores, why wouldn't I want to have that information?
The second point I would raise is that the idea that it's really expensive to develop evidence of effectiveness, I think, is wrong. I've had people come through my office and show me the evidence that they've collected on the effectiveness of their products, and in many cases these have been very small firms that have done an evaluation for $60,000. Now, it's true that if the total income for the vendor is $200,000 a year, then spending $60,000 a study is not practical. But, I think collecting evidence on effectiveness is not a huge business expense, except at the very lowest end. And usually there you're talking about brand new products, innovations where there is nothing else that's comparable, so you're not up against competitive pressure that somebody else has got the same widget that you've got and they've got evidence that you don't.
T.H.E.: In NCLB's definition of SBR, it says that research must be published in a peer-reviewed journal. Does the WWC have the same requirement?
T.H.E.: If the organization that conducted the research was hired by the publisher, does that diminish the research's validity in the eyes of the WWC?
WHITEHURST: Not necessarily, though independence is good.
T.H.E.: If a vendor claims that their product meets SBR standards, how does a school district know if that's true?
WHITEHURST: They go to the What Works Clearinghouse, click on that product and find out what the evidence report, what the study report, says.
T.H.E.: And if there is none there?
WHITEHURST: Then the vendor is lying.
T.H.E.: I hate to push you on this, but this is working as an approved list. It's functioning that way even if it isn't officially one.
WHITEHURST: No, it's not an approved list. That is, there is nobody at the U.S. Department of Education who is requiring that districts make choices off products that appear with evidence of effectiveness in the What Works Clearinghouse.
T.H.E.: But if funding is tied to them, then they are.
WHITEHURST: Well, we don't even have any products yet, [so] how could they be making those decisions?
T.H.E.: But eventually you're going to be having those products?
WHITEHURST: But there's no regulation or guidance that's come out of the U.S. Department of Education's offices that provide funding to states and local education agencies that has said, 'If you get money here, you have to pick a P that has evidence of effectiveness that's listed in the What Works Clearinghouse.'
T.H.E.: Do you think that day will come?
WHITEHURST: I think that, to the extent that the What Works Clearinghouse provides a timely, user-friendly and generally unimpeachable evaluation of the evidence, increasingly people will use it, including the U.S. Department of Education. But in some sense, the Institute of Education Sciences is a vendor. We're developing a product here. If it is useful, if it is powerful, then we will expect people will use it. But we don't have any promises that people will use it. We haven't already premarketed it, and we haven't sold it to other offices or programs. Sure, I think it will go that way, but it depends on how well the What Works Clearinghouse does its job. We don't have a product yet.
T.H.E.: Let me ask you specifically about Title II D. It's my understanding that there's no SBR requirement in Title II D.
WHITEHURST: That's correct, as I recall.
T.H.E.: So, if a technology product is a particular application in a curriculum area, it would be held accountable to meet SBR standards established in that area. And theoretically, if it can't be bought with Title I funds because it doesn't meet SBR standards, it might be able to be bought with Title II D funds.
WHITEHURST: Might be. I think it depends not only on the title of NCLB that the funds are deriving from, but also on the particular regulation and guidance that's employed by the component of the U.S. Department of Education that's either judging the block grants from the states or the applications from individual school districts. Eventually, one imagines that every program office in the department will be on the same page; it takes a while to get there. So [for] a technology program, for example, that's being used as a component of Reading First, one would expect that at some point there's going to be a requirement that there be evidence of effectiveness of the sort that meets the standards of the What Works Clearinghouse. For other types of use of technology, it's hard to imagine what that evidence would even look like. [If you use] technology money to buy servers, what does that mean that they're 'effective'?
T.H.E.: They don't crash?
WHITEHURST: [Laughs] Fair enough. But there's a lot of investment that can be thought of as infrastructure investment.
T.H.E.: A typical way in which districts purchase technology is that they use Title I funds to buy computer labs for underperforming kids to get basic skills practice. They use Title I money to buy the equipment, the software and pay for the salaries of the paraprofessionals who oversee the labs. Given that scenario, what part of that do you think, or could you imagine, would be covered by SBR?
WHITEHURST: Software — the application. If the What Works Clearinghouse produces products that are useful in the ways that I've previously described, there will increasingly be pressure on the vendors of the software applications to demonstrate that they actually enhance children's skills in the ways that are claimed. That pressure, I think, is going to come from a variety of sources, and it's as likely to come at the local level and from the state level as it is from the federal level.
T.H.E.: What about the practice of having computer labs and using paraprofessionals?
WHITEHURST: I think those are very researchable questions. [But] it's more difficult for me to think of a funding mechanism or a funding decision that flows from the U.S. Department of Education that's going to have a direct impact on that.
* According to www.ed.gov/admins/tchrqual/evidence/whitehurst.ppt, evidence-based practice is the integration of professional wisdom with the best available empirical evidence in making decisions about how to deliver instruction.
How a What Works Clearinghouse Evidence Report Is Created
The end products of the What Works Clearinghouse are evidence reports, which are designed, according to the WWC, "to provide education consumers with high-quality reviews of scientific evidence of the effectiveness of educational interventions." The WWC would like to see educators use evidence reports to make instructional decisions and education policy. The reports will be publicly available on the WWC Web site (http://w-w-c.org). The first evidence reports were due in fall 2003; however, the WWC decided to pilot test the first reports and extended the initial release date until early 2004. The process for producing an evidence report is described below.
1. A topic area is chosen by the WWC. A topic area is defined by the intended outcome (e.g., improving literacy skills), the intended population (e.g., free- or reduced-lunch elementary students) and the types of replicable interventions (e.g., a literacy-building product or program) that may produce the intended outcome for that population. Topic areas are chosen by considering the potential of interventions in the topic area to improve student outcomes, the perceived demand within the education community for evidence of effective educational interventions in the topic area, and the likely availability of high-quality scientific studies of effective educational interventions in the topic area. Examples of initial topic areas currently being reviewed include: Interventions for Beginning Reading, Curriculum-Based Interventions for Increasing K-12 Math Achievement, Programs for Preventing High School Dropout, and Programs for Increasing Adult Literacy.
2. An evidence team — consisting of a senior content and a methodology expert, as well as a project coordinator and research reviewers — creates a protocol that sets the parameters for the kinds of interventions they will look at within a given topic area. So, for instance, for the topic area of improving literacy, the protocol might define such parameters as grade-level scope, the extent of the intervention, a definition of literacy skills, and so forth.
3. Once a protocol has been defined, the evidence team begins a literature search for studies on interventions in the topic area that fit the parameters of the protocol. Publishers, organizations, educators and others can nominate studies of replicable interventions through the WWC Web site at http://w-w-c.org/topicnom.html. An initial sort eliminates those interventions that do not fit the protocol parameters. For example, if the protocol defines literacy skills as including decoding, a study of an intervention that does not address decoding would likely not be considered.
4. The remaining intervention studies are then submitted to a series of questions about the design of the intervention study, such as which methodology was used (e.g., random assignment, matched control group), the number of participants involved, outcome variables tested (e.g., attendance, test scores), and so forth. Studies are dropped from consideration if the examination (called a "DIAD") determines that their design does not meet WWC standards. In general, those studies that do not use an experimental or quasi-experimental methodology will not pass the DIAD examination. For instance, if a study of the effectiveness of a particular literacy intervention used pretests and posttests to determine effectiveness, that study would likely not be eligible for further consideration.
5. Those studies whose designs have passed successfully through the DIAD are then included in the WWC evidence reports, and the evidence team determines the strength of the evidence of effectiveness that has been provided. The WWC notes that a study can be well designed, but their "rigorous" process may find the effects of the intervention to be small, nonexistent or even negative. In addition, since the WWC does not conduct its own field research, it may report that it is unable to reach a conclusion regarding a particular program's effectiveness, due to a lack of adequate studies to review.
6. There are three levels of evidence reports that may or may not be generated in a given topic area, depending on the number of studies that are looked at. The topic report would provide an overview of the evidence for all studies that meet the WWC standards for all interventions for a particular topic area, such as, "Interventions for Beginning Reading." A study report would look at one study on "Success For All," for example. And an intervention-level report would focus on all the available evidence of all studies meeting the WWC standards that focus on a particular intervention like "Success For All."
7. Throughout the review process, evidence reports are subjected to a number of evaluations and reviews from a Technical Advisory Group (experts in research design, program evaluation and research synthesis) and anonymous peer reviewers. The Department of Education will review the final reports to ensure that pre-established processes and standards have been applied to the overall process. Pursuant to final approval, evidence reports are published in database form on the WWC Web site.
8. It is the intention of the WWC to regularly update an evidence report with new studies of interventions submitted after initial publication. For example, the publisher of a literacy product that is released after the publication of the evidence report on that topic, may submit effectiveness studies on its product to the WWC for review and possible inclusion in the next report update.
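For readers who find it easier to follow a process as a worked example, the screening in steps 3 through 6 can be sketched roughly in Python. This is only an illustration under stated assumptions, not the WWC's actual procedure or software: the Study record, the design labels, the function names and the sample studies ("Program A," "Program B") are all hypothetical, and the real DIAD examination asks many more questions than the single design check shown here.

```python
from collections import defaultdict
from dataclasses import dataclass

# Assumption: designs treated as acceptable for this sketch. The sidebar says
# only that experimental or quasi-experimental studies generally pass.
ACCEPTED_DESIGNS = {"random_assignment", "matched_control_group"}


@dataclass(frozen=True)
class Study:
    """Hypothetical record for one nominated study."""
    title: str
    intervention: str
    skills_addressed: frozenset
    design: str


def fits_protocol(study, required_skills):
    """Step 3: keep a study only if it addresses every skill the protocol requires."""
    return required_skills <= study.skills_addressed


def passes_design_screen(study):
    """Step 4, greatly simplified: keep a study only if its design is accepted."""
    return study.design in ACCEPTED_DESIGNS


def screen(studies, required_skills):
    """Apply the protocol sort, then the design screen, preserving input order."""
    return [
        s for s in studies
        if fits_protocol(s, required_skills) and passes_design_screen(s)
    ]


def group_by_intervention(studies):
    """Step 6: group surviving studies so each intervention can get its own report."""
    grouped = defaultdict(list)
    for s in studies:
        grouped[s.intervention].append(s)
    return dict(grouped)


# Made-up example data, for illustration only.
SAMPLE_STUDIES = [
    Study("Study 1", "Program A", frozenset({"decoding", "fluency"}), "random_assignment"),
    Study("Study 2", "Program A", frozenset({"decoding"}), "pretest_posttest_only"),
    Study("Study 3", "Program B", frozenset({"vocabulary"}), "random_assignment"),
    Study("Study 4", "Program B", frozenset({"decoding"}), "matched_control_group"),
]

if __name__ == "__main__":
    eligible = screen(SAMPLE_STUDIES, frozenset({"decoding"}))
    for name, kept in group_by_intervention(eligible).items():
        print(name, [s.title for s in kept])
```

In this sketch, each surviving study would correspond to a study report, each group to an intervention-level report, and the full list of eligible studies to the topic report. Judging the strength of the evidence (step 5) is left out because the sidebar does not describe how that determination is made.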
This article originally appeared in the 01/01/2004 issue of THE Journal.