Standardized Testing | Analysis
Rampant Errors Call into Question the Value of High-Stakes Testing
- By Dian Schaffhauser
A series of articles being published by the Atlanta Journal-Constitution is questioning the validity of high-stakes tests given by states, calling breakdowns in quality control "near commonplace," even as test results affect whether students are able to graduate or move to the next grade. Reporter Heather Vogell spent a year analyzing documents from government agencies and other sources on testing and dissecting the data generated on each of almost 93,000 test questions asked of students around the country. Her conclusion: "Vulnerabilities" exist at "every step of the testing process."
The first stories in a three-part series, "Errors plague testing in public schools; Consequences for students can be dire," and "Flawed Questions Fluster Students," appeared in the Sept. 15, 2013 edition of the newspaper. These focused on how problems surface during test item development. Two additional articles by Vogell will appear over the next week. Part 2 will focus on scoring problems and issues with oversight experienced by states. Part 3 is a "little bit more forward looking," she said, covering what's going on with the Common Core tests and providing a sense of "the challenges those new tests are going to face."
Errors That Affect Kids
The kinds of testing problems cited by Vogell include exam booklets that were missing pages; scanners that malfunctioned; and "nonsensical" questions that confused students, had no right answer, or had too many right answers.
Although a litany of snags exists in state education testing, the emphasis of Vogell's research centered on errors "that affected kids' scores directly." Those could be "computer programming glitches" or errors made by psychometricians as they converted raw scores into final scores. The latter, she said, has "happened a bunch of times with disastrous results in terms of the number of kids that were affected." Those types of problems also ended up being "very publicly embarrassing" because often the testing authorities wouldn't recognize the mistakes until scores were out. "And it's hard to take those back."
Vogell found that testing providers — the companies and organizations that develop and deliver the tests — while acknowledging the problems, also pointed to pressure from states and the federal government to report scores as quickly as possible, leaving little time to sort out and fix blunders.
On top of that, reported Vogell, states have been hard-pressed to fund the level of professional staffing needed to oversee the tests and the testing processes, often leaving "contractors to police themselves."
The impacts of these problems on students "have never been greater," Vogell noted. "Tests drive decisions about who wins a scholarship, enters a coveted gifted program, attends a magnet school, or moves to the next grade. Teachers and principals lose their jobs because of bad scores. A school tarred with them can attract a state takeover."
Computer-Based Assessments No Panacea
Although most of the statistical analysis she and her assistant, Ph.D. candidate Kate Fink, performed involved data generated from paper-based tests, she doesn't believe the broader shift to online testing will be any kind of "a panacea."
For example, in one of her upcoming articles, Vogell will be reporting on a little-publicized problem that surfaced in the state of Mississippi, calling it "one of the worst mistakes" she wrote about. It involved online testing.
In that situation, the state required students to pass a biology test in order to graduate. For those who failed it, there was an online retest. When the test was being developed, however, "a subcontractor inadvertently screwed up some computer programming," she noted. When somebody used a mouse to choose a particular part of a diagram as the answer to one question, it would record the wrong response.
"That happened for four years undetected," Vogell said. "There were 126 kids that failed the test because of that one question being scored incorrectly."
Five of those students ended up dropping out without degrees, she added — all because of the biology test error. "Once they discovered the error last year, those five kids got their degrees. But it really wreaked havoc on a lot of kids' lives."
"There are just a million things that can go wrong," she said. "Because it's such high stakes, each child's answer sheet is really important. Having a problem that affects even a limited number of answer sheets can really be a big deal for a certain number of people."
Common Core Assessments: Will New Problems Spring Up?
As the online assessments being developed for the Common Core State Standards surface in states, said Vogell, that new format "will fix certain things for sure and certain mistakes that have been made." However, she added, "There are all kinds of things that can go wrong with the technology. A lot of the technology is new, and it's being piloted and run very quickly. It's such a dynamic environment."
She quoted the president of one testing company who told her, "Every time we make an advancement in testing, it's wonderful." At the same time, "glitches" will bubble up and everybody will need "to be patient and work through them."
Besides patience, however, something else may be required: sorting out the overall value of summative assessments and how they'll be applied in the school and personal lives of students.
"The world is an imperfect place in every way," Vogell observed. "Certainly we can demand very high levels of quality control, and we can expect it and we can receive it in certain endeavors." She refers to air travel as an example. "With focus and resources" and oversight and regulation, rarely do serious problems happen.
"The question that follows that is, do we want to put our money into that? Do we want every test to be 30 [or] 40 bucks a kid versus five or 10 bucks a kid? Is that really where we want the resources in education to go? Or do we need to re-evaluate the uses of these tests and look at the policies that can make these errors become so consequential when they happen?"