Computerized Adaptive Testing: Effective Measurement for All Students

Most educators today would agree that the mission of our profession has become more complex and much more challenging than ever before. Changes in district demographics, societal pressures that require more personalized student attention, ever more stringent governmental demands, and constant budgetary concerns all have an impact on our daily routines. As director of student achievement for the Meridian School District in Idaho, I've seen firsthand how these and other issues, while controversial at times and always challenging, have forced all of us to find new and creative ways to reach what we consider the ultimate goal of education: fostering continuous growth in every student to prepare him or her for the future.

Community Support

The Meridian School District experiences steady growth across the full spectrum of its students - from highly gifted children to kids with special needs. While we benefit from unusually strong community support and a nationally recognized reputation for high student achievement, a large part of our success is due to the quality and amount of data that guides us.

One reason we're successful is that our testing methods ensure more effective teaching, more substantive learning, and better-prepared students. Our initial search in this area, which began seven years ago, led us to partner with the Portland, Ore.-based Northwest Evaluation Association (NWEA, www.nwea.org) to initiate an achievement-level test - first in paper-and-pencil form and later via computer.

Uniquely adaptive, this computerized test automatically presents each student with different items based on ability level and prior responses. When the student answers a question correctly, subsequent questions become more difficult, while incorrect answers lead to easier questions. The tests help eliminate student frustration and boredom, and offer results that provide a solid foundation of quality data delivered in days, not months.
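
To make the adaptive logic concrete, here is a minimal sketch of how such an engine might select items. It is a toy simulation under a simple Rasch model, not NWEA's actual algorithm or item bank, and the crude step-size update stands in for the more sophisticated ability estimation a production CAT would use.

```python
import math
import random

def probability_correct(theta, difficulty):
    """Rasch-model probability that a student with ability `theta` answers
    an item of the given difficulty correctly."""
    return 1.0 / (1.0 + math.exp(-(theta - difficulty)))

def next_item(theta, item_bank, administered):
    """Choose the unused item whose difficulty is closest to the current
    ability estimate - the most informative choice under this model."""
    unused = [item for item in item_bank if item["id"] not in administered]
    return min(unused, key=lambda item: abs(item["difficulty"] - theta))

def run_cat(item_bank, true_ability, test_length=10):
    theta = 0.0                      # every student starts mid-scale
    administered = set()
    for n in range(1, test_length + 1):
        item = next_item(theta, item_bank, administered)
        administered.add(item["id"])
        # Simulate a response; a live test would record the student's answer.
        correct = random.random() < probability_correct(true_ability, item["difficulty"])
        # A correct answer raises the estimate (and the next item's difficulty);
        # an incorrect answer lowers it. The step shrinks as evidence accumulates.
        step = 1.0 / math.sqrt(n)
        theta += step if correct else -step
    return theta

bank = [{"id": i, "difficulty": d / 2.0} for i, d in enumerate(range(-8, 9))]
print(round(run_cat(bank, true_ability=1.2), 2))
```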

Unlike traditional standardized tests that measure a student's status compared to others, computerized adaptive tests (CATs) enable us to track the growth of each student in specific subjects over time. This allows us to see and foster ongoing individualized improvement. In addition to supporting years of strong, continuous growth, the testing system has won unanimous support from teachers, administrators, parents and even students. The test was so successful in our district and many others statewide that Idaho contracted with NWEA to develop our state test, the Idaho Standards Achievement Test (ISAT). This test is a blended solution of a CAT and a fixed-form test designed to meet No Child Left Behind mandates.

New Challenges, New Solutions

Our district has seven Title I schools and our English language learner (ELL) population is increasing - currently including 51 language groups. While the district has the lowest poverty level in the state, poverty rates at individual schools range from 4% to 58%. The district also has a high percentage of special education students - more than 2,900 students (about 11%) follow individualized education programs - largely a result of the multiple group homes located in the area.

Like most school districts, we face the dilemma of striking the right balance in our instructional processes to meet the needs of every student, whatever his or her ability level. Under the NCLB requirement to show adequate yearly progress (AYP), Meridian schools have demonstrated proficiency levels ranging from 85%-94%. However, impressive as that sounds, AYP requirements would force us to focus primarily on the needs of children who have not hit proficiency standards - as few as 6% of our students. That leaves a large number of students getting less of our attention, which is not acceptable: we cannot ignore the other 94%.

Cathy Thornton, principal of Frontier Elementary School in the Meridian district, sees the growing diversity of her students, including special-needs children, as a primary reason testing techniques had to change. Frontier has a disproportionate number of special-needs students for our district: 100 of its 650 students (15%) are classified as ELL, and 41% are on the free and reduced-price lunch program.

"In the past, we were able to simply 'shoot for the middle' and hit most everybody's needs," says Thornton. "Today, though, needs are greater, and we must have more information in order to pinpoint the areas of instruction of real value to each student." Thornton relies on the CAT data to provide that information.

CAT data can also help guide instruction for gifted students. Traditionally, educators have delivered instruction to the largest population in the classroom - the average student - with the expectation that very bright students would continue to thrive and progress on their own. We now know that it's the top percentile of the student population - the truly gifted - who oftentimes don't display adequate growth simply because they're not introduced to new and challenging material.

Christine Lawrence, a gifted facilitator for Meridian schools, concurs: "Studies have shown that most gifted kids actually show negative growth at the high ends, which isn't surprising. It's difficult to grow at the high end because there's not much you don't know, and what you don't know is pretty complicated." She relies on the CATs to identify individual strengths and weaknesses, as well as to recognize areas in which gifted students can be challenged.

Making Sure No Child Gets Left Behind

Being able to identify and meet the needs of all students is why our district began using computerized adaptive testing several years ago. This use has resulted in a huge paradigm shift in our methods of instruction and buildingwide systemic changes, which have accelerated over the last three years. The test data is part of our ongoing, ever-evolving way to get a more complete view of our students and their potential for growth - an objective that requires solid information.

My metaphor to explain the type and value of information we must have is simple: it's like looking at data through four very different lenses. First, there's the achievement lens, which reflects where a student, class or district ranks on a scale compared to the norm. Then there's the proficiency lens, the tool that's used to determine grade-level requirements and standards. The third, and most powerful, lens is the growth lens, which reveals the progress a student makes from year to year and helps assess the need for instructional modifications. CAT data serves all three of these purposes. The fourth lens is the instructional lens, which focuses on the relative strengths and weaknesses of the student as measured by the subtests.
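
One way to picture the four lenses is as a single per-student record that carries all four views of the same testing data. The sketch below is illustrative only; the field names and score values are hypothetical rather than NWEA's reporting format.

```python
from dataclasses import dataclass, field

@dataclass
class StudentDataView:
    percentile_rank: int          # achievement lens: standing against the norm group
    proficiency_level: str        # proficiency lens: standing against grade-level standards
    fall_score: float             # growth lens: scale score at the start of the year...
    spring_score: float           # ...and at the end, so growth = spring - fall
    subtest_scores: dict = field(default_factory=dict)   # instructional lens: strengths/weaknesses

    @property
    def growth(self) -> float:
        return self.spring_score - self.fall_score

student = StudentDataView(
    percentile_rank=62,
    proficiency_level="proficient",
    fall_score=211.0,
    spring_score=219.0,
    subtest_scores={"comprehension": 223.0, "vocabulary": 214.0},
)
print(student.growth)   # 8.0 scale-score points of growth this year
```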

With this combination of data, we have a complete view of each child. We can also set growth targets measured throughout each school year, as well as across grade levels and years, to determine whether a student is meeting or exceeding his or her personal goals. Of course, ensuring success requires different types of instruction depending on the student. With special-needs children, for example, you need to combine skill instruction at their particular instructional level with grade-level instruction every day. Thus, a child who might be far behind can achieve more than a year's worth of growth (through accelerated learning) to reach his or her true target.
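
The arithmetic behind such a catch-up target is simple. The numbers below are made up for illustration, not district benchmarks:

```python
def catch_up_target(current_score, grade_level_score, typical_annual_growth, years_to_catch_up):
    """Annual growth a below-grade-level student needs in order to reach
    grade-level performance within the given number of years."""
    gap = grade_level_score - current_score
    return typical_annual_growth + gap / years_to_catch_up

# A student 12 scale-score points behind, in a grade where typical growth is
# 8 points a year, needs 14 points a year to close the gap within two years -
# well more than a "normal" year's worth of growth.
print(catch_up_target(current_score=198, grade_level_score=210,
                      typical_annual_growth=8, years_to_catch_up=2))
```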

Testing for Better Results

Starting in spring 2002, the entire state of Idaho moved to the new "blended solution" ISAT. For the districts that had been using NWEA's CAT to supplement the former state test, this meant they would have less testing while still getting the same quality and depth of data.

The ISAT, delivered via computer, comprises two parts. The first is a fixed-form, grade-specific test that has traditionally been used to provide information aligned to grade-level content and achievement standards as required by NCLB regulations. Following that is a CAT that presents items to each student based on his or her ability level. This portion of the test is based on NWEA's computerized "Measures of Academic Progress" assessment system, which is designed around an equal-interval growth scale to measure students accurately across the full ability range.
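
"Equal-interval" means that the same score difference represents the same amount of growth anywhere on the scale. One common way to build such a scale is a linear transform of the underlying ability estimate; the slope and intercept below are illustrative placeholders, not NWEA's published scale constants.

```python
def to_scale_score(theta, slope=10.0, intercept=200.0):
    """Map a logit ability estimate onto an equal-interval reporting scale.
    Because the transform is linear, a 5-point gain reflects the same amount
    of growth at the low end of the scale as at the high end."""
    return intercept + slope * theta

for theta in (-1.5, 0.0, 1.5):
    print(to_scale_score(theta))   # 185.0, 200.0, 215.0 - evenly spaced
```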

To measure the effectiveness of this test format, NWEA gathered research comparing student results on the fixed-form portion of the test with the computerized adaptive section. Conclusions from the analysis came as no surprise to me or our teachers. While the fixed-form portion of the test, like most measures developed for NCLB, provides adequate data on midrange students, it fails to effectively measure students at the high and low ends of the ability spectrum. Gifted children aren't presented with questions that really challenge them, so they lose interest. This leads to the "Smart Kids Left Behind" dilemma: if we don't get a clear picture of what these students need to grow academically, they may not get the resources necessary for their continued progress.

Children at the other end of the spectrum (i.e., those working below grade level) tend to become discouraged by questions too difficult for their abilities. Among both of these groups, we see students tuning out either due to frustration or boredom. However, when students move on to the computerized adaptive portion of the test - with items individualized to student ability levels - both strugglers and advanced learners re-engage.

Therefore, it comes as no surprise that the computerized adaptive portion of the test provides four times more information about how to help each child, thereby enabling highly informative, individualized decision-making. Equally precise for all students, CATs also provide characteristics of performance; for example, not just high or low ratings in reading, but high or low ratings in reading comprehension. Our bottom-line findings are that the adaptive portion of our current test process can effectively measure 99% of the students in our district.
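
The statistical intuition behind that gain in precision is straightforward: under an item response model, an item tells you the most about a student when its difficulty sits near the student's ability. The sketch below uses a simple Rasch model for illustration; the ratio it prints is not NWEA's "four times" figure, just a demonstration of the effect.

```python
import math

def item_information(theta, difficulty):
    """Fisher information of a Rasch item: p * (1 - p), which peaks when
    the item's difficulty matches the student's ability."""
    p = 1.0 / (1.0 + math.exp(-(theta - difficulty)))
    return p * (1.0 - p)

student_ability = 2.0   # a student working well above grade level
adaptive_item = item_information(student_ability, difficulty=2.0)   # matched to the student
fixed_item = item_information(student_ability, difficulty=0.0)      # pitched at grade level
print(round(adaptive_item / fixed_item, 1))   # about 2.4x the information per item
```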

Improving Education

Meridian's years-long effort to implement data-driven methods based on the adaptive tests' ability to inform and accelerate learning is just one of the reasons our district has become known as the "Continuous Improvement District." Historically proactive, Meridian has established what we call a "Site Improvement Plan" for each school in the district, now in place three or four years ahead of the state's 2005 deadline for such a process. The improvement process, aimed at strong and ongoing student growth, is relatively simple:

  • Analyze data from the previous year to determine where improvements need to be made;
  • Set goals and develop action plans to implement the goals;
  • Analyze the current year's data to see if the changes made have led to improvements;
  • Review progress made; and
  • Start the cycle over again.

One thing we know for sure is that without the data garnered from our current computerized adaptive testing, Meridian's process for site improvements would not be possible. As we discovered years ago, traditional models of testing like the Iowa Test of Basic Skills may work to rank and sort students, but they do not work well to inform instruction.

Lawrence has seen dramatic changes in Meridian's gifted program due to growth-focused testing and data retrieval. "I can look at any of my kids and tell you where their strengths and weaknesses are based on the test," she says. "It's a powerful tool we're able to use to direct instruction, especially for gifted kids."

Special education students benefit too, according to Thornton: "We've seen ELL students who started out not speaking any English at all, and in just a year's time they score way above grade level."

From the get-go, our success with level testing gave teachers the power they needed to inform instruction and affect each student's growth. Now we have good, solid data that has led to an effective process - one that monitors proficiency and achievement but is focused primarily on individual growth. In that way, we're now better equipped than ever to meet our ultimate goal: successfully preparing every child for a promising, productive future.

Thornton sums up the changes and improvements in her school and throughout the Meridian district this way: "We started all our work long before NCLB was mandated. What drove us is that we realized we have different students needing different kinds of instruction, which better data has helped us deliver. Our motivation [is the] belief that every student, at every level, needs the right education for where they are right now. In that way, they can all learn and grow."



Adaptive Testing: What Does NCLB Allow?
By Allan Olson, Executive Director, NWEA

The Northwest Evaluation Association began using computers for assessment in the early 1980s. Since then - and especially since 1998 when we developed the current version of our computerized adaptive test (now used in school districts nationwide) and more recently as we developed Idaho's state assessment system - we have been closely monitoring the U.S. Department of Education's position on computerized adaptive testing to meet federal assessment guidelines.

The current No Child Left Behind Act and the regulations that accompany it were designed to measure grade-level progress to determine whether all students in one grade have improved compared to students in the same grade from the previous year. As such, the extensive assessment requirements of the legislation mandate that all students be held to the same grade-level standards, which has sometimes been misinterpreted as precluding computerized adaptive testing.

However, according to Zollie Stevenson Jr., group leader for standards, assessments and accountability at the Department of Education, computerized adaptive testing is acceptable as long as the tests include "a core of items that are aligned with both the content and the achievement standards for the grade in which the students are enrolled."

Idaho's model is a blended approach that includes the core items mandated by the regulations, as well as an adaptive element that provides additional information that may be used by teachers and schools with little extra expense of student and teacher time. The blended approach meets both the letter of the law (i.e., to provide the kind of data required) and the spirit of the law (i.e., using test data in efforts to help all students grow).

If we are to "build the mind and character of every child, from every background, in every part of America," as President George W. Bush envisioned when he signed the law, we need to know as much as possible about each student, and we need to concentrate on helping each student grow to the standards and beyond.

"What Idaho is attempting to do in the blended model seems to be working," says Celia Sims, assistant in the Education Department's Office of Elementary and Secondary Education. "Idaho has a long history of using the adaptive testing for other purposes in the classroom, which we all agree are important. The blended model allows them to meet their teacher and school needs and the NCLB needs simultaneously in one instrument that's not extremely long."

Now, more than two years since the NCLB mandates went into effect, Idaho is working under a compliance agreement with the Education Department. While the state expects to meet all NCLB requirements on time, its assessment system has not yet undergone the rigorous peer-review process that leads to final approval.

As the article, "Computerized Adaptive Testing: Effective Measurement for All Students," on the Meridian School District shows, Idaho is not waiting for the government's final say to maximize the value of the tests. Since 2002 (and well before this for many Idaho districts), the data has been informing instruction statewide. "Idaho found a way to make it work," agrees Sims. "Their blended solution meets the needs."

For more information about NWEA, visit www.nwea.org.

This article originally appeared in the 05/01/2004 issue of THE Journal.
