Assessment | September 2012 Digital Edition

As High-Stakes Online Testing Approaches, Will Your iPads Work?

As more schools invest in student iPad programs, one question still unanswered is whether or not those devices can be used for the high-stakes online tests coming in 2014.

Illustration by Anne Kobayashi

This article, with an exclusive video interview and interactive slide show, originally appeared in T.H.E. Journal's September 2012 digital edition.

Imagine a roomful of students taking their end-of-course English exams. One of those hunched over his desk, unbeknownst to test proctors, is outfitted with a slim mobile device. Furtively, he presses a button and silently takes an image of the test page, intending to share it with other students. No, we're not talking about the prestigious Stuyvesant High School in New York City, where 70 students were caught up in a June cheating scandal involving the use of smartphones to share test questions.

This is a plausible scenario that Tony Alpert grapples with as he ponders whether to allow tablet computers--and particularly iPads--to be used for summative testing online. As Alpert, chief operating officer for the Smarter Balanced Assessment Consortium (SBAC), points out, not only would student cheating compromise the validity of the individual student's test event, "worse yet, it could expose elements of the item bank which would be very expensive for the consortium and would compromise the validity of other students' tests."

In other words, once test questions were exposed and possibly made public, the consortium would have to remove them from the pool of available items, and doing so would be expensive--upward of $2,500 per item.

SBAC is one of the two consortia funded by the US Department of Education to manage the development of the next generation of assessments that will measure student progress in math and English language arts. SBAC represents 22 million students in 27 states. The other, the Partnership for Assessment of Readiness for College and Careers (PARCC), covers 25 million students in 23 states and the District of Columbia. Some states participate in both organizations. (PARCC did not respond to requests to be interviewed for this feature.)

To understand what's out there in school districts today in the way of computers to use for online assessments, the consortia jointly contracted with Pearson to develop a technology readiness tool, an open source utility that maintains an inventory of a district's current capacity and compares it with the technology needed to administer the online tests in four areas: devices, ratio of device to test taker, network infrastructure, and staff and personnel. The results of that initial survey effort, which is being performed in every consortium state, district, and school, were expected to be released in August, if not publicly, at least to participating organizations.
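
For readers who want to picture how such a comparison might work, here is a minimal sketch in Python. It is not Pearson's tool: the four areas come from the description above, but the class names, field names, and every threshold are hypothetical placeholders, since no actual requirements are given here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Inventory:
    eligible_devices: int   # devices that meet the (hypothetical) minimum spec
    test_takers: int        # students who need to be tested
    bandwidth_kbps: float   # bandwidth available for testing
    trained_staff: int      # proctors and support staff


@dataclass
class Requirements:
    # All three thresholds are made-up placeholders, not consortium figures.
    max_students_per_device: float
    min_kbps_per_student: float
    min_staff_per_100_students: float


def readiness_gaps(inv: Inventory, req: Requirements) -> List[str]:
    """Return the areas in which a school falls short of the requirements."""
    gaps = []
    if inv.eligible_devices == 0:
        gaps.append("devices")
        gaps.append("device-to-test-taker ratio")
    elif inv.test_takers / inv.eligible_devices > req.max_students_per_device:
        gaps.append("device-to-test-taker ratio")
    # Only as many students as there are devices can test at the same time.
    concurrent = min(inv.test_takers, inv.eligible_devices)
    if inv.bandwidth_kbps < concurrent * req.min_kbps_per_student:
        gaps.append("network infrastructure")
    if inv.trained_staff * 100 < req.min_staff_per_100_students * inv.test_takers:
        gaps.append("staff and personnel")
    return gaps


school = Inventory(eligible_devices=120, test_takers=600,
                   bandwidth_kbps=10_000, trained_staff=4)
needs = Requirements(max_students_per_device=4, min_kbps_per_student=50,
                     min_staff_per_100_students=1)
print(readiness_gaps(school, needs))
```

With these invented numbers, the school has enough bandwidth but falls short on the device ratio (five students per device against an allowed four) and on trained staff.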

But Alpert doesn't need survey results to know the one big question on a lot of people's minds. "If I put my foot out the door, and there happened to be someone from a district, they would ask me, 'Is the iPad going to be allowed?' And every single one of our member states experienced the same question. It was possibly the most common question that the districts asked."

Apple publicly claims that 1,000 schools--both K-12 and higher ed--now have iPad programs in place, and more are coming online every month. So both SBAC and PARCC are considering the challenges of security, usability, and content that might arise when students are taking tests on tablet devices and discussing how these might be solved. What they decide over the next two years may forge the profile of equipment purchases in schools for years to come.

Locking Down a Test
Because the iPad is such a lightweight and portable device, schools may find that they need to handle the tablets with the same care and caution they use with paper-based test books. Of course, the potential problems that come with introducing iPads into high-stakes testing don't stop there.

For example, until Apple announced iOS 6 for the iPad, there was no simple way to lock down the device to limit its use to a single app. A student could simply press the home button to minimize the current application and search the web, capture a screen image, or get to a file stored on the device with the flick of a finger.

Pete Poggione, IT director for the Mattawan Consolidated School District in Michigan, will shortly be adding 60 iPad 2s to his schools' inventory of student devices. This initial iPad cart pilot will target kindergarten through the fifth grade. He was relieved when Apple went public with news about "guided access," a feature in iOS 6 that will allow an administrator to disable the home button and restrict touch input to certain areas of the screen. "If it does what it's supposed to do, this is going to be a game changer for high-stakes testing," Poggione proclaims.

Even before he knew about the guided access feature, though, this hard-core Mac fan was determined to be able to deliver testing on the iPads, no matter the effort. "I would have to come up with some sort of solution," Poggione says. One idea was to maintain a "separate cache of iPads" whose onboard storage would be wiped clean and have only the testing app installed. On top of that, he figured he'd find some third-party app to lock down the browser, put all of the iPads on a specific wireless segment, and lock them down at the firewall so users could go to only one place--the place issuing the test being taken.
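
The network half of that plan boils down to an allowlist rule: traffic from the testing segment may go to the test server and nowhere else. The Python sketch below illustrates only that logic. It is not Poggione's configuration or any firewall's syntax, and the subnet and hostname are hypothetical stand-ins.

```python
import ipaddress

# Hypothetical values: no real addresses or hostnames are given in the article.
TESTING_SEGMENT = ipaddress.ip_network("10.20.30.0/24")
ALLOWED_DESTINATIONS = {"assessment.example.org"}


def is_allowed(source_ip: str, destination_host: str) -> bool:
    """Apply the one-destination rule to devices on the testing segment."""
    on_testing_segment = ipaddress.ip_address(source_ip) in TESTING_SEGMENT
    if not on_testing_segment:
        # This rule governs only the testing segment; other traffic is
        # left to whatever policy the rest of the network uses.
        return True
    return destination_host.lower() in ALLOWED_DESTINATIONS


print(is_allowed("10.20.30.15", "assessment.example.org"))
print(is_allowed("10.20.30.15", "www.example.com"))
```

In this sketch an iPad on the testing segment can reach the assessment host, while a request from the same device to any other site is refused.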

"It would have to take the old-school kind of measures from a network standpoint, as opposed to letting the device take care of it," he notes. "I'm glad to see that Apple is promising something that's supposed to take care of that."

Legacy Devices
Security factors aside, SBAC and PARCC--and the schools they ultimately serve--need to consider other aspects of the testing process, like what kinds of questions are possible on the new tests and what kinds of devices those will work best on. Although the details of how tests will be made available to students and what they will consist of are still being sorted out, one thing is clear: What can't get in the way of the process is the mechanism upon which the student takes the test.

SBAC and PARCC both face a rickety bridge of legacy hardware. While wanting to respect the fact that many districts are burdened with older-generation computing gear, the consortia still want to get schools to the promised land of delivering tests that will by their nature require newer forms of equipment. After all, these are supposed to be "next-generation assessments." That means making a leap over what could simply be delivered on paper.

SBAC, for example, is contemplating a requirement for the 2016-17 school year that students work on tablets--or some kind of touch-based or even motion-sensing interactive device--to get through a portion of the math assessments that will be introduced in that time frame.

"One of the big concerns that we're trying to address is, how do we allow students to express their mathematical reasoning?" Alpert states. "Some students do that with words, some students do that with symbols and equations, some kids do it with pie charts and bar graphs and little X's and whatnot. So for students to be able to describe not only the answer, but also how they derived the answer, requires a degree of flexibility that a mouse really doesn't work too well for, for most students. If we want to capture that information electronically to reduce the costs of scoring and enhance the quality of our assessment--make it even more meaningful--then we have to chart out a pragmatic path towards requiring a device that supports students and allows them to demonstrate that understanding."

The path SBAC is currently following makes tablets optional for now, but that may change whether schools are fully ready or not. To make the shift in direction easier, tablets may be required only for upper grades at first, especially if the testing agencies find out that touch-based devices are more common there than in lower grades, Alpert adds.

SBAC is going even further than that by saying that some specific portion, say 25 percent, of a math test might require a tablet. It's possible that a more traditional form of a test might stop for students when they come to a particular section. At that point, the test administrator would hand them a touch-based device to do that portion of the exam.

"So it's not a one-to-one issue. Every student doesn't need to have (a tablet every moment of testing)," Alpert points out. "We want to create options where we can enhance the meaning of the assessment, but still be aware that districts' budget issues aren't going to ease up any time soon."

Usability Issues
To fully understand the impact of variations in computing devices, SBAC is running usability tests, in which students try out different kinds of test items on different kinds of devices and then are asked a series of questions about their impressions. For example, how important is the type of keyboard made available to the student? If a test taker is expected to type out a response to a question, which one is easier to use: a traditional "mechanical" keyboard or a virtual one? If students are used to composing on paper, asking them to perform during a high-stakes test on a keyboard may put them at a disadvantage.

One company with a stake in the outcome is Pearson. Not only did it get the contract to develop the technology readiness tool for both consortia, it's also vying alongside other contenders for the work of providing the assessment platform that will host and deliver the high-stakes tests to school computers all over the country. Addressing device variations will be an important part of the work for whoever wins that contract.

Bryan Bleil, vice president of online and technology implementation in Pearson's assessment and information division, notes there are several potential issues with keyboards.

First, when the keyboard on a tablet is displayed, it obscures "half or more of the screen." That leaves less room to show the question and potential answers. Two decades of research examining the relative merits of paper versus online tests, he says, show that when students have to scroll around to see the entire question and all of the potential answers, "they perform more poorly than the students who can see the entire thing on one page."

Second, on-screen keyboards on tablets incorporate the use of auto-correct features. Most word processing users enjoy this feature because it corrects their blunders in spelling and grammar. Obviously, students' answers shouldn't be improved upon by auto-correct; but, worse, that same advantage wouldn't exist for non-tablet users.

The third issue deals with keyboarding skills. Using a finger as a pointer instead of a mouse to position a cursor or select a part of a sentence takes a level of dexterity that some children don't have. Plus, the hand can get in the way of what needs to be read on the screen.

Also, the tablet's tap keyboard works differently than a mechanical one. Some people believe that eventually touch-typing could become a lost art and the QWERTY keyboard hieroglyphics, but for now these kinds of differences matter to psychometricians--testing experts. "They're typically doing some kind of comparability adjustment, and the best practice is to redesign your test question so that it works for both," explains Bleil.

SBAC's Alpert adds that if students find the variations "more difficult than it should be, then we may just make it a requirement for mechanical keyboards." That could include mandating a mouse too.

In the end, adds Bleil, while iPads and other tablets are "mostly like desktops and laptops in terms of their interface," it's not a 100-percent match. "It's in those places where it's a little different, where you don't have a mouse in your hand, where you're doing things with your finger instead of a keyboard or a mouse, that you have the greatest potential for differences." Those differences and their impact are where more research needs to be done, he says.

An Awkward Transition
Consortia insiders anticipate an awkward transition period in which some schools will have tablets, others will have traditional computers, and still others will have both. "The summative assessment needs to be delivered in such a way that it doesn't necessarily advantage or disadvantage (any of) those populations," says Pearson's Bleil. Going that route, however, exacts a price: "When you're in that middle space, you're constrained from realizing the full potential of the new medium."

If it's hard to envision how a test on a tablet could vary that much from one on a standard computer, it may be because we're so used to thinking of tests as being a multiple choice kind of activity. Bleil suggests imagining this scenario: A student performs a mix of online and offline activities and then records what's been done using the camera on the tablet and includes that as part of a digital portfolio. "There are a lot of researchers out there who will tell you, there's some really rich and deep assessment that can happen with that." He concedes, however, that it would be a tough approach to pull off in a district with a mixed population of devices.

In the meantime, the consortia face this question about devices: Do they focus on the rearview mirror as they drive, or do they aim for the horizon? Current signs point to a pragmatic mix of both, with at least one of the two organizations showing a strong inclination to cut loose and head toward the blue sky. Instead of crossing that rickety bridge of legacy hardware mentioned earlier, assessments for the present will probably have to straddle the chasm, supporting the older technology in many schools today, while also stretching out to exploit newer technologies and what they have to offer.

It's not just a single road the consortia are driving down either. Just as 32 states and Washington, DC, have received waivers exempting them from the penalties for not meeting No Child Left Behind proficiency standards, states may find themselves pursuing the exemption route for the new assessment exams too. That could give them the extra time they probably always knew they'd need to figure out how finally to fund the big technology purchases the tests will require.

And, while the iPad may be the preferred device today among many users, another vendor could come along with something that would sweep away that dominance. School district budgets drive the buying decisions, which may force some to forgo the iPad and choose from the myriad of more competitively priced Android or Windows 8 tablets, with each vendor iteration introducing its own qualities and quirks. In turn, that will increase the complexity of the job to develop tests that are supportable by districts.

In many ways, Bleil says, the assessments the consortia will eventually introduce and the equipment SBAC and PARCC ultimately recommend "may represent the closest thing we've got to any kind of national technology requirement for education…. The requirements that they publish will be very broadly used and will affect the education market more significantly than many may realize." His recommendation to technology directors? "Watch those requirements carefully."

Is BYOTD (Bring Your Own Testing Device) Realistic?

In a survey earlier this year, the Pearson Foundation found that ownership of iPads had quadrupled, rising from 4 percent in 2011 to 17 percent at the same point in 2012. Is it possible that K-12 students might one day take high-stakes tests on their own devices?

While Tony Alpert, the chief operating officer for the Smarter Balanced Assessment Consortium, doesn't rule it out, he thinks the hurdles would be quite high. First, presuming that the device lockdown and other security issues could be addressed on a student device, trying to distinguish which non-school devices to allow and which ones to reject would place an additional burden on the testing administrator or proctor, "on top of what's otherwise a pretty hefty schedule."

Second, there are risk factors for the consortia to consider. If a district or school decides to give BYOTD a try and then fails in its efforts to secure the test, every single exam item issued to students would be put at risk. Since adaptive testing decreases the likelihood of any two students getting the same exam or the same questions, a very large set of items conceivably could be removed from the pool altogether.
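
A back-of-the-envelope calculation shows why the exposure adds up quickly. The Python sketch below assumes, purely for illustration, that each student's items are drawn uniformly at random from the pool; real adaptive tests choose items based on the student's responses, so this is a simplification. The pool size, test length, and class size are invented. The only figure taken from the article is the cost of upward of $2,500 per item, used here as a lower bound.

```python
COST_PER_ITEM = 2_500  # dollars; the article says "upward of", so a floor


def expected_exposed_items(pool_size: int, items_per_student: int,
                           students: int) -> float:
    """Expected number of distinct items seen by at least one student,
    assuming each student gets a uniform random subset of the pool."""
    chance_one_student_sees_item = items_per_student / pool_size
    chance_nobody_sees_item = (1 - chance_one_student_sees_item) ** students
    return pool_size * (1 - chance_nobody_sees_item)


# Hypothetical scenario: one compromised classroom of 30 students,
# 40 items per test, drawn from a pool of 1,000 items.
exposed = expected_exposed_items(pool_size=1_000, items_per_student=40,
                                 students=30)
print(f"Items at risk: about {exposed:.0f}")
print(f"Minimum replacement cost: about ${exposed * COST_PER_ITEM:,.0f}")
```

Under these assumptions, a single classroom would touch roughly 700 of the 1,000 items, which puts the replacement cost in the neighborhood of $1.8 million. If every student had received the same 40 questions, only those 40 would be at risk.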

But it's possible, Alpert notes, given enough urgency by the states making up each consortium, that one or another could one day take on the challenge of creating a default list of policies to be adopted by a school or district "if they wanted to go that route." However, the policies would have to be general enough "to work for everyone" and specific enough to work for anyone.