Automated Engines Score Essays Like Humans

A study from PARCC found that computer-assigned essay scores matched those of human graders across a variety of performance metrics.


The Partnership for Assessment of Readiness for College and Careers (PARCC), a consortium of states working to create common K-12 assessments in mathematics and English language arts aligned with the Common Core State Standards Initiative, recently released a report on the viability of computer-scored essays. The study was conducted in 2014 and published in 2015, but the report was not widely available to the public until the Parent Coalition for Student Privacy and other parties wrote a letter to state commissioners urging PARCC to make the report more widely available.

Pearson Education and Educational Testing Service (ETS) jointly participated in the research study to test and compare PARCC's automated scoring against human scoring. The study covered 75 prompts spanning multiple grade levels and task types. Pearson and ETS first trained their scoring engines on the prompts using corresponding human-scored responses. They then fed an unseen set of student essays to their scoring engines and compared the results to human scores on the same set. Performance was evaluated by grade level, trait and type of prompt. The study revealed that, on average, the automated scoring engines matched the human scorers, with only grade 3 essays scoring slightly below the human benchmark.
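The report does not specify which agreement statistics were used, but a standard way to compare machine and human essay scores is to compute exact-agreement rate and quadratic weighted kappa on a held-out set. The sketch below is illustrative only: the score scale (0-4) and the sample scores are assumptions, not data from the study.

```python
# Hedged sketch: comparing automated scores to human scores on a held-out set.
# The 0-4 score scale and the sample data are illustrative assumptions.
from collections import Counter

def quadratic_weighted_kappa(human, machine, min_score=0, max_score=4):
    """Quadratic weighted kappa: 1.0 = perfect agreement, 0.0 = chance level."""
    n = len(human)
    levels = range(min_score, max_score + 1)
    span = max_score - min_score
    h_counts = Counter(human)   # marginal score distributions
    m_counts = Counter(machine)
    # Observed disagreement, weighted by squared score distance.
    observed = sum(((h - m) / span) ** 2 for h, m in zip(human, machine)) / n
    # Expected (chance) disagreement from the two marginal distributions.
    expected = sum(((i - j) / span) ** 2 * h_counts[i] * m_counts[j]
                   for i in levels for j in levels) / (n * n)
    return 1.0 - observed / expected

human_scores   = [3, 2, 4, 1, 3, 2, 4, 0]  # hypothetical human scores
machine_scores = [3, 2, 3, 1, 3, 2, 4, 1]  # hypothetical engine scores

exact = sum(h == m for h, m in zip(human_scores, machine_scores)) / len(human_scores)
print(f"exact agreement: {exact:.2f}")   # exact agreement: 0.75
qwk = quadratic_weighted_kappa(human_scores, machine_scores)
print(f"QWK: {qwk:.3f}")                 # QWK: 0.908
```

An engine "matching" human performance is typically read as its human-machine kappa being at least as high as the kappa between two independent human raters on the same responses.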

In the letter, parents and advocates raised concerns about automated scoring, citing the "inability of computers to assess the creativity and critical thought that the Common Core standards were supposed to demand." They wanted more information from PARCC, such as the percentage of computer-scored tests that were re-checked by humans and what happens when machine scores differ significantly from human scores.

Despite the call for more information, several PARCC states will start using scoring engines to judge essays this year, according to an online report. About two-thirds of all student essays will be scored automatically, while one-third will be scored by humans. In addition, about 10 percent of all responses will be randomly selected to receive a second score as a precaution. States can still opt to have all essays hand-scored.
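The 10 percent audit described above amounts to drawing a simple random sample of responses for a second score. A minimal sketch, assuming nothing about PARCC's actual sampling procedure beyond random selection (the response IDs and fixed seed are hypothetical):

```python
# Illustrative sketch of a ~10% random second-scoring audit.
# The response IDs, fraction, and seed are assumptions for demonstration.
import random

def select_for_second_score(response_ids, fraction=0.10, seed=42):
    """Randomly pick ~fraction of responses for an additional score."""
    rng = random.Random(seed)  # seeded so the audit sample is reproducible
    sample_size = round(len(response_ids) * fraction)
    return sorted(rng.sample(response_ids, sample_size))

responses = list(range(1, 101))  # 100 hypothetical response IDs
audit = select_for_second_score(responses)
print(len(audit))  # 10 of the 100 responses get a second score
```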

About the Author

Sri Ravipati is Web producer for THE Journal and Campus Technology. She can be reached at [email protected].
