Research Asks If Machine Learning Can Make Better Enrollment Predictions

Could artificial intelligence help school districts better predict their fall enrollments? That was the question a team of researchers asked on behalf of the School District of Philadelphia.

Philadelphia's school system, like many others, offers residents school choice. Families may choose to send their students to the neighborhood school or to another one elsewhere in the city, including public charters or private schools. That flexibility poses a planning problem: Each year schools have no idea about the size of their incoming cohorts. While spring planning may tell district and site leaders that students intend to go to their local school, in the intervening months families may choose to send their kids elsewhere.

The district allocates resources in March for the upcoming school year and then reallocates resources as needed after October 1, once it has taken measure of discrepancies between the prospective spring enrollment and actual attendance in the fall, a process the district calls "leveling." The October leveling process can result in the elimination of classrooms, reassignment of students to different teachers and shifting of teachers to different schools or grade levels. This kind of disruption can have a big impact on classroom engagement for students.

To see if there was a better way to plan for fall enrollment, the district turned to the Regional Educational Laboratory Mid-Atlantic, part of the federal Institute of Education Sciences network. In a project administered by Mathematica, researchers developed three machine learning algorithms to predict student enrollment and compared their results with those of a "simple regression model," which involves figuring out which variables matter and which ones don't, and then applying the ones with impact to the calculations.

The AI didn't do any better. According to a recently published report and infographic, all four methods had similar accuracy, differing from actual fall cohort sizes by about six students, on average, including across schools "with larger proportions of black students, economically disadvantaged students and English learners." Each method resulted in the need to reallocate between 20% and 30% of students to different teachers in the October recount.
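To make the comparison concrete, here is a minimal sketch of what a regression baseline and the "differing by about six students, on average" measure could look like. It is an illustration only, not the researchers' model: it uses a single predictor (spring enrollment) instead of the report's 259, every number in it is made up, and the function names are hypothetical.

```python
from statistics import mean


def fit_simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def mean_absolute_error(actual, predicted):
    """Average size of the miss, in students, ignoring direction."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Made-up cohort sizes for illustration; not data from the report.
# Earlier years are used to fit the model.
train_spring = [60, 75, 90, 110, 130, 82, 97]
train_fall = [55, 74, 84, 108, 121, 80, 90]

# A later year is held out to check the predictions.
test_spring = [70, 95, 120]
test_fall = [66, 93, 111]

intercept, slope = fit_simple_regression(train_spring, train_fall)
predicted_fall = [intercept + slope * s for s in test_spring]

error = mean_absolute_error(test_fall, predicted_fall)
print(f"Predictions miss actual fall cohorts by {error:.1f} students on average")
```

The same error function could be applied to the predictions of any other method, which is what allows a regression and a machine learning model to be compared on equal terms.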

While machine learning didn't generate any better results, the research project did come up with some guidance:

  • First, for districts with high rates of mobility, predictions would probably be refined if the schools gathered additional data later in the spring and early summer; and

  • Second, while any of the methods would generate similar predictions, regression models are easier and less expensive for districts to implement.

Also, while the research used 259 predictors from multiple schools across three school years (2016-2017 through 2018-2019), just a handful of variables really make a difference:

  • The current enrollment for each grade, which had a predictor importance of 0.97;

  • The number of students with more than five in-school suspensions (0.91);

  • The number of students with more than five out-of-school suspensions (0.83); and

  • The number of students with fewer than six absences (0.65).

Finally, the researchers offered a caveat: The results may not be applicable in years to come since COVID-19 has wreaked havoc. The pandemic, the report stated, "might have fundamentally altered patterns of attrition and new entrants in a way that models based on historic data are unable to capture."

Details are openly available on the Regional Educational Laboratory website.

About the Author

Dian Schaffhauser is a former senior contributing editor for 1105 Media's education publications THE Journal, Campus Technology and Spaces4Learning.
