Tom Loveless

Challenge to Standards at the Classroom Level: Spread of Achievement

Originally published on July 1, 2021.

For the past three decades, setting ambitious academic standards has been one of the most popular tools of school reform.  My new book on the Common Core standards, Between the State and the Schoolhouse, was published by Harvard Education Press in April.  I conclude that Common Core failed and try to explain why the effort floundered, not as an opponent of the Common Core standards, but as a policy analyst who studies school reform.

In K-12 education, standards define what students should know and when they should know it, with the “when” part of the stipulation defined by grade level. Teacher surveys identify two of the biggest stumbling blocks to grade-based standards: 1) classes made up of students with vastly different abilities and prior achievement, and 2) a significant number of students ill-prepared for their grade’s curriculum.  These challenges are similar, but they are not the same problem. The first focuses on the impossible task of teaching the same curriculum to twenty to thirty students whose academic capabilities span several grade levels.  The second focuses on the particular difficulty of teaching, for example, fifth grade math concepts to students still struggling with subtraction of whole numbers, a second grade skill.

A teacher whose entire class functions two to three years below their grade’s expectations faces the impossible task of bringing them all up to grade level.  The restricted range of achievement in the classroom will not make that task much easier.  Conversely, a teacher with a class full of students functioning two to three years above grade level will find that students already know all of the learning defined as essential for that grade.  Teaching the grade-level curriculum is a waste of time. Both of these teachers need curricular wiggle room, the freedom to deviate from pre-ordained grade-level materials.

People who write and talk about standards casually dismiss how the spread of student achievement threatens standards’ effectiveness. Let’s examine the magnitude of this problem with data.

Table 1 displays National Assessment of Educational Progress (NAEP) scores from the 2019 assessment.  NAEP is one of a few valid sources for estimating the academic performance of American students. Math scores are in the shaded rows, reading scores in the unshaded rows.  The center column, 50th percentile, represents the median score, the middle point in the distribution.  In 4th grade math, for example, about half of the students scored below 242 and half scored above it (see Table 1).  It’s reasonable to consider the 50th percentile as the typical student’s score in any particular subject and grade. The 25th and 75th percentiles define the bottom and top quartiles of performance.  One-fourth of fourth graders scored below 220 on the math assessment and another fourth scored above 262.

Analysts use 10 NAEP scale score points as a ballpark estimate of a single year’s worth of learning. Notice that the difference between the 4th and 8th grade 50th percentile scores in math is 40 points, exactly what one would expect for two grades separated by four years.  The difference in reading is 41 points.

Also notice the differences within the same grade.  In 4th grade reading, the 25th percentile score is 197, which is 28 points below the grade-level median of 225.  That means about one-fourth of fourth graders read nearly three grade levels below what is typical for that grade, in other words, at a level comparable to a first grader’s.  These are students who struggle with reading and have trouble decoding words. At the other end of the distribution, the 75th percentile score (248) is 23 points higher than the median fourth grade score, a difference of more than two grade levels. For these students, as for typical sixth graders, decoding is no longer a problem; they read fluently and grasp sophisticated nuances in text.
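The arithmetic behind those grade-level comparisons can be sketched in a few lines of Python. This is only an illustration of the 10-points-per-year rule of thumb applied to the 4th grade reading scores quoted above; the function name `grade_level_gap` and the dictionary keys are made up for this example, and the rule itself is a ballpark estimate, not an official NAEP conversion.

```python
# Rule of thumb cited in the text: about 10 NAEP scale points per year of learning.
POINTS_PER_GRADE = 10


def grade_level_gap(score, median, points_per_grade=POINTS_PER_GRADE):
    """Approximate distance from the median, expressed in grade levels."""
    return (score - median) / points_per_grade


# 2019 NAEP 4th grade reading scores quoted in the text.
reading_g4 = {"p25": 197, "p50": 225, "p75": 248}

low_gap = grade_level_gap(reading_g4["p25"], reading_g4["p50"])
high_gap = grade_level_gap(reading_g4["p75"], reading_g4["p50"])

print(f"25th percentile: {low_gap:+.1f} grade levels from the median")
print(f"75th percentile: {high_gap:+.1f} grade levels from the median")
print(f"Middle half of students spans about {high_gap - low_gap:.1f} grade levels")
```

The 25th percentile lands 2.8 grade levels below the median and the 75th percentile 2.3 above it, which is the basis for describing each gap as roughly 2-3 grade levels.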

Don’t forget, too, that the 25th and 75th percentile scores sit on the innermost boundaries of their respective quartiles.  Put simply, NAEP data indicate that about half of the nation’s fourth graders function within 2-3 grade levels of what is typical for that grade, with those students distributed over a span of six grade levels. The other half function outside of that range!

The 8th grade dispersion of math scores is slightly larger than in 4th grade.  The 25th and 50th percentiles are separated by 27 points; the difference is also 27 points for the 50th and 75th percentiles.  Let’s describe those differences as about 2-3 grade levels. That suggests one half of all eighth graders perform within a six grade level span, from fifth to eleventh grade. And half of all eighth graders perform beyond those limits.

Sources of Variation

Some readers might be thinking, well, NAEP data depict the spread of achievement nationally, but what about in schools?  Surely, the spread of achievement will be narrower. In fact, it is narrower, but not by much.  One of the surprising findings of the 1966 Coleman Report was that a decomposition of the variance of student achievement found 87% located within schools and only 13% residing between schools. More recent studies have found the same. Kane and Staiger (2002) calculated, based on a large data set from North Carolina, that the within-school variance in student test scores was about 90% of the test-score variance of all students. They point out an important implication of the phenomenon: that two fourth grade students randomly drawn from the same school will differ only slightly less than two fourth graders drawn randomly from the entire population.
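For readers unfamiliar with what "decomposing the variance" means, here is a minimal sketch in Python. It uses synthetic scores, not Coleman's or Kane and Staiger's data: the number of schools, the class size, the mean of 240, and the standard deviation of 30 are all invented for illustration, and the 13% between-school share is borrowed from the text only to set up the simulation. The function `variance_shares` is a hypothetical helper that assumes every school has the same number of students.

```python
import math
import random
from statistics import fmean, pvariance


def variance_shares(schools):
    """Split total score variance into within- and between-school shares.

    `schools` is a list of score lists, one per school. Assumes equal school
    sizes, so total variance = average within-school variance
    + variance of the school means.
    """
    all_scores = [score for school in schools for score in school]
    total = pvariance(all_scores)
    within = fmean([pvariance(school) for school in schools])
    between = pvariance([fmean(school) for school in schools])
    return within / total, between / total


# Synthetic data only: every number below is an assumption for illustration.
random.seed(0)
TOTAL_SD = 30.0        # assumed spread of scores across all students
BETWEEN_SHARE = 0.13   # between-school share reported in the text
schools = []
for _ in range(200):   # 200 hypothetical schools of 50 students each
    school_mean = random.gauss(240.0, TOTAL_SD * math.sqrt(BETWEEN_SHARE))
    student_sd = TOTAL_SD * math.sqrt(1 - BETWEEN_SHARE)
    schools.append([random.gauss(school_mean, student_sd) for _ in range(50)])

within_share, between_share = variance_shares(schools)
print(f"within schools:  {within_share:.0%}")
print(f"between schools: {between_share:.0%}")
```

The two shares always sum to 100 percent. When the within-school share is this large, knowing which school a student attends tells you little about that student's score.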

That’s a statistical “source of variation,” calculated by decomposing test score variance, but what about the more common meaning of “source” that implies an original cause? Perhaps schools create these disparities with their varying curriculums, and a common set of standards would fix that? No, the spread of achievement is present on the first day of kindergarten.  As mentioned in my March 2021 post on the Brookings Chalkboard, NWEA data show that the reading abilities of entering kindergartners range from the level of a typical three year old to an eight year old, about five years.  Longitudinal studies conducted by the U.S. Department of Education (e.g., ECLS-K) have found a similar spread of achievement in kindergarten.


Standards-based reformers place great weight on “alignment,” making sure that curriculum, instruction, assessment, accountability, and other system-wide policies are wedded to standards. Imagine what that means for teachers implementing an aligned curriculum. Let’s assume that a curriculum-student mismatch occurs when skills and concepts are more than two years off from a student’s current level. If so, then a mandated grade-level curriculum is inappropriate for at least half of the students, no matter the grade.

Alignment means locking in that discrepancy for instruction, assessment, and accountability, imposing an inappropriate education on large numbers of students. Consider curriculum and instruction. If local administrators fastidiously follow standards in purchasing textbooks and other instructional materials, many teachers will primarily be on their own in finding good materials for their students. Indeed, a RAND survey of teachers found that teachers using standards-aligned materials (as defined by were more likely to modify them.

One reason was to meet the needs of students below and above grade level.

Teachers reported that students get frustrated when the materials are too challenging and subsequently become less motivated to participate in the learning activities. When materials are not challenging enough, students also disengage and become disinterested. (p. 17)

Teachers sought a better match between students and curricular demands. The survey used the phrase “appropriately challenging” to indicate a good match. On this criterion, standards-aligned materials were considered inappropriate, and those offering differentiation were deemed more usable by teachers.

Teachers using at least one standards-aligned material were less likely (54 percent) to indicate that their materials were appropriately challenging when compared with teachers who were not using any standards-aligned material (76 percent). (p. 20)
Teachers described materials that differentiate content for students of various achievement levels as more usable, thus associating usability with the dimension of challenge. (p. 24)


Standards-based reform dictates curricular objectives on a grade-by-grade basis. But a curriculum wedded to grade-level standards is likely to be inappropriate for about one-half of all students: the 25% who are 2-3 years below grade level and the 25% who are 2-3 years above grade level. Teachers who teach students with such a large range of prior learning deal with the rigidity of standards by modifying teaching materials to adjust to student needs. When doing so, they place their classroom’s curriculum and instruction out of alignment with state standards.