Tom Loveless

More Evidence that the CA Math Framework Cites Flawed Research

In Education Next, I criticized the proposed California Math Framework for basing recommendations on bad evidence. The State of California now seems to agree that using a particular assessment, a collection of math tasks known by the acronym MARS, to evaluate student achievement is unwarranted. In May, the California State Board of Education considered and rejected the test when issuing a List of Valid and Reliable Assessments required by California Education Code Section 47607.2.

The Education Next article singled out a study of a Youcubed summer math camp that claimed to increase student achievement by 2.8 years after 18 days of instruction. Jack Dieckmann of Youcubed offered a rebuttal to my criticism and I responded, with both statements published in Education Next. Two paragraphs from my response summarize the argument.

“I focused on outcomes measured by how students performed on four tasks created by the Mathematical Assessment Research Service (MARS). Based on MARS data, Youcubed claims that students gained 2.8 years of math learning by attending its first 18-day summer camp in 2015. Dieckmann defends MARS as being “well-respected” and having a “rich legacy,” but offers no psychometric data to support assessing students with the same four MARS tasks pre- and post-camp and converting gains into years of learning. Test-retest using the same instrument within such a short period of time is rarely good practice. And lacking a comparison or control group prevents the authors from making credible causal inferences from the scores.”

Is there evidence that MARS tasks should not be used to measure the camps’ learning gains? Yes, quite a bit. The MARS website includes the following warning: “Note: please bear in mind that these materials are still in draft and unpolished form.” Later that point is reiterated: “Note: please bear in mind that these prototype materials need some further trialing before inclusion in a high-stakes test.” I searched the list of assessments covered in the latest edition of the Buros Center’s Mental Measurements Yearbook, regarded as the encyclopedia of cognitive tests, and could find no entry for MARS. Finally, Evidence for ESSA and What Works Clearinghouse are the two main repositories for high-quality program evaluations and studies of education interventions. I searched both sites and found no studies using MARS.

In the latest version of the framework, released near the end of June, references to the summer camps have been removed. But references to another Youcubed study using MARS data remain. The framework cites a study (Boaler and Foster, 2021) to endorse heterogeneous grouping in middle school, reproducing two figures (see Figs. 9.1 and 9.2) to document the claim that students in detracked, heterogeneously grouped middle schools outperformed students grouped by ability, asserting a gain “equivalent to 2.03 years of middle school growth.”[i]

Since I published the Education Next article, additional evidence has emerged that calls into question using MARS tasks to evaluate achievement gains. On May 18, 2023, the California State Board of Education considered and rejected the assessment (also known as MAC/MARS from its use by the Silicon Valley Math Initiative) for assessing achievement in charter schools. Interestingly, the assessment review was conducted by WestEd, the same firm that edited the framework over the past year.

MAC/MARS failed the first step in the review: consideration of technical quality. That step applied four criteria, including validity and reliability (see page 17 of May Item 2 documentation). MARS did not meet state standards for technical quality.

Tomorrow, July 12, the Board will vote on the math framework. The Board is now in the strange position of voting on a framework that uses as supporting evidence results from an assessment that the Board itself rejected in May.



[i] Mike Lawler (@mikeandallie on Twitter) has identified numerous flaws in the Boaler and Foster (2021) study.