Large grade dataset points to AI-driven inflation in unsupervised coursework
A new study from UC Berkeley adds empirical weight to a growing concern in higher education: generative AI may be making grades look better without improving learning. Drawing on more than 500,000 grades from a large selective public research university in Texas, the study found that after ChatGPT launched in November 2022, grades rose sharply in courses with assignments that AI can handle well, especially writing and coding.
The increase was not spread evenly across all course types. According to the study as described in The Decoder, the effect was concentrated in classes where homework counted heavily toward the final grade. That distinction matters. If AI tools were genuinely helping students learn more, researchers would expect gains to appear across assessment types, including proctored exams. Instead, the biggest jump appeared in unsupervised work, a pattern more consistent with AI substituting for student effort.
The size of the shift
The study tracked eight fall semesters, from 2018 through 2025, covering 319 courses across 84 departments. To estimate how exposed each course was to generative AI, the researcher used fall 2022 syllabi, created before ChatGPT existed, and measured the share of assignments centered on writing and coding. Those were the tasks most likely to be affected once widely available AI tools arrived.
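The exposure measure described above can be sketched in a few lines. This is an illustration only: the article does not say whether the share was computed by assignment count or by grade weight, so the version below assumes a simple count. The `Assignment` type, the `exposure_share` function, and the sample syllabus are all hypothetical.

```python
from dataclasses import dataclass

# Task types treated as AI-exposed, following the article's description.
AI_EXPOSED_TYPES = {"writing", "coding"}


@dataclass
class Assignment:
    name: str
    kind: str  # e.g. "writing", "coding", "exam", "presentation"


def exposure_share(assignments):
    """Fraction of a course's assignments that are writing or coding tasks.

    Assumption: every assignment counts equally. The study may have
    weighted assignments differently.
    """
    if not assignments:
        return 0.0
    exposed = sum(1 for a in assignments if a.kind in AI_EXPOSED_TYPES)
    return exposed / len(assignments)


# Hypothetical syllabus, invented for illustration only.
syllabus = [
    Assignment("Essay 1", "writing"),
    Assignment("Problem set", "coding"),
    Assignment("Midterm", "exam"),
    Assignment("Final presentation", "presentation"),
]

print(exposure_share(syllabus))  # 0.5
```

Under this assumed definition, a course with two writing or coding tasks out of four assignments would have an exposure of 0.5.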
The post-ChatGPT change was substantial. In courses with a high share of writing and coding assignments, the share of A grades increased by 13 percentage points, a relative gain of roughly 30 percent over the 2022 baseline. Average GPA rose by 0.12 points. At the same time, the grade distribution narrowed: students who might previously have received an A-minus or B-plus increasingly ended up with a straight A.
That is a notable pattern because it suggests not only higher average performance on paper, but also reduced differentiation between students. In practical terms, grades may be becoming less informative as signals of who mastered the material most strongly and who merely completed the work acceptably.
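The two headline figures, a 13-percentage-point increase and a relative gain of roughly 30 percent, together imply an approximate baseline. The article does not state that baseline directly, so the calculation below is a back-of-the-envelope inference rather than a reported result, and the function name `implied_baseline` is introduced here only for illustration.

```python
def implied_baseline(pp_increase: float, relative_increase: float) -> float:
    """Baseline share (in percent) implied by a point increase and a relative increase.

    If share_after = baseline + pp_increase and
    pp_increase = relative_increase * baseline,
    then baseline = pp_increase / relative_increase.
    """
    return pp_increase / relative_increase


PP_INCREASE = 13.0        # percentage points, as reported in the article
RELATIVE_INCREASE = 0.30  # "roughly 30 percent", as reported in the article

baseline = implied_baseline(PP_INCREASE, RELATIVE_INCREASE)
after = baseline + PP_INCREASE

print(f"implied baseline share of A grades: about {baseline:.0f}%")  # about 43%
print(f"implied share after the increase: about {after:.0f}%")       # about 56%
```

Because "roughly 30 percent" is itself rounded, these values should be read as approximate: on these numbers, A grades in highly exposed courses would have gone from a bit over four in ten to a bit over half.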
Homework, not exams, appears to be driving the change
The study’s most important contribution may be its attempt to distinguish learning gains from outsourced work. The researcher examined how much homework contributed to final course grades. If AI was helping students understand material better, improvements should have appeared whether a class relied on homework or on in-person exams. If, by contrast, students were using AI to complete assignments directly, the strongest effects should appear where unsupervised work carries the most weight.
That second scenario is what the data favored. In courses where homework accounted for more than the median share of the final grade, the rise in A grades was 16 percentage points larger than in lower-homework courses with the same level of AI exposure. In courses where homework mattered less, the effect was small and not statistically significant.
That pattern is difficult to explain as a broad increase in student learning alone. It points instead to a structural vulnerability in how many courses are designed: when grades depend heavily on take-home writing or coding tasks, AI systems can now perform enough of the work to reshape the grading distribution.
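The comparison behind the homework result can be illustrated with a simple difference-in-differences calculation. This is a simplified sketch, not the study's actual specification, which the article does not describe in detail. Exposure is treated here as a binary high/low split, and every cell value is hypothetical, chosen only so that the final gap reproduces the 16-point figure reported above.

```python
def diff_in_diff(pre_treated, post_treated, pre_control, post_control):
    """Change in the treated group minus change in the control group."""
    return (post_treated - pre_treated) - (post_control - pre_control)


# HYPOTHETICAL shares of A grades in percent. These are NOT from the study.
# "exposed" = high share of writing/coding assignments (assumed binary split).
high_homework = {
    "pre_exposed": 40.0, "post_exposed": 60.0,
    "pre_unexposed": 40.0, "post_unexposed": 42.0,
}
low_homework = {
    "pre_exposed": 40.0, "post_exposed": 44.0,
    "pre_unexposed": 40.0, "post_unexposed": 42.0,
}


def exposure_effect(cells):
    """AI-exposure effect within one homework group, net of the general trend."""
    return diff_in_diff(
        cells["pre_exposed"], cells["post_exposed"],
        cells["pre_unexposed"], cells["post_unexposed"],
    )


effect_high = exposure_effect(high_homework)  # 18.0
effect_low = exposure_effect(low_homework)    # 2.0

# The gap between the two effects is the "additional" rise in
# homework-heavy courses at the same level of AI exposure.
print(effect_high - effect_low)  # 16.0
```

Subtracting the unexposed courses' change removes any general upward drift in grading, and comparing the two homework groups then isolates how much of the exposure effect depends on unsupervised work.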
A placebo test strengthens the case
The study also included a useful comparison. Oral presentation assignments, where current AI tools are less directly helpful, did not show the same grade inflation effect. That placebo test does not prove causation by itself, but it strengthens the interpretation that assignment format matters, and that the observed changes are closely tied to the kinds of work generative AI can complete or significantly assist with.

In other words, this was not simply a campus-wide drift toward easier grading after 2022. The increase was aligned with the specific domains where ChatGPT-like systems are most capable.
Why this matters beyond one university
Universities have dealt with grade inflation for decades. What makes this moment different is that generative AI may be accelerating the process in a way that undermines one of the basic functions of assessment. Grades are supposed to communicate something about performance, knowledge, and relative mastery. If AI allows many students to produce polished homework without proportionate understanding, those signals weaken.
The implications extend beyond transcripts. Employers, graduate schools, scholarship committees, and even instructors in later courses rely on grades as rough indicators of what students can do. If an A increasingly reflects the quality of AI-assisted output rather than demonstrated competence, the credibility of that signal erodes.
The study also raises a pedagogical challenge. Writing and coding are not peripheral assignments in modern universities; they are central to how many disciplines teach analysis, problem-solving, and communication. That means institutions cannot simply eliminate the affected formats without changing the substance of education itself. Instead, they may need to redesign assignments, increase in-person or supervised evaluation, or place more emphasis on oral defenses, drafts, process documentation, and other methods that make learning visible.
What the research does not claim
The study, as summarized in the source material, does not claim that all students are misusing AI or that any AI assistance automatically undermines education. Nor does it say that no students are learning more. Some students may well be using AI as a tutor, editor, or debugging aid in ways that support understanding.
But at the aggregate level, the evidence presented here points in a different direction. The strongest grade changes occur where AI can most easily replace unsupervised student work, not where students must independently demonstrate knowledge under controlled conditions.
A warning for the next phase of higher education
Generative AI is now built into the academic environment. The question is no longer whether students have access to it, but how institutions respond. This study suggests that if course design remains unchanged, grades may continue to drift upward while becoming less meaningful.
That does not make the problem purely a matter of student discipline. It is also an assessment-design problem. Universities that want grades to retain value may need to move quickly to separate assistance from substitution and to create more ways for students to show what they can do without outsourcing the core intellectual task.
The broader significance of the study is that it quantifies a change many instructors have suspected since late 2022. The ChatGPT era may not simply be altering how students work. It may be changing what academic grades actually measure.
This article is based on reporting by The Decoder.
Originally published on the-decoder.com








