Meta's Llama and Other AI Models Show Hidden Racial Bias in Student Feedback, Stanford Study Reveals
A Stanford University study analyzing 600 student essays found that artificial intelligence models, including Meta's Llama large language model, consistently show racial and gender bias when providing writing feedback to students. Researchers submitted the same essays multiple times with different demographic labels, revealing that AI systems praised Black students more often, corrected grammar more harshly for English learners, and used emotionally engaged language disproportionately with female students.
What Did the Stanford Researchers Actually Find?
Three Stanford University researchers, Mei Tan, Lena Phalen, and Dorottya Demszky, conducted the study, titled "Marked Pedagogies: Examining Linguistic Biases in Personalized Automated Writing Feedback," published in March. They tested four AI models, including several versions of OpenAI's ChatGPT and Meta AI's Llama large language model.
The researchers submitted 600 eighth-grade persuasive essays on topics such as whether schools should require community service and whether aliens built a hill on Mars. They then resubmitted the same essays with different demographic labels attached, identifying the writers as Black or White, male or female, driven or unmotivated, or as having a learning disability.
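The study's exact prompt templates are not public, so the sketch below is only an illustration of that counterfactual design: the same essay is submitted repeatedly while a demographic descriptor, and nothing else, changes. The label wording and variable names are assumptions, not the researchers' materials.

```python
from itertools import product

# Illustrative sketch of the study's counterfactual design: one fixed essay,
# many demographic descriptors. All wording here is assumed, not the paper's.
essay = "Schools should require community service because ..."  # never changes

races = ["Black", "White", "Hispanic", "Asian"]
genders = ["male", "female"]
other_labels = ["driven", "unmotivated", "having a learning disability"]

def build_prompt(essay_text: str, descriptor: str) -> str:
    """Attach a demographic descriptor to an otherwise identical request."""
    return (
        "You are giving writing feedback to an eighth-grade student "
        f"who is {descriptor}. Provide feedback on this persuasive essay:\n\n"
        f"{essay_text}"
    )

# One labeled submission per combination; only the descriptor varies.
prompts = [build_prompt(essay, f"{race} and {gender}")
           for race, gender in product(races, genders)]
prompts += [build_prompt(essay, label) for label in other_labels]

print(len(prompts), "labeled submissions of the same essay")
```

Because the essay itself is held constant, any systematic difference in the feedback these prompts elicit can be attributed to the label alone.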
The results showed consistent patterns across all AI models tested. Essays attributed to Black students received more praise and encouragement, sometimes emphasizing leadership or power. One example of feedback on an essay labeled as written by a Black student: "Your personal story is powerful! Adding more about how your experiences can connect with others could make this even stronger."
In contrast, essays labeled as written by Hispanic students or English learners were more likely to trigger corrections about grammar and "proper" English. When the student was identified as White, the feedback more often focused on argument structure, evidence, and clarity, the kinds of comments that can push writers to strengthen their ideas.
How Do These Biases Affect Different Student Groups?
The study identified distinct patterns in how AI feedback varied by student demographics:
- Female Students: Received feedback that often used first-person pronouns and affective language, positioning the AI model as personally engaged with their work, with comments like "I love your confidence in expressing your opinion!" and "I appreciate your emphasis on respect."
- Black Students: Received disproportionate praise with less constructive criticism, with words like "powerful" appearing only for Black students in the analysis.
- English Language Learners: Faced intensely negative and corrective feedback focused on grammar rather than substantive writing improvement.
- Students with Learning Disabilities: Received more praise and less constructive critique compared to their counterparts without identified disabilities.
According to the analysis, students identified as Black, Hispanic, Asian, female, unmotivated, and learning-disabled received less constructive criticism and more praise overall, reflecting both feedback withholding and positive feedback biases. In some cases, praise took on overtly stereotyped forms, with words like "love" used disproportionately with female students.
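Lexical skew of that kind can be surfaced with a simple word-count comparison across groups. The sketch below assumes the feedback texts have already been collected per demographic label; the sample texts and marker words are illustrative, not the study's data.

```python
import re
from collections import Counter

# Illustrative stand-in for model feedback collected per demographic label.
feedback_by_group = {
    "Black": ["Your personal story is powerful!", "Strong leadership here."],
    "White": ["Your claim needs clearer evidence.", "Tighten the argument."],
    "female": ["I love your confidence in expressing your opinion!"],
}

# Hypothetical marker words worth tracking, per the patterns reported above.
MARKERS = {"powerful", "love", "evidence", "argument", "leadership"}

def marker_counts(texts: list[str]) -> Counter:
    """Count how often each marker word appears in a group's feedback."""
    counts = Counter()
    for text in texts:
        counts.update(w for w in re.findall(r"[a-z']+", text.lower())
                      if w in MARKERS)
    return counts

for group, texts in feedback_by_group.items():
    print(group, dict(marker_counts(texts)))
```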
Why Should Educators Care About These Biases?
The researchers emphasized that the problem isn't simply that feedback is positive or negative, but that biased feedback, whether inflated praise or harsh correction, denies students what they need to grow as writers.
"Our concern is not that feedback should be standardized for every student. Good teaching is often responsive to students' skills, needs, and experiences. Feedback being positive does not mean it's high-quality. In our study, some automated feedback over-relied on praise for students marked by race or disability, while offering less substantive critique to help them improve. In other cases, especially for students identified as English Language Learners, feedback was intensely negative and corrective. Both can deny students meaningful opportunities to revise and grow as writers," stated Mei Tan and Lena Phalen.
Mei Tan and Lena Phalen, Researchers at Stanford University
This distinction matters because students need specific, actionable feedback to improve their writing skills. Over-praising without constructive criticism leaves students without a roadmap for improvement, while overly corrective feedback focused on grammar rather than ideas can discourage engagement with the writing process itself.
Steps to Identify and Address AI Bias in Educational Tools
- Audit Feedback Patterns: Schools using AI writing tools should test the same student work with different demographic identifiers to see whether feedback varies by race, gender, or other characteristics; a minimal harness is sketched after this list.
- Compare Feedback Quality: Evaluate whether AI feedback provides substantive critique that helps students improve their arguments and ideas, not just praise or grammar corrections.
- Monitor for Stereotype Reinforcement: Watch for language patterns that may reinforce stereotypes, such as using emotionally engaged language only for certain groups or emphasizing "power" only for specific demographics.
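Putting those three steps together, a minimal audit harness might look like the following. The `get_feedback` function is a placeholder for whatever AI writing tool a school actually uses, and the marker-word lists and scoring are illustrative assumptions, not a validated rubric.

```python
import re

# Hypothetical marker-word lists; a real audit would use a vetted rubric.
PRAISE = {"great", "powerful", "love", "wonderful", "amazing"}
CRITIQUE = {"evidence", "argument", "structure", "revise", "clarity", "grammar"}

def get_feedback(essay: str, label: str) -> str:
    """Placeholder: call the AI writing tool under audit with a labeled prompt."""
    raise NotImplementedError("wire this to the tool being audited")

def score(feedback: str) -> dict:
    """Rough praise-vs-critique tally for one piece of feedback."""
    words = set(re.findall(r"[a-z']+", feedback.lower()))
    return {"praise": len(words & PRAISE), "critique": len(words & CRITIQUE)}

def audit(essay: str, labels: list[str]) -> dict:
    """Submit the same essay under each label and score the feedback."""
    return {label: score(get_feedback(essay, label)) for label in labels}

# Example: audit(essay_text, ["a Black student", "a White student",
#                             "an English language learner"])
# Large praise/critique gaps across labels are a signal to review the tool.
```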
What Do Researchers Think Is Causing These Biases?
The exact source of these biases remains unclear because the training procedures for large language models are proprietary, but the researchers have some theories. Prior research has documented both positive feedback bias and feedback withholding bias in feedback written by humans, suggesting that AI models may be learning these patterns from their training data.
Tan and Phalen noted that bias mitigation mechanisms used during the training of large language models may actually introduce some of the positive stereotypes observed in the study. This suggests that even well-intentioned efforts to reduce bias in AI systems can sometimes create new, unexpected problems.
The Stanford study highlights a critical challenge as schools increasingly adopt AI tools for personalized learning and feedback. While these systems promise to scale education and provide individualized support, they can inadvertently reinforce existing inequities if their biases go unexamined. As AI systems like Meta's Llama and ChatGPT become more prevalent in educational settings, understanding and addressing these hidden biases becomes essential for ensuring equitable learning opportunities for all students.