GitHub's Accessibility Agent Is Catching Code Problems Before They Ship: Here's How It Works
GitHub has deployed an experimental accessibility agent that automatically reviews pull requests and catches common accessibility barriers before code goes live. The agent has reviewed 3,535 pull requests so far, with a 68% resolution rate for issues it identifies. Rather than attempting to solve accessibility in isolation, the agent augments human engineers' efforts by removing friction that would otherwise inhibit use of GitHub's platform for people who rely on assistive technology.
What Accessibility Problems Is the Agent Actually Catching?
The accessibility agent focuses on objective, straightforward issues that can be reliably detected and fixed automatically. According to GitHub's experience, the top five categories of problems the agent encounters are:
- Structure and Relationships: Making structure and relationships clear to assistive technologies so screen readers and other tools can properly interpret the interface
- Control Labels: Providing clear and concise names for interactive controls like buttons and form fields
- Announcements: Surfacing status updates and other dynamic messages so assistive technology users are made aware of them
- Text Alternatives: Ensuring there are text alternatives for non-text content like images and icons
- Keyboard Navigation: Moving keyboard focus through pages and views in a logical order for users who cannot use a mouse
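Many fixes in these categories are small and mechanical, which is what makes them good candidates for automation. As an illustration only (this is not GitHub's implementation, and real tooling such as axe-core inspects the rendered accessibility tree rather than raw markup), a simplified lint-style check for two of the categories might look like:

```typescript
// Simplified, illustrative checks for two of the categories above.
// Category names and the Finding shape are assumptions for this sketch.

interface Finding {
  category: "text-alternatives" | "control-labels";
  message: string;
}

// Flag <img> tags that lack an alt attribute (missing text alternative).
function checkTextAlternatives(html: string): Finding[] {
  const findings: Finding[] = [];
  for (const match of html.matchAll(/<img\b[^>]*>/gi)) {
    if (!/\balt\s*=/i.test(match[0])) {
      findings.push({
        category: "text-alternatives",
        message: `<img> without alt text: ${match[0]}`,
      });
    }
  }
  return findings;
}

// Flag <button> elements with no visible text and no aria-label,
// i.e. controls with no accessible name for screen readers to announce.
function checkControlLabels(html: string): Finding[] {
  const findings: Finding[] = [];
  for (const match of html.matchAll(/<button\b([^>]*)>([\s\S]*?)<\/button>/gi)) {
    const hasAriaLabel = /\baria-label\s*=/i.test(match[1]);
    const hasText = match[2].replace(/<[^>]*>/g, "").trim().length > 0;
    if (!hasAriaLabel && !hasText) {
      findings.push({
        category: "control-labels",
        message: "Button has no accessible name",
      });
    }
  }
  return findings;
}
```

For example, `checkControlLabels('<button><svg></svg></button>')` reports a finding, while the same button with `aria-label="Close"` passes.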
Each of these issue types represents a barrier that would otherwise prevent people who use assistive technology from using GitHub. Because the agent catches them automatically, they never reach production.
How Did GitHub Build an Agent That Actually Works?
GitHub's approach to building the accessibility agent reveals important lessons about specialized AI agents. The team discovered that a single monolithic agent quickly became inefficient, consuming excessive tokens and producing unreliable output. Instead, they evolved the system to use a sub-agent architecture with two dedicated agents working in parallel.
The first sub-agent acts as a passive reviewer and researcher, identifying accessibility issues in code. The second sub-agent acts as an active implementer, proposing fixes. Critically, these two sub-agents do not directly communicate with each other. Instead, they generate structured, templated output that flows through a parent orchestrating agent, which validates and routes the information. This design choice may seem inefficient, but it solves several real problems:
- Escalation Checkpoints: The reviewer agent checks for areas where human intervention will likely be needed, including multiple high-severity accessibility failures and patterns known to be difficult to fix automatically
- Complexity-Based Behavior: If underlying code is deemed too complicated, the agent operates in guidance-only mode rather than attempting automatic fixes, reducing errors and token waste
- Filtering and Relevance: The parent agent determines what findings are relevant to the request, preventing the implementer from pursuing irrelevant and counter-productive tasks
- Traceability and Audit Trails: Direct communication between sub-agents would remove the ability to create and review an audit trail of user and agent decisions, which is essential given the contextual nature of accessibility work
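The routing described above can be sketched in a few lines. The severity values, thresholds, and field names here are assumptions for illustration, not GitHub's actual schema:

```typescript
// Illustrative sketch of the parent agent's routing logic. The sub-agents
// never talk directly; everything passes through this function, so each
// decision can be logged for the audit trail.

type Severity = "low" | "medium" | "high";

// Structured, templated output emitted by the reviewer sub-agent.
interface ReviewerFinding {
  id: string;
  severity: Severity;
  category: string;
  knownHardToAutofix: boolean; // pattern known to resist automatic fixes
  relevantToRequest: boolean;  // does it touch code the PR actually changed?
}

interface Routing {
  toImplementer: ReviewerFinding[];   // safe to attempt automatic fixes
  escalateToHuman: ReviewerFinding[]; // needs human intervention
  dropped: ReviewerFinding[];         // irrelevant to the request
}

function routeFindings(findings: ReviewerFinding[], codeTooComplex: boolean): Routing {
  const routing: Routing = { toImplementer: [], escalateToHuman: [], dropped: [] };
  // Escalation checkpoint: multiple high-severity failures suggest the
  // change needs a human reviewer rather than automatic fixes.
  const highSeverityCount = findings.filter(f => f.severity === "high").length;

  for (const f of findings) {
    if (!f.relevantToRequest) {
      routing.dropped.push(f); // filtering: don't let the implementer chase noise
    } else if (highSeverityCount > 1 || f.knownHardToAutofix || codeTooComplex) {
      routing.escalateToHuman.push(f); // escalation / guidance-only mode
    } else {
      routing.toImplementer.push(f);
    }
  }
  return routing;
}
```

The indirection costs an extra hop, but it is what makes filtering, escalation, and the audit trail possible in one place.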
This architecture also addresses a fundamental weakness of large language models (LLMs): trained on decades of inaccessible code, they are biased toward reproducing accessibility antipatterns. To counteract that bias, the agent needs better content to draw from.
Why Does GitHub's Existing Accessibility Data Matter So Much?
GitHub had a significant advantage when building this agent: a mature system for logging accessibility issues that predated the explosion in popularity of LLM tooling. The company maintains a structured template for reporting problems, including steps to reproduce the issue, metadata about severity level and applicable accessibility standards, crosslinks to pull requests that addressed issues, and acceptance criteria. All issues are centralized in a single repository.
This highly consistent and structured corpus of content became one of the agent's strongest assets. GitHub instructed the agent to investigate these historical issues and extrapolate related code and language snippets from them. The non-deterministic "fuzzy matching" behavior of LLMs, which can be a liability in other contexts, actually acts as an asset here, allowing the agent to find conceptually similar patterns across the codebase.
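The value of that corpus comes from its consistency. A sketch of the kind of structured record the article describes might look like the following; the field names and the example values are illustrative assumptions, not GitHub's actual template:

```typescript
// Hypothetical shape for one entry in a centralized accessibility issue
// repository, covering the fields the article lists: reproduction steps,
// severity, applicable standards, crosslinked PRs, and acceptance criteria.

interface AccessibilityIssue {
  title: string;
  stepsToReproduce: string[];
  severity: "low" | "medium" | "high" | "critical";
  standards: string[];          // e.g. WCAG success criteria
  remediationPRs: string[];     // crosslinks to PRs that addressed the issue
  acceptanceCriteria: string[];
}

// A fabricated example entry. Because every issue follows the same
// template, an agent can be pointed at the whole corpus and asked to
// surface issues conceptually similar to the code under review.
const example: AccessibilityIssue = {
  title: "Icon-only button in file tree has no accessible name",
  stepsToReproduce: [
    "Open the repository file tree",
    "Tab to the collapse button with a screen reader running",
  ],
  severity: "high",
  standards: ["WCAG 2.1 SC 4.1.2 (Name, Role, Value)"],
  remediationPRs: ["https://github.com/example/repo/pull/1"],
  acceptanceCriteria: ["Button announces a descriptive name to screen readers"],
};
```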
"I enthusiastically recommend investing in manually cataloging and remediating accessibility issues. After some progress, this data can be incorporated into the agent," stated Eric Bailey, who led the project at GitHub.
The implication is clear: organizations that have not already invested in manually identifying and remediating accessibility issues will be at a disadvantage. The European Accessibility Act is now in effect, and Title II of the Americans with Disabilities Act is set to make conformance with WCAG 2.1 AA the legal standard for accessibility compliance in April 2027. LLM agents can read and take action on the accessibility tree, making them increasingly relevant to regulatory compliance.
What Are the Practical Implications for Other Teams?
GitHub's experiment reveals that building effective specialized agents requires more than just pointing an LLM at a problem. Vague instructions like "use accessibility best practices" with a short list of examples will not work well. The agent needs highly contextual examples written using the organization's own conventions. This is why GitHub's investment in structured issue logging became so valuable.
The accessibility agent is not positioned as a "silver bullet" that can automatically address every hypothetical scenario. Understanding and socializing this limitation actually sped up the experiment's launch and led to more buy-in for the effort. Accessibility is a holistic concern that intersects with code, design, copywriting, and numerous other disciplines involved with creating user interfaces. A lot of accessibility work is also highly contextual, meaning someone typically needs the full working picture of a problem before they can give appropriate advice.
As AI agents become more prevalent in software development workflows, GitHub's experience with the accessibility agent offers a template for building specialized agents that work reliably. The key is not to expect the agent to solve the entire problem, but rather to augment human expertise by removing the most straightforward barriers and escalating complex cases for human review.