The 30-Year-Old Shell Trick That Defeats Every Open-Source Coding Agent
A newly discovered vulnerability class called GuardFall exploits a fundamental mismatch between how AI coding agents check commands and how the shell actually executes them, leaving 10 of the 11 most popular open-source agents vulnerable to shell injection attacks. Researchers at Adversa AI found that agents like Aider, Cline, Goose, and OpenCode rely on pattern-matching filters that inspect raw command text, but the bash shell rewrites and expands that text before execution, allowing attackers to slip malicious commands past the guards.
The core problem is architectural, not a simple bug. When an AI agent processes untrusted content like a poisoned README file or a malicious npm package, a prompt injection can craft a command that passes all safety filters but executes destructively once bash processes it. The researchers surveyed 11 of the most-starred open-source coding agents on GitHub, collectively representing roughly 548,000 stars, and tested them against five classes of shell bypass techniques that have been documented in security literature for decades.
What Makes GuardFall So Dangerous?
AI coding agents run shell commands with full account authority, meaning they have access to SSH keys, cloud credentials, and everything stored in a user's home directory. Most agents gate this power behind a guard that matches the command string against a list of dangerous patterns. But the string being inspected is fundamentally different from the command that actually executes. A guard inspects raw text, while bash expands variables, removes quotes, and rewrites the command before running it.
The researchers identified five generic classes of shell bypasses that defeat any pattern-matching guard operating on raw text:
- Quote Removal: The shell strips adjacent quote pairs from a word before processing it. The string r''m looks like two tokens to a regex filter but becomes rm to bash after quote removal, bypassing guards anchored on rm.
- Variable Expansion: Bash splits fields on the Internal Field Separator after parameter expansion. The command rm$IFS-rf$IFS/ looks like one word to a regex but expands to three separate arguments: rm, -rf, and /, defeating filters looking for literal spaces.
- Command Substitution: The first token of a command can be a substitution like $(echo rm) -rf /, which yields rm, -rf, / after evaluation. A regex sees a substitution expression, not the binary name, so it passes the filter.
- Base64 Piping: Three benign commands joined by pipes (echo ... | base64 -d | sh) each pass inspection individually, but the destructive content emerges from their composition.
- Alternative Argument Shapes: Commands like find /x -delete, dd of=/dev/sda, and tar -C / -x achieve the same destructive effects but use different syntax that may evade pattern-based filters.
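The mismatch can be illustrated with a minimal sketch of a raw-text denylist guard. The patterns, the function name naive_guard_allows, and the sample strings are hypothetical and chosen for illustration; they are not taken from any surveyed agent, and the base64 payload is a harmless stand-in (it decodes to "echo hi"). Nothing here is executed by a shell; the code only shows which strings a regex guard would wave through.

```python
import base64
import re

# Hypothetical denylist operating on raw command text (illustrative only).
DENYLIST = [
    re.compile(r"\brm\s+-[a-zA-Z]*r"),        # rm -r, rm -rf, ...
    re.compile(r"\bmkfs\b"),                  # filesystem formatting
    re.compile(r"\bcurl\b.*\|\s*(ba)?sh\b"),  # curl ... | sh
]


def naive_guard_allows(command: str) -> bool:
    """Return True when no denylist pattern matches the raw text."""
    return not any(pattern.search(command) for pattern in DENYLIST)


# Harmless stand-in payload for the base64 class.
encoded = base64.b64encode(b"echo hi").decode()

SAMPLES = {
    "plain": "rm -rf /",
    "quote removal": "r''m -rf /",
    "IFS expansion": "rm$IFS-rf$IFS/",
    "command substitution": "$(echo rm) -rf /",
    "base64 piping": f"echo {encoded} | base64 -d | sh",
    "alternative shape": "find /x -delete",
}

if __name__ == "__main__":
    for label, command in SAMPLES.items():
        verdict = "allowed" if naive_guard_allows(command) else "blocked"
        print(f"{label:22} {verdict:8} {command}")
```

In this sketch only the plain form is blocked. Adding a pattern for each sample would catch these exact strings, but not the next rewrite of them, because the guard still reasons about text that bash has not yet expanded.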
Which Agents Are Most Vulnerable?
The research revealed four distinct failure modes across the surveyed agents. Three agents (Hermes, OpenCode, and Goose) ship a guard that is defeated by the bypass classes. Two agents use a tokenized guard that leaks only on quoted substitution and destructive flags. The remaining agents either ship no static guard at all or rely on a sandbox that defends by default but fails under a documented local or auto opt-out mode.
In live end-to-end penetration tests using Claude Sonnet 4.6 as the underlying language model, the researchers confirmed that agents ingesting operational context like poisoned READMEs or Makefiles could be tricked into emitting injected commands as routine tasks. The chain of exploitation is model- and framing-dependent: whether an injection succeeds varies with the underlying model and with how the malicious request is worded.
Only one agent in the survey, Continue, implements a correctly designed guard, one that closes most of the structural bypass surface in its default IDE mode. That makes Continue the only sound defense for the configuration most developers actually run: an agent on the host machine with real home directory access and a non-disposable workspace.
How to Reduce Risk From AI Coding Agent Vulnerabilities
- Audit Your Agent's Guard: Check whether your coding agent uses a pattern-matching filter on raw command text or a more robust approach like tokenization or sandboxing. Pattern-matching alone is insufficient against shell expansion attacks.
- Isolate Agent Workspaces: Run AI agents in disposable containers or sandboxed environments where the workspace can be thrown away after each session, limiting the blast radius if a command injection occurs.
- Avoid Auto-Mode for Untrusted Input: Keep human-in-the-loop approval enabled when agents process external content like downloaded packages, README files, or configuration files from unknown sources.
- Monitor Shell Command Execution: Log and review the actual commands executed by your agents, not just the text they emit, to catch bypasses that pass filters but execute maliciously.
- Prefer Agents With Structural Defenses: Choose agents that implement tokenized command parsing or sandboxing by default rather than relying solely on denylist pattern matching.
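As a rough illustration of what a structural check might look like, the sketch below tokenizes the command and fails closed. It is an assumption-laden example, not the design of Continue or any other surveyed agent: the SHELL_ACTIVE character set, the ALLOWED_BINARIES allowlist, and the review function are all hypothetical. It relies on Python's standard shlex.split, which performs POSIX-style quote removal, and it hands anything containing expansion or composition syntax to a human instead of trying to model bash.

```python
import shlex

# Characters that let the shell rewrite or compose a command before running it.
# Hypothetical and deliberately broad: anything containing them goes to a human.
SHELL_ACTIVE = set("$`|;&<>(){}*?[]~!\\\n")

# Hypothetical allowlist of binaries the agent may run without approval.
ALLOWED_BINARIES = {"ls", "cat", "git", "pytest"}


def review(command: str) -> str:
    """Return 'allow' for simple allowlisted commands, otherwise 'ask'."""
    if any(ch in SHELL_ACTIVE for ch in command):
        return "ask"  # expansion, substitution, pipes, globs: do not guess
    try:
        argv = shlex.split(command)  # POSIX-style quote removal
    except ValueError:
        return "ask"  # unbalanced quotes and similar parse failures
    if not argv or argv[0] not in ALLOWED_BINARIES:
        return "ask"
    return "allow"


if __name__ == "__main__":
    for command in [
        "git status",
        "r''m -rf /",
        "rm$IFS-rf$IFS/",
        "$(echo rm) -rf /",
        "find /x -delete",
    ]:
        print(f"{review(command):6} {command}")
```

In this sketch every bypass sample from the list of classes is routed to approval and only git status is allowed. Two caveats apply. First, an allowed command should then be executed from the parsed argument list without a shell, so that the text checked is the text run. Second, a binary allowlist alone does not vet arguments, so it complements sandboxing and human approval and does not replace them.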
The researchers emphasized that adding more patterns to a denylist does not solve the underlying problem. The vulnerability is not a bug in any single agent but a dangerous convention across the category. A filter that string-matches raw commands cannot model bash's expansion behavior, so it provides confidence without actual protection. That false confidence often leads teams to disable human approval and switch on auto-mode, making the vulnerability exploitable at scale.
The research began with a finding in the NousResearch/hermes-agent project, where shell rewrites bypassed an approval gate built on a 30-pattern regex denylist. That finding prompted a broader survey of the most popular open-source coding and computer-use agents as of May 2026, ranked by GitHub star count and community activity. Besides hermes-agent, the set included sst/opencode, block/goose, cline/cline, RooCodeInc/Roo-Code, continuedev/continue, Aider-AI/aider, plandex-ai/plandex, OpenInterpreter/open-interpreter, All-Hands-AI/OpenHands, and SWE-agent/SWE-agent.
For developers using these agents in production environments, the findings suggest that relying on the model itself to refuse malicious prompts is insufficient. While frontier language models often refuse direct malicious instructions, they can be tricked into emitting injected commands when the request is framed as a routine operational task and embedded in legitimate-looking context. The vulnerability chain depends on both the model's behavior and how the request is framed, making it difficult to defend against through model training alone.