Google Antigravity's Critical Security Flaw Exposed How AI Agents Can Be Hijacked Through Hidden Code
Google discovered and patched a critical security vulnerability in its Antigravity IDE that could allow attackers to execute arbitrary code on a developer's machine by exploiting how the AI agent processes file searches. The flaw, identified by Pillar Security researcher Dan Lisichkin, combined Antigravity's file-creation capabilities with insufficient input validation in the find_by_name tool to bypass Strict Mode, a security configuration that limits network access, prevents unauthorized file writes, and ensures all commands run in a sandboxed environment.
How Did the Antigravity Vulnerability Work?
The attack exploited a fundamental trust assumption in how Antigravity processes tool calls. When a developer or AI agent used the find_by_name tool to search for files, the system passed the search pattern directly to the underlying fd command without strict validation. An attacker could inject the -X (exec-batch) flag through the Pattern parameter, forcing fd to execute arbitrary binaries against workspace files.
The critical insight was timing: the find_by_name tool executed before Strict Mode constraints were enforced, meaning it was interpreted as a native tool invocation rather than a restricted operation. This created a complete attack chain that required no additional user interaction once the prompt injection landed.
"By injecting the -X (exec-batch) flag through the Pattern parameter in the find_by_name tool, an attacker can force fd to execute arbitrary binaries against workspace files. Combined with Antigravity's ability to create files as a permitted action, this enables a full attack chain: stage a malicious script, then trigger it through a seemingly legitimate search, all without additional user interaction once the prompt injection lands," explained Dan Lisichkin, researcher at Pillar Security.
Dan Lisichkin, Researcher at Pillar Security
The attack could be initiated in two ways. A direct approach required compromising a user's account, but the more dangerous method involved indirect prompt injection. An unsuspecting developer could pull a seemingly harmless file from an untrusted source that contained hidden attacker-controlled comments instructing the AI agent to stage and trigger the exploit.
Why Does This Matter for AI Agent Security?
Antigravity represents a new category of developer tools: AI-powered IDEs where autonomous agents can operate through the code editor, terminal, or browser. Google launched Antigravity in November 2025 as a public preview, positioning it as a way to "push the frontiers of how the model and the IDE can work together," according to Koray Kavukcuoglu, Chief AI Architect at Google. The IDE can build applications from a single prompt, create subtasks, execute them, and generate progress reports, all with minimal human oversight.
This autonomy creates a new security surface. Unlike traditional IDEs where a human developer reviews code before execution, Antigravity agents follow instructions from external content without human verification. The vulnerability exposed a critical gap in how these tools validate inputs before passing them to system commands.
- Autonomous Execution Risk: AI agents in Antigravity can execute commands without human review, making input validation failures catastrophic rather than merely inconvenient.
- Supply Chain Attack Vector: Attackers can embed malicious instructions in seemingly legitimate files, repositories, or comments that developers pull into their workspace.
- Sandbox Bypass: The vulnerability bypassed Strict Mode, the primary security mechanism designed to contain agent operations and prevent unauthorized system access.
Google addressed the vulnerability on February 28, 2026, following responsible disclosure on January 7, 2026. However, the incident revealed a broader pattern affecting multiple AI coding tools. Researchers have discovered similar vulnerabilities across Anthropic's Claude Code, GitHub Copilot Agent, and Cursor, each exploiting the same fundamental weakness: insufficient validation of untrusted input before tool execution.
Steps to Secure AI Agent Deployments
- Input Sanitization: Validate all parameters passed to system tools before execution, treating external input as untrusted even when it appears to come from legitimate sources.
- Execution Timing: Enforce security constraints before tool invocation, not after, to prevent tools from executing in a privileged context before restrictions apply.
- Untrusted Source Handling: Implement warnings or restrictions when agents process files from external repositories or untrusted sources, similar to how browsers handle downloads.
The broader implication extends beyond Antigravity. Lisichkin emphasized that "tools designed for constrained operations become attack vectors when their inputs are not strictly validated. The trust model underpinning security assumptions, that a human will catch something suspicious, does not hold when autonomous agents follow instructions from external content".
Lisichkin
Antigravity's May 2026 updates focused on expanding agent capabilities, including deeper integration with AgentKit 2.0, support for the A2A (Agent-to-Agent) protocol, and improved local LLM connectivity. These features make agents more powerful and interoperable, but they also increase the attack surface if security assumptions break down. The vulnerability patch demonstrates that as AI agents gain more autonomy, security validation must become proportionally more rigorous.
For developers using Antigravity or similar agentic IDEs, the lesson is clear: treat AI agents like you would treat any automated system with elevated privileges. Review code generated by agents before committing it, be cautious about pulling files from untrusted sources, and keep tools updated as security patches become available.