Self-Hosted AI Tools Like Ollama Face New Security Threats at Pwn2Own 2026
Self-hosted AI tools like Ollama, which let developers run large language models locally on their own hardware, are increasingly exposed to serious security threats. At Pwn2Own Berlin 2026, a major hacking competition held in May, security researchers discovered multiple vulnerabilities in popular local AI inference platforms, demonstrating that the gap between what these tools promise and what they can withstand remains dangerously wide.
Pwn2Own, organized by Trend Micro's Zero Day Initiative, invites security researchers to compete in finding vulnerabilities in widely used software. This was the second consecutive year the competition included AI-specific categories. Researchers competed across 13 possible targets and collectively earned just under $1.3 million in prize money, with the most lucrative categories reserved for Nvidia infrastructure and local inference tools like Ollama.
What Makes Ollama and Similar Tools Vulnerable?
Ollama allows users to self-host many AI models, including large language models such as Google's Gemma 4 or embeddings from Nomic, as long as the host has sufficient GPU and memory resources. The problem is that Ollama instances are frequently exposed on the internet, making them attractive targets for threat actors. At Pwn2Own, the Out Of Bounds team discovered two bugs in Ollama, one of which had not yet been patched. While many exposed Ollama instances can already be tampered with or used for inference, one of the discovered bugs would have allowed attackers to access the underlying host system itself, not just the model.
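As a rough illustration of the exposure problem, the sketch below classifies an OLLAMA_HOST-style bind value as loopback-only or potentially reachable from other machines. It assumes Ollama's documented default of 127.0.0.1:11434 and the OLLAMA_HOST environment variable; the helper names `bind_host` and `is_loopback_only` are hypothetical, and the parsing is a conservative approximation rather than Ollama's own logic.

```python
import ipaddress
from urllib.parse import urlsplit

# Assumption: Ollama binds to 127.0.0.1 when OLLAMA_HOST is unset.
DEFAULT_HOST = "127.0.0.1"


def bind_host(value: str) -> str:
    """Extract the host part from an OLLAMA_HOST-style value (hypothetical helper)."""
    value = value.strip()
    if not value:
        return DEFAULT_HOST
    # urlsplit only recognises a host when a '//' netloc marker is present.
    if "//" not in value:
        value = "//" + value
    # An empty host (for example ":11434") is returned as "" and treated as unsafe.
    return urlsplit(value).hostname or ""


def is_loopback_only(value: str) -> bool:
    """Return True only when the bind address is clearly limited to this machine."""
    host = bind_host(value)
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Unknown hostname or empty host: assume it may be reachable externally.
        return False


if __name__ == "__main__":
    for candidate in ["", "127.0.0.1:11434", "0.0.0.0", "http://0.0.0.0:11434"]:
        label = "loopback only" if is_loopback_only(candidate) else "possibly exposed"
        print(f"{candidate!r}: {label}")
```

A loopback-only result does not prove the service is unreachable, since a reverse proxy or port forward can still publish it. The check only flags the most common misconfiguration.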
The security risks extend beyond Ollama. Nvidia's Container Toolkit, which enables Docker and Kubernetes containers to access Nvidia's GPUs for running high-performance tasks like large language model inference, also faced successful exploits. Two teams, Chompie and PWN2DACA, found vulnerabilities that could potentially grant access to the container itself or, worse, the host system. While attackers would typically need some existing access to the container environment to execute such attacks, chaining multiple exploits together is not uncommon in real-world scenarios.
How Are Security Researchers Using AI to Find These Flaws?
An interesting twist in this year's competition is that AI itself became both the hunter and the hunted. Every team used large language models in some part of their vulnerability discovery workflow. However, researchers reported high false positive rates during the discovery phase, much as with traditional security research methods. What mattered most in the competition was the speed of the AI tools, not their accuracy.
Agentic coding systems, which are AI tools that can write and modify code autonomously, were also targeted. Researchers found vulnerabilities in Anthropic's Claude Code, OpenAI's Codex, and Cursor. The bugs discovered across these three platforms stemmed from similar root causes: overpowered underlying developer tools and misplaced trust between agents and users. In some cases, the agents ask users to accept risks that the users may not be able to evaluate correctly.
Key Factors in the Security Landscape for Local AI Tools
- Exposure Risk: Tools like Ollama and ChromaDB are widely exposed on the internet, making them attractive targets for attackers seeking to compromise AI infrastructure or gain access to underlying host systems.
- Root Cause Analysis: Many vulnerabilities in AI coding agents and inference tools trace back to overpowered developer frameworks and misguided trust models between autonomous agents and their users.
- Supply Chain Concerns: Similar code spreading across unrelated projects, combined with abusable developer tools and ongoing supply chain attacks, means the attack surface will continue to grow as software development accelerates.
The competition revealed a troubling pattern: as AI tools become more powerful and widely adopted, the security foundations underneath remain fragile. Tools designed for convenience and ease of use often sacrifice security considerations. LM Studio, for example, is similar to Ollama in hosting AI models and embeddings, but it features a more intuitive user interface and is typically kept local rather than exposed to the internet. However, because it is built as an Electron-based GUI application, it inherits many security issues that Electron applications already face. Security researchers found a variety of bugs in LM Studio during the competition.
"Tools like Ollama and ChromaDB are widely exposed on the internet, and successful exploits against them or against the Nvidia Container Toolkit could grant access to the underlying host, not just the model," noted Morton Swimmer, a security researcher at Trend Micro and the author of the company's analysis of Pwn2Own Berlin 2026.
Looking ahead, the security landscape for local AI tools is expected to become even more complex. Researchers identified emerging risks including "vibe coding," the practice of leaning on AI to generate code, which spreads similar code across unrelated projects, and expanding supply chain vulnerabilities. As software development and bug discovery accelerate together, the attack surface for AI systems will only grow larger and messier.
For organizations and developers using self-hosted AI tools like Ollama, the takeaway is clear: local AI infrastructure requires the same security rigor as any internet-facing system. Keeping instances patched, monitoring for exposure, and understanding the security implications of your AI stack are no longer optional but essential practices.
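Monitoring for exposure can start with something as simple as asking your own endpoint what an unauthenticated client sees. The sketch below is a minimal example, assuming Ollama's `GET /api/tags` model-listing endpoint and a JSON response with a `models` array; the function name `list_exposed_models` and the injectable `fetch` parameter are hypothetical, and it should only be pointed at hosts you own or are authorized to test.

```python
import json
from urllib.error import URLError
from urllib.request import urlopen


def list_exposed_models(base_url, timeout=3.0, fetch=urlopen):
    """Return model names visible without authentication, or None if unreachable.

    `fetch` defaults to urllib's urlopen and can be swapped out for testing.
    """
    url = base_url.rstrip("/") + "/api/tags"
    try:
        with fetch(url, timeout=timeout) as resp:
            payload = json.load(resp)
    except (URLError, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON body: treat as not exposed.
        return None
    if not isinstance(payload, dict):
        return None
    models = payload.get("models") or []
    return [m.get("name", "?") for m in models if isinstance(m, dict)]


if __name__ == "__main__":
    # Assumption: replace with the public address of an instance you operate.
    names = list_exposed_models("http://127.0.0.1:11434")
    if names is None:
        print("No unauthenticated response from /api/tags")
    else:
        print(f"Endpoint answered without authentication; models visible: {names}")
```

Run from a machine outside your network, any non-None result means the instance answers anonymous requests and should be moved behind a firewall, VPN, or authenticating proxy.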