WebBrain Brings Local AI Browser Automation to Chrome and Firefox, No Subscription Required
WebBrain is a free, MIT-licensed browser extension for Chrome and Firefox that automates web tasks using local AI models like Ollama, ensuring no page data leaves your machine and requiring no subscription. Built by Emre Sokullu, the tool bridges the gap between cloud-based AI assistants and truly private, self-hosted alternatives by supporting local models including Ollama, llama.cpp, and LM Studio alongside cloud APIs.
What Makes WebBrain Different From Other Browser AI Tools?
Most browser AI plugins require cloud connectivity and send your page data to remote servers. WebBrain flips that model. When you run it against a local model, nothing leaves your machine. The extension lives in your browser's side panel and operates within your existing authenticated session, meaning it sees your logged-in accounts exactly as you do while storing no data externally and adding no telemetry or tracking accounts.
The tool operates in two distinct modes. Ask mode is read-only and cannot change pages, making it safe for research and data extraction. Act mode can click, type, scroll, navigate, and run multi-step workflows by using the Chrome DevTools Protocol, which produces trusted input events that modern websites actually honor. This approach also reaches cross-origin iframes and shadow DOM elements that ordinary browser scripts cannot access.
How to Set Up WebBrain With Local Models?
- Ollama Setup: Set the environment variable OLLAMA_ORIGINS="*" and run ollama serve, then configure the base URL to http://localhost:11434/v1 in WebBrain settings for OpenAI-compatible access.
- llama.cpp Configuration: Load a model with at least a 16,000-token context window using llama-server with your model file and specify port 8080 for the extension to connect.
- Cross-Machine Servers: For vLLM or other remote servers, enable CORS with the appropriate allowed-origins flag to allow WebBrain to communicate securely across your network.
The recommended model is Qwen 3.6 35B, which outperformed Gemma 4 on the project's screenshot benchmark. An RTX 5090 graphics card is ideal; an RTX 4090 works with INT4 AutoRound quantization for reduced memory requirements.
What Real-World Tasks Can WebBrain Automate?
The extension handles several practical workflows. Data extraction lets you open a product catalog and ask the agent to extract all product names and prices, returning structured rows. Research summaries allow you to ask WebBrain to summarize articles and answer follow-up questions, with the tool intelligently detecting paywalls and dismissing cookie-consent banners before reading. Form filling suits repetitive signups using an optional Profile auto-fill that stores a short biography in local plaintext, which the LLM uses to complete low-stakes forms. Multi-step automation spans complex tasks like navigating to GitHub and finding trending repositories by chaining navigation, reading, and clicking actions.
Security is baked into the design. WebBrain starts in read-only Ask mode and asks for approval before consequential actions, though you can disable these prompts in settings. For anything that creates, sends, submits, or buys, the tool uses visible UI elements rather than calling REST or GraphQL endpoints directly, preventing hidden API calls that could bypass your intent.
How Does WebBrain Control Costs When Using Cloud Models?
Cloud tokens add up quickly on long sessions, so WebBrain bounds costs in three ways. Screenshots are resized and iteratively compressed as JPEG files before leaving your machine, keeping image tokens small. Conversation history and tool outputs are trimmed oldest-first as the context window fills. You can also pair a cheap text model for planning with a separate vision model for screenshots, optimizing cost per task.
For pricing, WebBrain offers a free forever option when running entirely on local models with no API costs. The managed WebBrain Cloud option costs five dollars per month per device profile under a fair-use policy, eliminating the need for local setup. For local use, llama.cpp requires no API key at all.
How Does WebBrain Compare to Competing Browser AI Tools?
WebBrain sits between lightweight browser plugins and full developer frameworks. Unlike Claude in Chrome, which requires a Claude Pro subscription and only works with Anthropic's models, WebBrain is free forever and supports any OpenAI-compatible endpoint. It also offers full offline capability with local models, whereas Claude in Chrome requires cloud connectivity. WebBrain supports multiple providers including OpenAI, Anthropic Claude, Gemini, Mistral, DeepSeek, xAI Grok, Groq, MiniMax, Alibaba Cloud Qwen, Nvidia NIM, and OpenRouter, giving users flexibility across the AI ecosystem.
The extension is available on the Chrome Web Store, Firefox Add-ons, and GitHub. It ships in English, Spanish, French, Turkish, and Chinese, with auto-detection of your browser language on first launch.
WebBrain represents a practical answer to the privacy-versus-capability tradeoff that has defined consumer AI tools. By supporting both local and cloud models through a single interface, it lets users choose their own balance between data privacy and AI performance, all without requiring technical expertise in model deployment or API management.
Disclosure: The WebBrain team supported this article's development and promotion.