Nous Research Unlocks Non-Blocking AI Agents: Hermes Now Runs Subagents in the Background
Nous Research has shipped a major update to Hermes Agent that fundamentally changes how AI agents delegate work. The open-source personal agent now supports asynchronous subagents, meaning parent agents can spawn background tasks without blocking the main conversation. Previously, delegating work to child agents froze the entire chat until completion.
What Changed in Hermes Agent's Delegation System?
Until now, Hermes Agent's delegate_task tool worked synchronously. When a parent agent spawned a subagent to handle a task, the parent would wait inside the tool call until every child agent finished. Your chat stayed frozen during that wait. The new asynchronous delegation toolset, announced on X by Nous Research co-founder Teknium, eliminates this bottleneck entirely.
The update introduces a suite of new tools that give users real-time control over background work. Instead of blocking, the parent agent now receives a task ID immediately and can continue the conversation. Users can check status, inject new instructions, collect results, or cancel tasks without interrupting the main chat flow.
How to Use Asynchronous Subagents in Hermes Agent
- Spawn Background Work: Use delegate_task_async to spawn a background agent and receive a task ID immediately, keeping your chat free to continue other work.
- Monitor Progress: Run check_task to get non-blocking status updates plus recent output from any running task without pausing the parent conversation.
- Steer Mid-Flight: Use steer_task to inject a message into a running task, allowing you to adjust instructions or provide feedback while the agent is still working.
- Collect Results: Execute collect_task to block until a specific task completes, then retrieve the full result when you're ready to use it.
- Cancel or List: Run cancel_task to stop a running task, or list_tasks to see all async tasks in your current session.
Existing Hermes users can enable the feature by running hermes update. The background agents run as in-process threads, reusing the same AI Agent machinery, credentials, and toolsets as the original synchronous delegation system.
Why Does Non-Blocking Delegation Matter for AI Workflows?
The synchronous design prevented several practical workflows. You could not start a long-running agent task and keep working on something else. You could not check in on a run mid-flight or steer it toward a better outcome. You were locked in place until the child agents finished. The asynchronous approach unlocks new use cases where agents can run multiple parallel tasks while maintaining an active conversation with the user.
Each subagent starts with a completely fresh conversation, with no knowledge of the parent's history. The parent must pass everything through goal and context fields. This isolation keeps the parent's context window small, a critical constraint in large language models where every token costs compute and memory. Only the final summary returns to the parent; the child's intermediate tool calls and reasoning stay hidden.
Subagents inherit the parent's API key, provider configuration, and credential pool. That credential pool enables key rotation on rate limits, and users can route subagents to a cheaper model through configuration. This flexibility makes it possible to optimize cost and latency for different types of work.
What's Next for Hermes Agent Development?
The asynchronous delegation feature addresses a long-standing limitation in agent architecture. The implementation was built in the open through issue #5586, showing Nous Research's commitment to transparent development. While the current async subagents are single-session and do not persist across conversation turns, the roadmap includes a feature called ACP (#4949) that targets cross-turn durability, allowing tasks to survive across multiple chat sessions.
This update positions Hermes Agent as a more practical tool for real-world workflows where users need to juggle multiple long-running tasks. The ability to spawn background work, monitor it, adjust it, and collect results without freezing the main conversation mirrors how human teams delegate work in practice. As AI agents become more capable and more widely deployed, the ability to run multiple tasks in parallel while maintaining user control becomes increasingly important for productivity and user experience.