In earlier chapters, you saw the components of a Hermes agent: profiles, skills, memory, toolsets, gateway, and cron. But what actually happens when you send a message? How does a prompt become an action, and how does that action become a result you can use?
The answer is the agent loop: assemble context, call the model, pick a tool, execute it, feed the result back, and repeat. A chatbot generates text and stops. An agent generates text, acts on it, learns from the result, and keeps going until the task is done.
Understanding this loop matters because it determines what your agent can do, where things go wrong, and where you need to step in.
When you send a message to a Hermes agent — through the terminal, Slack, Telegram, or any other channel — this cycle begins:
A simple question completes in a single pass. A complex task might loop several times. In your SEO workflow, the research specialist runs this loop: search → filter → save → repeat until done.
When the model needs information it does not have — or needs to act beyond text — it returns a structured tool call instead of a plain response. Tools are organized into . You control which toolsets each profile can access — a research specialist gets web search and file writing; an SEO manager gets file reading and writing but not shell execution.
The model never executes tools directly — the Hermes runtime acts as the intermediary. It checks whether the requested tool is available to the current profile, passes the parameters, runs the tool, and returns the output. This separation is what makes approvals and security checks possible.
Not every tool call runs immediately. Dangerous actions — shell commands, file deletion, network requests to unfamiliar endpoints — go through the . Hermes has three modes:
The agent pauses and asks you to approve or deny. If you do not respond within 60 seconds, the action is denied. Start here.
An auxiliary LLM evaluates the command and decides. Fewer interruptions, but a second model call per approval — costs more and can occasionally allow a risky action a human would catch.
All dangerous commands run without review. A hardline blocklist still prevents the most destructive actions (like rm -rf /). Only use this if you fully trust the agent's tool access and context.
Approvals are part of a seven-layer defense-in-depth model — user authorization, dangerous command approval, container isolation, credential filtering, context file scanning, cross-session isolation, and input sanitization. You control how much autonomy the agent has; the default is the safest setting.
Every turn adds to the conversation history the model sees. As the session grows, history consumes more of the context window — the maximum text the model can process in one call. When the conversation exceeds a threshold, Hermes compresses earlier turns into a compact summary, preserving key decisions and results while discarding the full text. This is not memory (which persists across sessions in files like MEMORY.md). Session history exists within a single session and is compressed to stay within the model's context limits. When the session ends, the next session starts fresh — with memory and skills loaded, not the full prior conversation.
When a tool call fails — no search results, file not found, API error — the error message is injected into the conversation like a successful result. The model sees the error, adjusts its approach, and either retries or takes a different path. The loop keeps running. The agent has a maximum turns-per-session limit to prevent infinite retry loops, and if the model provider itself goes down, the loop cannot continue at all.
Error handling is built into the loop. The model does not crash when a tool fails — it reads the error, adds it to context, and decides what to do next. This is why an agent loop is more resilient than a single-shot prompt: it can recover from individual tool failures without starting over.
Understanding the agent loop changes how you design your agent team. The practical implications:
The agent loop is the engine. Your profile design, skill documents, tool access, and approval settings are the steering. The engine runs on its own — but you decide where it goes, what it can reach, and when it needs to stop and ask for directions.
Your research specialist runs a keyword search, gets 50 results, and calls the file writing tool to save all 50. The next step is drafting content — but you want to review the keywords first. Where in the agent loop would you insert a judgment gate, and how would you enforce it?
A terminal session with the research specialist:
$ hermes --profile research-specialist
You: Find keyword ideas for a blog post about AI agent teams
[Tool: web_search("AI agent teams keywords SEO")]
[Result: 10 results returned]
[Tool: web_search("best AI agent frameworks")]
[Result: 10 results returned]
Agent: Found 22 keywords. Saved top 20 to keywords.md.
Top 5:
1. "AI agent team setup" — high relevance
2. "multi-agent systems for SEO" — low competition
3. "how to build an AI agent team" — high volume
4. "AI agent workflow automation" — commercial intent
5. "agent teams vs single agent" — comparison angle
Waiting for your review before the SEO manager drafts.The loop ran multiple tool calls, injected results back, and finished with a summary — pausing at the judgment gate.