Chapter 4 covered what runs underneath — Python, the ~/.hermes directory, configuration files, model providers, and the tool registry. You know the engine. Now it is time to look at the steering wheel, the communication channels, and the dashboard.
Every software system has a frontend (what you see and touch) and a backend (what does the work). Hermes is no different. The difference is that its "frontend" is not a single web page — it is a collection of interfaces you choose from, and its backend can run continuously or only when you need it.
This chapter maps every way you can interact with a Hermes agent, explains which services run all the time versus on demand, and clarifies where your data lives.
When you type hermes in your terminal, you enter a terminal-based chat interface, the TUI. This is the primary way to interact with your agent — the equivalent of opening a ChatGPT window, but running entirely inside your terminal.
The TUI shows your conversation with the agent in real time. You type a message, the agent processes it, calls any tools it needs, and responds — all visible in the same terminal window. Tool calls, file edits, and web searches appear inline, so you can see what the agent is doing as it works.
Why is the terminal the primary interface instead of a web page? Because Hermes is a local-first tool. It runs on your machine, operates on your files, and executes commands in your environment. The terminal is the natural home for that kind of work — it is already where you manage files, run scripts, and control your system. A browser window would add a layer of distance between you and the work the agent is doing.
The CLI also gives you direct access to management commands:
- Start a chat session with your agent
- Run the setup wizard to configure model, provider, and tools
- Choose or switch your AI model and provider
- View and configure which toolsets are available
- Start the messaging gateway for platform access
- Diagnose problems with your setup
- Update to the latest version
Everything you need to configure, run, and troubleshoot your agent is available from the terminal. You never have to open a browser or visit a dashboard to get work done.
Not every interaction with your agent happens at the terminal. You might want to send a quick message from your phone via Telegram, review a report in Slack, or approve a pending action from Discord. Hermes handles this through its gateway — a single background process that connects your agent to more than twenty messaging platforms simultaneously.
The gateway is not a separate install or a different agent. It is the same agent, made reachable through the channels you already use. When you send a message on Telegram, the gateway receives it, routes it to your agent, and delivers the response back to Telegram — as if you were chatting with a person on the other end.
This matters because it removes a common limitation of AI tools: being stuck in one interface. With the gateway, you choose how you interact based on context. Quick questions go through your phone. Detailed work happens in the terminal. Team-facing updates appear in a shared Slack channel. Same agent, different entry points.
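The routing described above can be sketched in a few lines. This is a conceptual illustration, not Hermes's actual source code — every class and method name here is hypothetical — but it captures the key property: one agent, many platform adapters.

```python
# Conceptual sketch of the gateway's role (all names hypothetical):
# a single agent loop, reachable through any platform adapter.

class Agent:
    """Stands in for the one agent loop every channel shares."""
    def handle(self, text: str) -> str:
        return f"agent response to: {text}"

class Gateway:
    """Routes inbound messages from any platform to the same agent."""
    def __init__(self, agent: Agent):
        self.agent = agent

    def on_message(self, platform: str, text: str) -> str:
        # The platform only determines where the reply is delivered,
        # not which agent answers.
        return self.agent.handle(text)

gateway = Gateway(Agent())
print(gateway.on_message("telegram", "status?"))
print(gateway.on_message("slack", "status?"))
```

Both calls produce the same answer because both channels feed the same agent — which is exactly why there is no separate "Telegram agent."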
The gateway also supports voice memos on platforms that allow them. You can speak a question, and the agent transcribes it, processes the request, and responds in text. This is useful when you are away from a keyboard and want to check on a scheduled job or approve a pending action.
For service management, the gateway runs as a background process. On Linux, it uses systemd. On macOS, it uses launchd. Both configurations are provided automatically — you do not need to write service files yourself.
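For a sense of what that generated configuration looks like, here is a hypothetical systemd user unit for the gateway. Every value is illustrative — Hermes writes the real file for you, and the actual unit name, paths, and start command may differ.

```ini
# Hypothetical systemd user unit for the Hermes gateway.
# Illustrative only: Hermes generates the real service file.
[Unit]
Description=Hermes messaging gateway
After=network-online.target

[Service]
# Placeholder: substitute the actual gateway start command.
ExecStart=/path/to/hermes-gateway-start-command
Restart=on-failure

[Install]
WantedBy=default.target
```

The `Restart=on-failure` line is what makes this a true background service: if the gateway crashes, systemd brings it back without your intervention.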
Hermes includes an opt-in OpenAI-compatible API server. When enabled, it runs on port 8642 and accepts the same request format that OpenAI uses. This means any web frontend designed for ChatGPT — like Open WebUI — can connect to your Hermes agent instead.
This is how you get a web dashboard. You enable the API server in your configuration, start a compatible frontend like Open WebUI, and point it at http://localhost:8642/v1. The frontend renders a chat interface in your browser, and your Hermes agent handles the requests behind the scenes.
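Because the server speaks the OpenAI request format, any OpenAI-style client can talk to it. The sketch below builds such a request in plain Python; the port comes from the chapter, but the model identifier and the assumption that no API key is needed are illustrative.

```python
# Sketch: an OpenAI-style chat completion request aimed at a local
# Hermes API server. Model name and auth-free setup are assumptions.
import json
import urllib.request

BASE_URL = "http://localhost:8642/v1"

def build_chat_request(prompt: str) -> urllib.request.Request:
    payload = {
        "model": "hermes",  # placeholder model identifier
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_chat_request("Summarize today's scheduled jobs.")
# urllib.request.urlopen(req) would send it; equally, any
# OpenAI-compatible client library can be pointed at BASE_URL.
print(req.full_url)
```

Pointing Open WebUI at the same base URL does exactly this under the hood, which is why no Hermes-specific frontend code is required.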
Why is this optional? Because the terminal and messaging platforms cover most interaction patterns. A web dashboard adds value when you prefer a browser-based interface or when you want a team to share a single visual workspace. But for solo use, the CLI is typically sufficient.
Not every part of Hermes runs all the time. There are two operation modes, and understanding the difference is important for planning your setup.
In on-demand mode, you start a session, give the agent a task, and interact until the work is done. When you close the terminal or stop chatting, the agent stops too. This is how CLI sessions work by default. The agent is idle until you talk to it — like calling someone on the phone.
In continuous mode, services run in the background whether or not you are actively chatting. The gateway listens for incoming messages. The cron scheduler fires jobs at scheduled times. The API server accepts web requests. These services start when you launch them and keep running until you stop them — like a receptionist who stays at the desk all day.
The practical implication: if you only use the CLI, your agent runs when you run it. If you want messaging platform access, scheduled jobs, or a web dashboard, those services need to run continuously. On a laptop, that means the gateway and cron run while the machine is awake. On a VPS or server, they run all the time — which is why a VPS becomes the natural choice for continuous operation. Chapter 7 covers hosting decisions in detail.
Neither mode is better. On-demand is simpler and costs nothing in idle resources. Continuous is necessary for unattended work and multi-channel access. Most setups use both — on-demand for hands-on work, continuous for background services.
In Chapter 4, you saw that everything Hermes stores — configuration, sessions, skills, memory, cron jobs, logs — lives in the ~/.hermes directory on your machine. This is local storage. Your data stays on your hardware, under your control.
But some data flows to and from remote services:
Every prompt you send and every response you receive passes through your chosen model provider (OpenRouter, OpenAI, Anthropic, etc.). The provider processes your request on their infrastructure. Your conversation content travels to their servers for the duration of the API call.
If you configure an external memory service like Honcho or Mem0, your agent stores and retrieves persistent facts from that service. The memory data lives on the provider's infrastructure instead of (or in addition to) local MEMORY.md files.
When you chat through Telegram, Slack, or any messaging platform, those platforms handle and store the messages according to their own policies. Hermes sends and receives messages through their APIs, but the platform retains its own record of the conversation.
The key distinction: your agent state (skills, memory, configuration, session history) is always local. Remote services process or relay data, but they do not own your agent's identity. If you disconnect from a model provider and switch to another, your agent's skills and memory stay intact. If you stop using a messaging platform, your session history remains in ~/.hermes.
Putting it together: your Hermes agent has one brain (the agent loop that calls the model, uses tools, and manages memory) and multiple ways to reach that brain. The CLI is the direct line — always available, no setup beyond the base install. The gateway adds reach — your agent becomes accessible from the messaging platforms your team already uses. The API server adds a visual layer — a web dashboard for those who prefer it.
All three interfaces talk to the same agent. There is no separate "Telegram agent" and "CLI agent." One agent, multiple front doors. The messages you send on any channel feed into the same session history, the same memory, and the same skills.
[Figure: A diagram showing the agent loop at the center, with CLI, gateway platforms, and API server as input channels, and ~/.hermes as the local data store. Model providers and external memory shown as remote services the agent calls out to.]
The continuous services — gateway and cron — are the reason many people eventually move from a laptop to a VPS. On a laptop, closing the lid stops the gateway. On a server, it runs around the clock. But that is a hosting decision, not an architecture one. The system works the same way regardless of where it runs.
You want your agent to receive questions from your team on Slack and also run a weekly report every Monday at 9 AM. Which services need to run continuously, and which only need to run on demand?