Chapter 5

Frontend, Backend, and Interfaces

Chapter 4 covered the engine — Python, the ~/.hermes directory, configuration, model providers, and the tool registry. Now we look at the steering wheel: how you actually interact with a running agent, which services stay on vs start on demand, and where your data lives.

Every system has a and a . Hermes just has more than one frontend to choose from, and its backend can run continuously or only when you need it.

Primary Interface

/: the primary interface

Type hermes and you enter a terminal-based chat — the primary way to interact with your agent. You type a message, the agent processes it, calls tools, and responds, all visible inline. Tool calls appear as collapsible blocks so you can follow what the agent is doing without drowning in detail.

The terminal is the natural home because Hermes is local-first: it runs on your machine, operates on your files, and executes commands in your environment. A browser would add distance between you and the work. You never need to leave the terminal to configure, run, or troubleshoot your agent.

Management commands are all reachable from the same place:

hermesStart a chat session
hermes setupConfigure model, provider, and tools
hermes modelChoose or switch AI model
hermes toolsView and configure toolsets
hermes gatewayStart the messaging gateway
hermes doctorDiagnose setup problems
hermes updateUpdate to the latest version

Communication

The

Not every interaction happens at the terminal. The gateway is a single background process that makes your agent reachable from 20+ platforms including Telegram, Slack, Discord, WhatsApp, Signal, and Email — full list in the Hermes docs. It is not a separate agent; it is the same agent, exposed through channels you already use. Quick questions from your phone, team-facing updates in a shared Slack channel, detailed work in the terminal — same agent, different entry points.

The gateway also handles voice memos on platforms that support them: speak a question, the agent transcribes, processes, and responds in text. It runs as a background service — systemd on Linux, launchd on macOS — configured automatically.

Web Interface

The API server: a dashboard when you want one

Hermes includes an opt-in . Enable it, point a compatible frontend like Open WebUI at http://localhost:8642/v1, and you have a browser-based chat dashboard. It is optional — the terminal and messaging platforms cover most interaction patterns — but useful when you prefer a browser or want a shared visual workspace for a team.

Enable it by adding to ~/.hermes/.env:

API_SERVER_ENABLED=true
API_SERVER_KEY=your-optional-key
API_SERVER_PORT=8642

Any OpenAI-compatible frontend — Open WebUI, LibreChat, or a custom dashboard — can then connect.

Operation Modes

Continuous vs on-demand: when the agent runs

Not every part of Hermes runs all the time. Understanding which services are continuous vs on-demand matters for planning your setup — and determines whether you need a VPS (Chapter 6) or can stay on a laptop.

On-demand

You start a session, give the agent a task, and interact until the work is done. Close the terminal and the agent stops. The agent is idle until you talk to it — like calling someone on the phone.

Covers: CLI/TUI chat sessions

Continuous

Services run in the background whether or not you are actively chatting — like a receptionist who stays at the desk all day. The gateway listens for incoming messages, the cron scheduler fires jobs at scheduled times, and the API server accepts web requests.

Covers: gateway (messaging), cron scheduler (scheduled jobs), API server (web dashboard)

Neither mode is better. On-demand is simpler and costs nothing in idle resources. Continuous is necessary for unattended work and multi-channel access. Most setups use both — on-demand for hands-on work, continuous for background services.

Data

Local vs remote: where your data lives

Everything Hermes stores — configuration, sessions, skills, memory, cron jobs, logs — lives in ~/.hermes on your machine. But some data flows to and from remote services:

Model providers

Every prompt and response passes through your chosen provider (OpenRouter, OpenAI, Anthropic, etc.). Your conversation content travels to their servers for the duration of the API call.

External memory providers

If you configure an external memory service like Honcho or Mem0, persistent facts live on that provider's infrastructure instead of (or in addition to) local MEMORY.md files.

Messaging platforms

When you chat through Telegram, Slack, or any messaging platform, those platforms handle and store messages according to their own policies.

The key distinction: your agent state — skills, memory, configuration, session history — is always local. Remote services process or relay data, but they do not own your agent's identity. Switch model providers and your skills and memory stay intact. Stop using a messaging platform and your session history remains in ~/.hermes.

All three interfaces — CLI, gateway, API server — talk to the same agent. There is no separate "Telegram agent" and "CLI agent." One agent, multiple front doors.

[Screenshot: A diagram showing the agent loop at the center, with CLI, gateway platforms, and API server as input channels, and ~/.hermes as the local data store. Model providers and external memory shown as remote services the agent calls out to.]

The continuous services — gateway and cron — are what eventually move people from a laptop to a VPS (Chapter 6). On a laptop, closing the lid stops the gateway. On a server, it runs around the clock. But that is a hosting decision, not an architecture one. The system works the same regardless of where it runs.

You want your agent to receive questions from your team on Slack and also run a weekly report every Monday at 9 AM. Which services need to run continuously, and which only need to run on demand?