🤖 Hermes Agent: The AI Agent That Grows With You

Nous Research's open-source answer to the ephemeral agent problem — closed learning loop, Skill Documents, 6 execution backends, and a Telegram/Discord gateway. The most serious open-source attempt at persistent agent memory to date.

March 10, 2026

862 GitHub stars (3 weeks old) · 6 execution backends · 5 messaging platforms · MIT license (open source)

"The agent that grows with you."

862 stars. 135 forks. Three weeks old. Nous Research — the lab behind the Hermes-3, Nomos, and Psyche model families — released Hermes Agent on February 26, 2026. It is not a new model. It is not a coding assistant wrapper. It is a self-hosted autonomous agent designed around a single thesis: capability should accumulate at the agent layer, not just the model layer.

The Ephemeral Agent Problem

Every mainstream AI agent today has the same fundamental flaw: it wakes up as a stranger.

You spend an hour explaining your codebase, your preferences, your deployment environment. The session ends. The next day, you start over. The agent doesn't remember that you prefer TypeScript over JavaScript, that your staging server lives at a specific address, or that the last three attempts to deploy via GitHub Actions required a specific environment variable. The context window is not memory — it's a whiteboard erased at the end of every session.

The industry has explored partial solutions. Long context windows help but add cost and latency proportional to history length. RAG over conversation logs retrieves what was said about a topic but not how to do it. Vector stores of prior conversations are useful but brittle without semantic structure. None of these supplies what is actually missing: synthesized, executable procedural knowledge that improves over time.

The gap: What people want from an AI agent isn't just task completion — it's a collaborator that accumulates shared context. Who knows your stack. Who remembers what failed. Who builds a working model of you as a developer over months, not minutes. That's what Hermes is designed to become.

What Hermes Actually Is

Hermes Agent is a self-hosted autonomous agent framework with four distinguishing architectural commitments:

🧠 Persistent Procedural Memory

Synthesized Skill Documents that capture how to do things, not just what was said.

🖥️ Persistent Machine Access

6 execution backends — runs on infrastructure, not just your laptop.

💬 Multi-Platform Gateway

Telegram, Discord, Slack, WhatsApp, Signal — one process, all platforms.

🎯 Hermes-3 + Atropos RL

Model trained specifically for tool-calling accuracy and long-range planning.

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash
hermes setup    # configure LLM provider
hermes          # start chatting
hermes gateway  # start messaging gateway (Telegram, Discord, etc.)

The Closed Learning Loop: From Tasks to Skills

The most technically interesting piece of Hermes is its skill system. When the agent completes a complex task — deploying a service, debugging a pipeline, running a data analysis workflow — it synthesizes the experience into a Skill Document: a structured markdown file capturing the procedure, the gotchas, the environment dependencies, and the decision logic.

These are not raw conversation logs. They are synthesized procedural artifacts — closer to a well-written runbook than a transcript. They follow the agentskills.io open standard, making them portable: any compatible agent can load and execute them.

The key distinction from RAG is procedural vs. declarative memory. A RAG system retrieves what was said about a topic. A Skill Document captures how to do the thing, structured for execution rather than reference. The next time a similar task appears, the agent queries its own skill library first using FTS5 full-text search. The skill improves with each use: edge cases get added, failure modes get documented, environment-specific notes accumulate.
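The "query the skill library first, then refine the skill after use" cycle described above can be sketched with SQLite's FTS5 extension from Python's standard library. This is an illustration only, not Hermes's actual implementation: the table layout and the helper names (open_skill_library, save_skill, find_skills, add_gotcha) are assumptions made for the example, and it requires a Python build whose SQLite includes FTS5.

```python
import sqlite3
from typing import List


def open_skill_library(path: str = ":memory:") -> sqlite3.Connection:
    """Open a skill index backed by an FTS5 virtual table (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE VIRTUAL TABLE IF NOT EXISTS skills USING fts5(name, body)")
    return conn


def save_skill(conn: sqlite3.Connection, name: str, body: str) -> None:
    """Insert a skill, replacing any earlier version stored under the same name."""
    conn.execute("DELETE FROM skills WHERE name = ?", (name,))
    conn.execute("INSERT INTO skills (name, body) VALUES (?, ?)", (name, body))
    conn.commit()


def find_skills(conn: sqlite3.Connection, task: str, limit: int = 3) -> List[str]:
    """Return skill names matching any word of the task, best match first."""
    # Quote each word so punctuation in the task cannot break FTS5 query syntax.
    words = ['"' + w.replace('"', '""') + '"' for w in task.split()]
    if not words:
        return []
    rows = conn.execute(
        "SELECT name FROM skills WHERE skills MATCH ? ORDER BY rank LIMIT ?",
        (" OR ".join(words), limit),
    ).fetchall()
    return [row[0] for row in rows]


def add_gotcha(conn: sqlite3.Connection, name: str, note: str) -> None:
    """Append a lesson learned to an existing skill so later searches can find it."""
    row = conn.execute("SELECT body FROM skills WHERE name = ?", (name,)).fetchone()
    if row is None:
        raise KeyError(name)
    save_skill(conn, name, row[0] + "\n- Gotcha: " + note)
```

In use, an agent would call find_skills with the task description before planning from scratch, and call add_gotcha after a run that surfaced a new edge case. Because the appended note is indexed like the rest of the skill, the newly recorded failure mode becomes searchable too.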

Cross-Session Recall and User Modeling

FTS5 session search with LLM summarization indexes all past sessions so the agent can pull up relevant prior context ("last time I tried this deployment...") without you needing to remember or re-paste it. FTS5 handles the keyword matching; the summarization step condenses the matching sessions into usable context.

Honcho user modeling (from Plastic Labs) builds a structured model of the user across sessions — preferences, working patterns, technical context, communication style. Updated through interaction, not a static profile. Over months of use, the agent develops a representation of you specifically.

The third layer is agent-curated memory nudges: the agent itself decides when something is worth persisting to long-term memory. Rather than saving everything (noisy) or nothing (the default for most agents today), it exercises judgment about what constitutes durable knowledge.
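The first layer above, full-text search over past sessions followed by summarization, can be sketched as a two-step pipeline. This is a sketch under stated assumptions, not Hermes's code: the sessions table, the build_session_index and recall helpers, and the sample session IDs are all made up for illustration, and the summarize argument stands in for whatever LLM call a real deployment would make.

```python
import sqlite3
from typing import Callable, Dict, List


def build_session_index(sessions: Dict[str, str]) -> sqlite3.Connection:
    """Index session transcripts in an in-memory FTS5 table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    # session_id is stored but not searched; only the transcript is indexed.
    conn.execute(
        "CREATE VIRTUAL TABLE sessions USING fts5(session_id UNINDEXED, transcript)"
    )
    conn.executemany("INSERT INTO sessions VALUES (?, ?)", list(sessions.items()))
    return conn


def recall(
    conn: sqlite3.Connection,
    term: str,
    summarize: Callable[[List[str]], str],
    limit: int = 5,
) -> str:
    """Find sessions mentioning `term`, then hand short excerpts to a summarizer."""
    phrase = '"' + term.replace('"', '""') + '"'  # treat the term as a literal phrase
    rows = conn.execute(
        "SELECT session_id, snippet(sessions, 1, '[', ']', '...', 12) "
        "FROM sessions WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (phrase, limit),
    ).fetchall()
    excerpts = [f"{session_id}: {text}" for session_id, text in rows]
    return summarize(excerpts) if excerpts else ""
```

Passing the summarizer in as a plain callable keeps the search step testable without a model: a stub such as `lambda excerpts: " | ".join(excerpts)` is enough to check that the right sessions are being retrieved.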

Persistent Machine Access: Living on Infrastructure

A major friction point with AI agents is the "execution gap" — they write code but can't interact with the real world without manual intervention. Hermes closes this with persistent dedicated machine access via six backends:

Backend	Use Case	Notes
Local	Direct host machine access	Default for personal setups
Docker	Isolated, reproducible containers	Safe code execution without polluting host
SSH	Remote servers and cloud instances	Long-running tasks, background processes
Singularity	HPC container environments	Research clusters, academic compute
Daytona	Serverless persistence	Hibernates when idle, near-zero cost between sessions
Modal	Serverless scaling	Heavy workloads that need elastic compute

The practical implication: start a long-running exploratory data analysis (EDA) job on a remote server via SSH, disconnect, and return later. The agent maintains the terminal state, handles background processes, and tracks file system changes independently. It isn't simulating a conversation; it's managing a workspace.

Daytona and Modal deserve special mention. Both are serverless: the agent's environment hibernates when idle and wakes on demand, costing nearly nothing between sessions. That puts a persistent agent in the same price range as the proverbial "run it on a $5 VPS" setup, without paying for idle time.
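One way to picture interchangeable backends is a small interface that turns "run this command" into the right process invocation for each target. The sketch below covers only three of the six backends and is not taken from the Hermes codebase: the Backend protocol, the class names, and the run helper are hypothetical, while the docker and ssh command lines use the standard flags of those tools.

```python
import shlex
import subprocess
from dataclasses import dataclass
from typing import List, Protocol


class Backend(Protocol):
    """Anything that can translate a shell command into an argv list."""

    def argv(self, command: str) -> List[str]: ...


@dataclass
class LocalBackend:
    def argv(self, command: str) -> List[str]:
        return ["sh", "-c", command]


@dataclass
class DockerBackend:
    image: str

    def argv(self, command: str) -> List[str]:
        # --rm discards the container afterwards, keeping the host clean.
        return ["docker", "run", "--rm", self.image, "sh", "-c", command]


@dataclass
class SSHBackend:
    host: str

    def argv(self, command: str) -> List[str]:
        # ssh hands one string to the remote shell, so quote the command for it.
        return ["ssh", self.host, shlex.join(["sh", "-c", command])]


def run(backend: Backend, command: str) -> subprocess.CompletedProcess:
    """Execute the command on the chosen backend and capture its output."""
    return subprocess.run(backend.argv(command), capture_output=True, text=True)
```

With this shape, the calling code does not change when the target does: `run(LocalBackend(), "ls")` and `run(SSHBackend("user@example.com"), "ls")` differ only in the backend object. A real system with persistent terminal state would need long-lived sessions rather than one process per command.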

The Messaging Gateway

Most technical agents are confined to a CLI or web dashboard. Hermes prioritizes accessibility through a unified messaging gateway: Telegram, Discord, Slack, WhatsApp, Signal, and CLI — all from a single gateway process.

This enables a continuous feedback loop. Start a task at your workstation. Receive a completion notification via Telegram. Send follow-up instructions from your phone. The agent also accepts voice memos and carries out the instructions within its persistent environment. Cross-platform conversation continuity means you can pick up on Discord where you left off on Telegram.

For engineers who want an agent that's always reachable without maintaining an always-open terminal, this is the right design. Talk to it from anywhere; it works on your infrastructure.
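Cross-platform continuity comes down to mapping each platform-specific account onto one canonical user and keeping a single conversation thread per user. The following is a minimal sketch of that idea, not the Hermes gateway itself: the Gateway class, its link and receive methods, and the user IDs are hypothetical, and real platform adapters (webhooks, bot tokens, and so on) are left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Gateway:
    # (platform, platform-specific account id) -> canonical user id
    identities: Dict[Tuple[str, str], str] = field(default_factory=dict)
    # canonical user id -> ordered list of (platform, message text)
    threads: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)

    def link(self, platform: str, account_id: str, user: str) -> None:
        """Register a platform account as belonging to a canonical user."""
        self.identities[(platform, account_id)] = user

    def receive(self, platform: str, account_id: str, text: str) -> List[Tuple[str, str]]:
        """Append an incoming message to the user's thread and return the thread."""
        user = self.identities[(platform, account_id)]  # KeyError if never linked
        thread = self.threads.setdefault(user, [])
        thread.append((platform, text))
        return thread
```

Because the thread is keyed by the canonical user rather than by platform, a message arriving on Discord sees the history that started on Telegram, which is the behavior the section describes.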

Hermes-3 and Atropos RL

Under the hood is a ReAct loop: Observe → Reason → Act. The agent reads terminal output or file contents, analyzes the current state against the goal, then executes a command or calls a tool.

This is powered by Hermes-3 (Llama 3.1-based), trained with Atropos RL, a reinforcement learning framework developed by Nous Research that specifically targets tool-calling accuracy and long-range planning. The training objective is not general chat quality; it's reliable execution across multi-step workflows without getting "lost." This specificity matters: a model optimized for tool use can outperform a general chat model on agentic tasks even if the general model scores higher on general-purpose benchmarks.

That said, Hermes is fully model-agnostic. It works with OpenRouter (200+ models), OpenAI, Nous Portal, z.ai/GLM, Kimi/Moonshot, MiniMax, or your own endpoint. Switch with the hermes model command; no code changes are needed.
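The Observe → Reason → Act loop described at the start of this section can be written down in a few lines. This is a generic sketch of the ReAct pattern, not Hermes's loop: react_loop, the Policy type, and the scripted policy are hypothetical, and in a real agent the policy would be a call to the language model rather than a hand-written function.

```python
from typing import Callable, Dict, List, Optional, Tuple

Action = Tuple[str, str]  # (tool name, tool input)
Policy = Callable[[str, List[str]], Optional[Action]]
Tool = Callable[[str], str]


def react_loop(
    goal: str, policy: Policy, tools: Dict[str, Tool], max_steps: int = 8
) -> List[str]:
    """Alternate reasoning and acting until the policy stops or steps run out."""
    observations: List[str] = []
    for _ in range(max_steps):
        action = policy(goal, observations)  # Reason: choose the next action
        if action is None:  # the policy considers the goal reached
            break
        name, tool_input = action
        observations.append(tools[name](tool_input))  # Act, then Observe the result
    return observations


def scripted_policy(goal: str, observations: List[str]) -> Optional[Action]:
    """Stand-in for a model: list files, read the first one, then stop."""
    if not observations:
        return ("list_files", ".")
    if len(observations) == 1:
        return ("read_file", observations[0].split()[0])
    return None
```

The max_steps cap is the simplest guard against the "getting lost" failure mode mentioned above: a policy that never returns None still terminates.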

agentskills.io: The Open Standard Play

Hermes skills follow the agentskills.io open standard — portable SKILL.md files that any compatible agent can load and execute. The Skills Hub at agentskills.io provides a community repository of contributed skills.

This is a meaningful architectural bet. If agentskills.io gains traction as the common format, skills become ecosystem assets rather than proprietary lock-in. An agent trained on your specific workflows via Hermes could share those skills with other compatible agents in your stack. The open standard also means Nous Research can benefit from community-contributed skills without needing to build everything in-house.

Note for OpenClaw users: OpenClaw also uses the agentskills.io standard. Skills built for Hermes are portable to OpenClaw and vice versa — same SKILL.md format. The two systems can share skills directly.
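Portability in practice means any compatible agent can read the same file and recover the same metadata and instructions. The sketch below assumes a SKILL.md that begins with a front-matter header delimited by `---` lines and containing simple `key: value` pairs such as name and description. That layout is an assumption for illustration, so check the agentskills.io specification for the authoritative format. The parse_skill helper is hypothetical and handles only flat keys, not full YAML.

```python
from typing import Dict, Tuple


def parse_skill(text: str) -> Tuple[Dict[str, str], str]:
    """Split a skill file into (metadata, body); metadata is empty if no header."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "---":
        return {}, text
    try:
        end = lines.index("---", 1)  # closing delimiter of the header
    except ValueError:
        return {}, text  # unterminated header: treat the whole file as body
    meta: Dict[str, str] = {}
    for line in lines[1:end]:
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not key: value pairs
            meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return meta, body
```

An agent loading skills from a shared directory could index the metadata for search and load the body only when a skill is actually selected.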

Who Should Run Hermes

AI/ML engineers who want a persistent dev environment that accumulates knowledge about their specific stack
Self-hosters running workloads on VPS/cloud who want Telegram/Discord-native control without a web dashboard
Researchers interested in contributing training trajectories back to the Hermes-3 model family via Atropos RL
Teams who need an agent that maintains state across multi-day projects and long-running background tasks
Anyone frustrated by starting over every session

Limitations and Open Questions

Skill quality depends on model judgment. The agent decides what to synthesize into Skill Documents and when to persist knowledge. If the model's judgment is poor, you get low-quality skills or missed persistence opportunities. This is hard to evaluate from outside — it becomes apparent over weeks of use, not minutes.

Model ceiling: Hermes-3 is Llama 3.1-based. At the frontier, Claude 3.5/4, GPT-4o, and Gemini 2.0 likely outperform it on complex reasoning. Using Hermes with OpenRouter on a stronger model may be the right production choice for demanding tasks.
Self-improvement without oversight: Skills self-improve during use, but bad patterns can reinforce themselves. A skill that contains an incorrect shortcut will keep getting used and refined in the wrong direction unless someone corrects it manually.
Multi-user support: The current architecture appears single-user focused. Teams wanting shared agent memory and skill libraries will need to build their own coordination layer.
Serverless cold starts: Daytona/Modal serverless backends offer near-zero idle cost but introduce latency on wake. For latency-sensitive workflows, always-on VPS may be preferable.
Three weeks old: Production stability is unknown. The skill system in particular — agent-curated nudges, self-improvement — will take real usage data to validate at scale.

Conclusion

Hermes Agent is the most credible open-source attempt at solving the ephemeral agent problem to date. The architecture is coherent: Skill Documents address the procedural memory gap; persistent machine backends address the execution gap; the messaging gateway addresses the interface gap; Honcho addresses the user modeling gap. These are the right problems.

The bet Nous Research is making is that the next frontier in AI capability isn't just better models — it's better accumulation. An agent that compounds experience across sessions will outperform a smarter agent that starts fresh every time, at least for the kinds of long-horizon, environment-specific work that defines real engineering workflows.

That thesis is testable. Install it. Give it complex, recurring tasks over several weeks. Let it build skills. See whether the skill library actually improves your productivity or becomes noisy and stale. The answer will tell you a lot about whether this architectural direction is right.

Key Takeaways

1. Skill Documents are the core innovation — synthesized procedural memory that improves with use. Not chat logs. Not embeddings. Executable runbooks created from experience.

2. Runs on infrastructure, not your laptop — 6 backends including serverless Daytona/Modal. Start tasks, disconnect, get notified. The agent maintains state.

3. agentskills.io skills are portable — if you use OpenClaw, your skills are already in the right format. The ecosystems share a standard.

4. Three weeks old, use with that in mind — the architecture is sound; production stability and skill quality at scale are still being established.
