A proposal for generalising Envoy's context architecture into a lightweight protocol applicable to all LLM interactions in this environment — Claude.ai, Claude Code, Codex, and future agents.
LLM sessions in this environment have unreliable context. There are three failure modes:
• Context absent — the agent starts cold, reads nothing, and makes assumptions or asks questions that notes already answer
• Context too broad — MEMORY is loaded but it is a summary of everything; the agent drowns in irrelevant material and instruction-following degrades (ETH Zurich 2026: bloated context files reduce task success rates)
• Context stale — notes exist but the agent does not know to fetch them, or fetches an old version
The current approach relies on discipline: the user preferences instruction says 'read MEMORY', and convention says 'read the relevant project note'. This works when followed but is not enforced — and crucially it still loads broad context rather than task-specific context.
Envoy solves this better because context loading is enforced by the architecture, not optional:
• The FSM phase determines which notes auto-load — the orchestrator enforces it, not the agent's discipline
• The RAG principle is built in: each phase signals the cognitive task, so the context matches what the agent is actually doing right now
• Context bundles carry thread-scoped context forward across stateless runs — state is serialised into the task, not held in memory
• Per-state instruction notes are fetched at transition time, not upfront — the right instructions arrive at the right moment
The result: Envoy rarely suffers from the three failure modes above, not because agents are more disciplined, but because the architecture makes the right behaviour the default.
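To make the mechanism concrete, here is a minimal sketch of phase-driven loading with a serialised bundle. It is not Envoy's actual code: the `ContextBundle` fields, the `PHASE_INSTRUCTIONS` table, the phase names, and the note keys are all hypothetical placeholders chosen for illustration.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Tuple

# Hypothetical table: phase name -> instruction notes fetched at transition time.
# Envoy's real phases and note keys may differ.
PHASE_INSTRUCTIONS: Dict[str, List[str]] = {
    "drafting": ["instructions/drafting"],
    "summarising": ["instructions/summarising"],
}


@dataclass
class ContextBundle:
    """Thread-scoped context that travels with the task between stateless runs."""
    thread_id: str
    phase: str
    note_keys: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # State is serialised into the task payload, not held in process memory.
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, raw: str) -> "ContextBundle":
        return cls(**json.loads(raw))


def transition(bundle: ContextBundle, next_phase: str) -> Tuple[ContextBundle, List[str]]:
    """Return the bundle for the next run and the note keys the orchestrator loads.

    Raises KeyError for an unknown phase, so an agent cannot skip the lookup.
    """
    instructions = PHASE_INSTRUCTIONS[next_phase]
    new_bundle = ContextBundle(bundle.thread_id, next_phase, list(bundle.note_keys))
    # Thread notes first, then the per-phase instructions, without duplicates.
    to_load = list(dict.fromkeys(bundle.note_keys + instructions))
    return new_bundle, to_load
```

The point of the sketch is that the agent never decides what to read: the orchestrator derives the load list from the phase and the bundle, and the bundle survives a process restart because it round-trips through JSON.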
Apply Envoy's phase model to interactive LLM sessions. A session has a cognitive phase — and the context loaded should match that phase. The phases for a coding/development session:
orient   Starting a session. Load MEMORY, CONTENTS, and the relevant project note.
         Goal: establish what exists, what the current state is, what conventions apply.
         Notes to load: MEMORY, project/index (or nearest parent in CONTENTS)
gather   Understanding a specific task. Load the notes directly relevant to this task.
         Goal: have the exact context needed, no more.
         Notes to load: task-specific (fetched explicitly by key)
plan     Before writing code or making changes. State assumptions, define success criteria, identify what will be touched.
         Notes to load: whatever was gathered. No new fetches unless something is missing.
execute  Writing code or making changes. Surgical. One thing at a time.
         Notes to load: the specific file/note being changed + its immediate dependencies.
verify   Check the output against the success criteria defined in plan.
         Notes to load: same as execute. No new context.
commit   Update notes, increment versions, link from parents, update MEMORY if needed.
         Notes to load: whatever was changed.
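The phase list can be expressed as a small lookup, sketched below. This is one possible encoding, not a specification: the function names and parameters are hypothetical, and the strictly linear ordering is an assumption (the proposal does not say whether a session may loop back, for example from verify to execute).

```python
from enum import Enum
from typing import List, Optional, Sequence


class Phase(Enum):
    ORIENT = "orient"
    GATHER = "gather"
    PLAN = "plan"
    EXECUTE = "execute"
    VERIFY = "verify"
    COMMIT = "commit"


def next_phase(current: Phase) -> Optional[Phase]:
    """Next phase in definition order, or None after commit (assumed linear)."""
    order = list(Phase)
    i = order.index(current)
    return order[i + 1] if i + 1 < len(order) else None


def notes_to_load(
    phase: Phase,
    project_note: str,
    task_keys: Sequence[str],
    touched_keys: Sequence[str],
) -> List[str]:
    """Map a phase to the notes it should have in context.

    project_note: key of the project note (hypothetical example: "projects/demo").
    task_keys:    notes fetched explicitly for the task during gather.
    touched_keys: notes/files being changed, plus their immediate dependencies.
    """
    if phase is Phase.ORIENT:
        return ["MEMORY", project_note]
    if phase in (Phase.GATHER, Phase.PLAN):
        # plan reuses what gather fetched; no new fetches by default
        return list(task_keys)
    # execute, verify, and commit all work from what is being changed
    return list(touched_keys)
```

For example, `notes_to_load(Phase.PLAN, "projects/demo", ["specs/login"], [])` returns only the gathered task notes, which is the narrowing the phase model is meant to provide.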
For Claude.ai / Claude Code: sessions should open with an explicit orient step (fetch MEMORY and the relevant project note) before any task work. This is already the convention; the change is making it a required step, not an optional recommendation.
For Claude Code specifically: a CLAUDE.md at project root can enforce this by instructing the agent to fetch the project note on session start. The project note is the bundle equivalent — it lists the notes relevant to that project.
For Envoy: already implemented. The proposal is to name and document what it does so the same vocabulary can be used across all agents.
For future agents (Codex, etc.): the protocol is tool-agnostic. Any agent with notes access can follow it. Agents without notes access (Codex today) are operating blind — this clarifies what they are missing and motivates solving the auth problem.
Envoy uses bundles to associate a set of notes with an ongoing thread. The same pattern applies to interactive sessions: a project has a bundle — a short note listing the notes that are always relevant when working on it.
Bundle note pattern: bundles/{project-slug} — already used by Envoy, could be adopted universally.
A CLAUDE.md at project root contains one instruction: 'fetch bundles/{project-slug} and load all listed notes before starting work'. This gives any agent the right context in one fetch.
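A rough sketch of what "one fetch" could look like from the agent's side follows. The bundle note format is an assumption (one note key per line, optionally bulleted, with `#` lines ignored), and `fetch` stands in for whatever notes-access call the agent actually has; neither is defined by this proposal.

```python
from typing import Callable, Dict, List


def parse_bundle(body: str) -> List[str]:
    """Extract note keys from a bundle note.

    Assumed format: one key per line, optional leading '-', '*' or '•',
    blank lines and lines starting with '#' ignored.
    """
    keys: List[str] = []
    for line in body.splitlines():
        key = line.strip().lstrip("-•*").strip()
        if key and not key.startswith("#"):
            keys.append(key)
    return keys


def load_bundle(slug: str, fetch: Callable[[str], str]) -> Dict[str, str]:
    """Fetch bundles/{slug}, then every note it lists.

    `fetch` is a hypothetical stand-in for the agent's notes tool: it takes
    a note key and returns the note body.
    """
    listed = parse_bundle(fetch(f"bundles/{slug}"))
    return {key: fetch(key) for key in listed}
```

With an in-memory store, `load_bundle("demo", store.__getitem__)` returns a dict of note bodies keyed by note key. A missing note raises immediately, which is arguably the right behaviour: a bundle that points at a note that does not exist is a stale bundle.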
The full protocol does not need to be built at once. The minimum viable version:
1. Document the phases here so agents can refer to them ('I am in gather phase') and humans can see what an agent is doing
2. Add to CLAUDE.md in each project: 'start every session by fetching bundles/{slug}'
3. Write a bundles/ note for each active project listing its relevant notes
4. Add to PROGRAMMING_RULES/ai-agent-guidelines: state assumptions and success criteria before writing any code
Steps 1 and 4 are free. Steps 2 and 3 require creating a CLAUDE.md and a bundle note per project, perhaps an hour of work across the active projects.
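A small audit can show how much of steps 2 and 3 remains. The sketch below assumes each project has a local root directory and that the set of existing note keys can be listed somehow; `missing_setup` and its parameters are hypothetical names, not part of any existing tool.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def missing_setup(
    project_roots: Dict[str, Path],
    existing_note_keys: Iterable[str],
) -> Dict[str, List[str]]:
    """Report, per project slug, which protocol pieces are still missing.

    project_roots:      slug -> project root directory (assumed layout).
    existing_note_keys: keys of notes that already exist, e.g. "bundles/demo".
    """
    existing = set(existing_note_keys)
    gaps: Dict[str, List[str]] = {}
    for slug, root in project_roots.items():
        missing: List[str] = []
        if not (root / "CLAUDE.md").is_file():
            missing.append("CLAUDE.md")
        if f"bundles/{slug}" not in existing:
            missing.append(f"bundles/{slug}")
        if missing:
            gaps[slug] = missing
    return gaps
```

Projects that are fully set up do not appear in the result, so an empty dict means steps 2 and 3 are done.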
This protocol does not address:
• Agents without notes access (Codex): they need the auth problem solved first
• Very long sessions where context fills up — summarisation (the Envoy summarising state) is the answer but not proposed here
• Notes that are stale or wrong — this protocol loads notes reliably; it cannot fix bad notes
Related notes:
• Envoy Phase Design: the source architecture this generalises
• AI Agent Guidelines — behavioural rules; this proposal provides the context architecture those rules operate within
• MEMORY — the orient step always reads this first