Envoy Code Overview

Developer reference for the Envoy codebase at ~/py/envoy/. For architecture rationale see Design Specification; for IMAP detail see IMAP Client; for orchestrator internals see Orchestrator Implementation.

File Map

• orchestrator.py — main processing loop, LLM calls, action execution, continuation emails
• imap_client.py — thin imaplib wrapper with .netrc auth (context manager)
• envoy_schema.py — Pydantic models for structured LLM output (strict mode)
• new_envoy_response_schema.json — inlined JSON Schema for OpenAI API (no $refs)
• test_orchestrator.py — test harness: inject emails, resume continuations, run orchestrator
• webdav_deliver.py — alternative email delivery via WebDAV maildir
• inline_schema_refs.py — utility: inline all $ref in a JSON Schema
• diag_list_emails.py — diagnostic: list folders/emails with read status
• send_test_email.py — utility: inject test emails via sendmail
• _orchestrator.py — legacy single-pass version (kept for reference)

Phase-Aware State Machine

The orchestrator implements a finite state machine with 9 phases. The LLM sets status each iteration to signal what phase it is in.

Non-terminal phases (loop continues, immediate actions dispatched):
• triage — initial assessment; decide what to do
• gathering — fetch notes, emails, search results
• summarising — condense gathered material into notes
• working — general reasoning/analysis
• coding — write or review code (uses full model tier by default)
• composing — draft emails or documents
• waiting — send an intermediate email, then send continuation and stop

Terminal phases (loop exits, terminal actions also dispatched):
• complete — task done; execute move/delete/confirmation
• escalate — cannot proceed; log and stop

Each iteration, the orchestrator auto-loads the note envoy/states/{current_phase} to inject per-state instructions for the LLM.
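
The phase list above can be written down as a small enumeration. This is a minimal sketch, not the orchestrator's own code: the names Phase, TERMINAL_PHASES, state_note_key and loop_continues are hypothetical, and only the phase values and the envoy/states/{current_phase} key pattern come from this document.

```python
from enum import Enum


class Phase(str, Enum):
    """The nine status values the LLM may return (names are illustrative)."""
    TRIAGE = "triage"
    GATHERING = "gathering"
    SUMMARISING = "summarising"
    WORKING = "working"
    CODING = "coding"
    COMPOSING = "composing"
    WAITING = "waiting"
    COMPLETE = "complete"
    ESCALATE = "escalate"


# Terminal phases end the task; 'waiting' is non-terminal but still ends the run.
TERMINAL_PHASES = frozenset({Phase.COMPLETE, Phase.ESCALATE})


def state_note_key(phase: Phase) -> str:
    """Key of the per-state instruction note that is auto-loaded each iteration."""
    return f"envoy/states/{phase.value}"


def loop_continues(phase: Phase) -> bool:
    """True when the iteration loop should go round again in the same run."""
    return phase not in TERMINAL_PHASES and phase is not Phase.WAITING
```

Deriving the note key from the enum value keeps the state notes and the status field in step: a new phase needs one new enum member and one new note.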

LLM Response Contract

All fields are required (OpenAI strict mode). The LLM returns empty arrays/strings for unused fields. Defined in envoy_schema.py and serialised as new_envoy_response_schema.json.

Gathering fields (fetch additional context):
• add_notes — list of note keys to fetch
• add_emails — list of {message_id, folder} to fetch (folder is a hint)
• search_emails — list of searches, one per folder (returns headers only)
• list_folders — boolean; refreshes available IMAP folder list
• drop — keys or Message-IDs to remove from gathered context

Immediate action fields (executed on every non-terminal phase):
• write_notes — list of {key, value} — create/update notes
• delete_notes — list of note keys to delete
• send_emails — list of {to, subject, body, in_reply_to}

Terminal action fields (executed only on status='complete'):
• move_emails — list of {message_id, folder}
• delete_emails — list of Message-IDs

Control fields:
• status — phase (see Phase-Aware State Machine above)
• bundle_key — note key for this task's context bundle (auto-loaded each iteration)
• working_note — orientation context carried forward between iterations
• reasoning — explanation included in confirmation email and log
• next_model — intended tier 'nano'/'mini'/'full' for next iteration
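
Because every field is required, an "empty" response still carries all fifteen keys. The sketch below illustrates that shape with a plain dict rather than the Pydantic models in envoy_schema.py. The field names come from the lists above, while the default values and the helpers empty_response and missing_fields are assumptions made for illustration.

```python
from copy import deepcopy

# Field names are taken from the contract above; the defaults are assumptions.
EMPTY_RESPONSE = {
    # gathering
    "add_notes": [], "add_emails": [], "search_emails": [],
    "list_folders": False, "drop": [],
    # immediate actions
    "write_notes": [], "delete_notes": [], "send_emails": [],
    # terminal actions
    "move_emails": [], "delete_emails": [],
    # control
    "status": "triage", "bundle_key": "", "working_note": "",
    "reasoning": "", "next_model": "mini",
}


def empty_response(status: str) -> dict:
    """Return a response with every required field present but unused."""
    resp = deepcopy(EMPTY_RESPONSE)
    resp["status"] = status
    return resp


def missing_fields(resp: dict) -> list[str]:
    """List required fields absent from a candidate response."""
    return sorted(set(EMPTY_RESPONSE) - set(resp))
```

A check like missing_fields is mainly useful in tests or when replaying logged responses, since strict mode should already reject incomplete output at the API level.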

Processing Flow

Each invocation processes all UNSEEN emails in the target folder (default INBOX). For each email, an iteration loop runs up to max_continuous_iterations times per invocation (default 8); MAX_TOTAL_ITERATIONS caps the total at 24 across all continuations.

main()
  load start notes + CONTENTS
  open single IMAP connection
  list folders, ensure Sent exists
  expunge deleted messages

  for each UNSEEN email:
    if continuation: restore working_note, bundle_key, gathered_notes/emails, failed_fetches

    loop (up to max_continuous_iterations):
      auto-load envoy/states/{current_phase} note
      if bundle_key set: auto-load bundle notes
      call LLM → EnvoyResponse

      dispatch_immediate_actions():
        write_notes, delete_notes → apply immediately
        send_emails → send immediately, track sent_to_addresses

      if non-terminal phase (triage/gathering/summarising/working/coding/composing):
        fetch notes, emails, searches → gather_results
        update working_note, search_results
        if at iteration limit: send continuation email, break

      if 'waiting':
        send continuation email + stop (LLM already sent interim email above)
        break

      if 'complete':
        execute move_emails, delete_emails
        send confirmation reply if LLM didn't reply directly (checked via sent_to_addresses)
        break

      if 'escalate':
        log and break

    move email (and continuation if any) to Done

Action Dispatch Architecture

dispatch_immediate_actions(client, response, gather_results, sent_to_addresses) runs on every iteration for every non-terminal phase. It tracks which addresses emails were sent to via the sent_to_addresses set. This set is passed to execute_actions() as prior_sent_to so the final confirmation reply is skipped if the LLM already replied directly.
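
The duplicate-reply guard can be sketched as two small helpers. This is an illustration only: record_sent and needs_confirmation are hypothetical names, and lower-casing addresses before comparison is an assumption, not confirmed behaviour of dispatch_immediate_actions() or execute_actions().

```python
def record_sent(sent_to_addresses: set[str], send_emails: list[dict]) -> None:
    """Remember every recipient of an immediately dispatched email."""
    for email in send_emails:
        # Assumption: addresses are compared case-insensitively.
        sent_to_addresses.add(email["to"].strip().lower())


def needs_confirmation(original_sender: str, prior_sent_to: set[str]) -> bool:
    """A final confirmation is only needed if the sender was not already mailed."""
    return original_sender.strip().lower() not in prior_sent_to
```

For example, once record_sent has seen a reply to the original sender, needs_confirmation returns False for that sender and the terminal confirmation is skipped.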

Context Bundles

When the LLM sets bundle_key to a note key, the orchestrator reads that note and auto-loads any note keys listed inside it. This lets the LLM create a 'bundle document' that acts as an index of all relevant context for a task, then simply point to it — avoiding repeated add_notes requests across iterations.

The LLM typically creates a bundle note during an early phase and sets bundle_key to it. The orchestrator then loads it automatically each iteration for the rest of the task. Bundle key is checkpointed in continuation emails.
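
A minimal sketch of bundle loading follows. The on-disk format of a bundle note is not described here, so the sketch assumes one note key per line; load_bundle and the in-memory notes dict are hypothetical stand-ins for the real notes client.

```python
def load_bundle(notes: dict[str, str], bundle_key: str) -> dict[str, str]:
    """Return the bundle note plus every note it lists.

    Assumption: the bundle note lists one note key per line. Lines that do
    not match an existing note are ignored.
    """
    index = notes.get(bundle_key)
    if index is None:
        return {}
    loaded = {bundle_key: index}
    for line in index.splitlines():
        key = line.strip()
        if key and key in notes:
            loaded[key] = notes[key]
    return loaded
```

Calling this at the top of each iteration means edits the LLM makes to the bundle note take effect on the next iteration without any extra add_notes request.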

Gather Results Feedback

After each gathering step the orchestrator builds a gather_results list and passes it to the next LLM call as === RESULTS FROM PREVIOUS ITERATION ===. This is the mechanism that lets the LLM know what actually happened — e.g. whether a note was found, how many search results came back, or that a fetch failed.

fetch_note('envoy/start'): OK
fetch_note('missing-key'): NOT FOUND
search_emails(folder='Done', subject='count'): found 5 email(s)
fetch_email('<mid@host>'): OK (folder: Done)
list_folders(): john, DAV, Sent, Done, INBOX
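
Assembling that block is straightforward. The sketch below assumes gather_results is a list of already formatted strings like the ones above; note_result and format_gather_results are hypothetical helper names.

```python
RESULTS_HEADER = "=== RESULTS FROM PREVIOUS ITERATION ==="


def note_result(key: str, found: bool) -> str:
    """One feedback line for a note fetch, in the style shown above."""
    return f"fetch_note({key!r}): {'OK' if found else 'NOT FOUND'}"


def format_gather_results(gather_results: list[str]) -> str:
    """Render the feedback block for the next LLM call (empty if no results)."""
    if not gather_results:
        return ""
    return "\n".join([RESULTS_HEADER, *gather_results])
```

Returning an empty string when nothing was gathered keeps the prompt free of an empty section on the first iteration.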

Email Search (Two-Stage Pattern)

See Email Access Guide for the full reference. In brief:

• search_emails is a list — multiple folders can be searched per iteration
• Each search returns (date, subject, from, to, cc, message_id, in_reply_to, read, starred) — no body
• Results appear in the next LLM call as === EMAIL SEARCH RESULTS === with a clear 'headers only' label
• The LLM uses add_emails in a subsequent iteration to fetch bodies for specific messages
• search_flags filters by IMAP flags; space-separated for AND logic: e.g. "UNSEEN FLAGGED"
• Results are capped at 20 per search; reset each iteration (not accumulated)

Implemented in search_emails_imap() using BODY.PEEK[HEADER.FIELDS (...)] — does not set the \Seen flag.
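
The header-only stage can be sketched without a live server. The code below shows one plausible way to build the IMAP search criteria and fetch item and to turn a raw header block into a result row. It is not the body of search_emails_imap(): the helper names, the exact header list and the quoting of the subject are assumptions.

```python
from email import message_from_bytes

MAX_RESULTS = 20  # cap per search, as described above
HEADER_FIELDS = ("DATE", "SUBJECT", "FROM", "TO", "CC", "MESSAGE-ID", "IN-REPLY-TO")
# BODY.PEEK fetches the headers without setting the \Seen flag.
FETCH_ITEM = "(FLAGS BODY.PEEK[HEADER.FIELDS (%s)])" % " ".join(HEADER_FIELDS)


def build_criteria(search_flags: str = "", subject: str = "") -> str:
    """Combine space-separated flags (AND logic) with an optional subject match."""
    parts = search_flags.split()
    if subject:
        # Crude quoting for the sketch: drop embedded double quotes.
        parts += ["SUBJECT", '"%s"' % subject.replace('"', "")]
    return "(" + " ".join(parts or ["ALL"]) + ")"


def parse_headers(raw: bytes, flags: set[str]) -> dict:
    """Turn one fetched header block plus its IMAP flags into a result row."""
    msg = message_from_bytes(raw)
    return {
        "date": msg.get("Date", ""),
        "subject": msg.get("Subject", ""),
        "from": msg.get("From", ""),
        "message_id": msg.get("Message-ID", ""),
        "read": "\\Seen" in flags,
        "starred": "\\Flagged" in flags,
    }
```

With imaplib, strings like these would typically be passed to IMAP4.search(None, criteria) and IMAP4.fetch(num, FETCH_ITEM), with the result list truncated to MAX_RESULTS before it reaches the LLM.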

Continuation Email Mechanism

See Continuation Emails for full detail. Key points:

• Triggered when the per-run iteration limit is hit (or status='waiting')
• Sent to envoy@critchley.biz as a self-addressed email, plus stored in Sent IMAP folder
• Contains a human-readable body and a continuation.json attachment
• The JSON stores: original Message-ID, working_note, bundle_key, current_phase, note keys, email refs, failed_fetches
• On resumption, notes and emails are re-fetched fresh from their sources
• Continuation email itself is moved to Done after processing
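
A continuation email of this shape can be built and read back with the standard library. This is a sketch under assumptions: the subject line, the body text and the JSON key names in the usage note are hypothetical, and only the recipient address and the continuation.json attachment name come from this document.

```python
import json
from email.message import EmailMessage

ENVOY_ADDRESS = "envoy@critchley.biz"
ATTACHMENT_NAME = "continuation.json"


def build_continuation(state: dict) -> EmailMessage:
    """Wrap the checkpointed state in a self-addressed email."""
    msg = EmailMessage()
    msg["From"] = ENVOY_ADDRESS
    msg["To"] = ENVOY_ADDRESS
    msg["Subject"] = "Envoy continuation"  # hypothetical subject line
    msg.set_content("Task paused; state is attached as continuation.json.")
    msg.add_attachment(
        json.dumps(state, indent=2).encode("utf-8"),
        maintype="application",
        subtype="json",
        filename=ATTACHMENT_NAME,
    )
    return msg


def read_continuation(msg: EmailMessage):
    """Return the checkpointed state, or None if this is not a continuation."""
    for part in msg.iter_attachments():
        if part.get_filename() == ATTACHMENT_NAME:
            return json.loads(part.get_payload(decode=True))
    return None
```

Only keys and references are checkpointed, for example {"working_note": "...", "bundle_key": "...", "note_keys": [...]}, which is why resumption re-fetches the notes and emails themselves.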

Email Dispatch

Outbound email routing is controlled by the EMAIL_DISPATCH table in orchestrator.py:

EMAIL_DISPATCH = {
    'john.critchley@bluewin.ch': (deliver_via_mail, 'john@critchley.biz'),
    'jsr_critchley@hotmail.com': (_webdav_delivery.deliver, ...),
}
# Default: sendmail to the address as-is

Hotmail addresses use WebDAV delivery (direct SMTP is blocked). Bluewin addresses are rewritten to the local john@critchley.biz alias and delivered via sendmail. All sent emails are copied to the Sent IMAP folder via client.append().
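
The lookup behind such a table can be sketched as follows. The delivery functions here are stubs, the addresses are placeholders, and using None to mean "no rewrite" is an assumption; the real EMAIL_DISPATCH entries and their second tuple elements are as shown above.

```python
def stub_sendmail(to: str, subject: str, body: str) -> str:
    """Stand-in for sendmail delivery; returns a tag instead of sending."""
    return f"sendmail:{to}"


def stub_webdav(to: str, subject: str, body: str) -> str:
    """Stand-in for WebDAV maildir delivery."""
    return f"webdav:{to}"


# Placeholder table in the same shape: address -> (handler, rewritten address).
DISPATCH = {
    "remote@example.org": (stub_sendmail, "local@example.org"),
    "blocked@example.org": (stub_webdav, None),
}


def route(to: str):
    """Pick the delivery handler and final address; default is plain sendmail."""
    handler, rewritten = DISPATCH.get(to.strip().lower(), (stub_sendmail, None))
    return handler, rewritten or to


def deliver(to: str, subject: str, body: str) -> str:
    handler, address = route(to)
    return handler(address, subject, body)
```

Keeping routing in a table means a new special case is a one-line change, and unknown recipients fall through to the default path untouched.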

Failed Fetch Tracking

failed_fetches is a dict of id → attempt_count. After 2 failed attempts, the ID is listed as UNAVAILABLE in the LLM context and never requested again. This prevents infinite retry loops for genuinely missing documents. The dict is checkpointed in the continuation email and restored on resumption.
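
The retry cap amounts to a counter per ID. The helper names below are hypothetical; the threshold of 2 is taken from the paragraph above.

```python
MAX_FETCH_ATTEMPTS = 2  # after this many failures the ID is treated as UNAVAILABLE


def record_failure(failed_fetches: dict[str, int], item_id: str) -> None:
    """Increment the failure count for a note key or Message-ID."""
    failed_fetches[item_id] = failed_fetches.get(item_id, 0) + 1


def is_unavailable(failed_fetches: dict[str, int], item_id: str) -> bool:
    return failed_fetches.get(item_id, 0) >= MAX_FETCH_ATTEMPTS


def filter_requests(requested: list[str], failed_fetches: dict[str, int]) -> list[str]:
    """Drop IDs that have already exhausted their attempts."""
    return [i for i in requested if not is_unavailable(failed_fetches, i)]
```

Because the dict holds only strings and integers, it serialises directly into the continuation JSON and can be restored unchanged.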

Schema Generation

The JSON schema used by the OpenAI API must have all $ref values inlined (OpenAI strict mode does not support references). Workflow:

1. Edit envoy_schema.py (Pydantic source of truth)
2. Run generate_schema.py → envoy_response_schema.yaml
3. Run inline_schema_refs.py → new_envoy_response_schema.json
4. The orchestrator reads the inlined JSON at startup
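
The inlining step can be illustrated with a short recursive function. This is a sketch of the general technique, not the contents of inline_schema_refs.py. It assumes local references of the form "#/$defs/Name", drops any sibling keys next to a $ref, and would recurse forever on a self-referencing schema.

```python
import copy


def inline_refs(schema: dict) -> dict:
    """Return a copy of the schema with every local $ref replaced by its target."""
    defs = schema.get("$defs", {})

    def resolve(node):
        if isinstance(node, dict):
            ref = node.get("$ref")
            if isinstance(ref, str) and ref.startswith("#/$defs/"):
                # Replace the reference with a resolved copy of the definition.
                return resolve(copy.deepcopy(defs[ref.split("/")[-1]]))
            # Drop the $defs table itself; it is no longer needed once inlined.
            return {k: resolve(v) for k, v in node.items() if k != "$defs"}
        if isinstance(node, list):
            return [resolve(item) for item in node]
        return node

    return resolve(schema)
```

The output is larger than the input because shared definitions are duplicated at each point of use, which is the price of a reference-free schema.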

Key Constants

• IMAP_HOST = 'imap' — credentials via ~/.netrc
• MODEL_TIERS = {'nano': 'gpt-4.1-nano', 'mini': 'gpt-4.1-mini', 'full': 'gpt-5.2-chat-latest'}
• START_NOTE = 'envoy/start' — default start note
• MAX_TOTAL_ITERATIONS = 24 — hard limit across all continuations
• DEFAULT_FOLDER_ORDER — search order for email fetching by ID: INBOX, Done, Sent, then rest
• EMAIL_BODY_WARN_CHARS = 4000 — log WARNING before sending email body longer than this

Note-Write Pipeline

All LLM-requested note writes pass through _parse_note_value(key, value) before reaching notes_client.write_doc():

1. Try json.loads(value) — succeeds for well-formed JSONHTL strings.

2. On failure, apply regex repair: replace \\" (double-backslash + bare quote) with \" (single-backslash + quote). This fixes a common LLM encoding error in codeblock bodies. Logs a WARNING if repair was needed.

3. If still invalid, raise ValueError — the write fails loudly rather than silently downgrading to an old-style text note.

See note-encoding issue for background on the encoding bug.
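
The three steps above can be sketched as follows. This is an approximation of _parse_note_value, not its source: it uses str.replace in place of the regex, and the logger name and error message are assumptions.

```python
import json
import logging

log = logging.getLogger("envoy")  # hypothetical logger name


def parse_note_value(key: str, value: str):
    """Parse a note value as JSON, repairing one known LLM encoding error."""
    # Step 1: well-formed input parses directly.
    try:
        return json.loads(value)
    except json.JSONDecodeError:
        pass

    # Step 2: turn backslash-backslash-quote into backslash-quote and retry.
    repaired = value.replace('\\\\"', '\\"')
    try:
        parsed = json.loads(repaired)
    except json.JSONDecodeError as exc:
        # Step 3: fail loudly instead of storing a plain-text note.
        raise ValueError(f"note {key!r}: value is not valid JSON") from exc

    log.warning("note %r: repaired over-escaped quotes before writing", key)
    return parsed
```

The repair is deliberately narrow. It targets one observed failure pattern, so any other malformed value still raises and is visible in the log.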

Approach Tracking

attempted_searches: list[str] records a human-readable description of every search_emails_imap() call made in the current run. Initialised to [] on fresh emails; restored from continuation.json on resume. Passed to call_llm() and surfaces as a === SEARCHES TRIED THIS RUN === block, instructing the LLM not to repeat them.
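
A sketch of how such descriptions might be produced and rendered is shown below. describe_search and format_attempted are hypothetical helpers, and the wording of the closing instruction line is an assumption; only the block header comes from this document.

```python
SEARCHES_HEADER = "=== SEARCHES TRIED THIS RUN ==="


def describe_search(folder: str, subject: str = "", flags: str = "") -> str:
    """Human-readable description of one search, suitable for the LLM prompt."""
    parts = [f"folder={folder!r}"]
    if subject:
        parts.append(f"subject={subject!r}")
    if flags:
        parts.append(f"flags={flags!r}")
    return "search_emails(" + ", ".join(parts) + ")"


def format_attempted(attempted_searches: list[str]) -> str:
    """Render the block shown to the LLM; empty when nothing has been tried."""
    if not attempted_searches:
        return ""
    return "\n".join(
        [SEARCHES_HEADER, *attempted_searches, "Do not repeat these searches."]
    )
```

Storing plain strings rather than structured records keeps the list trivially serialisable into continuation.json.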

version: 3
updated: 2026-02-22