Envoy TODO

Essential (Do Next)

☐ Add email attachment parsing and processing
  - Parse attachments from multipart emails
  - Extract text from common formats (txt, pdf, docx)
  - Include attachment content in LLM context
  - Store attachments for reference
  - Add attachment info to schema

High Priority

☐ Auto-start gdata-server (notes server) if not running
  - notes client (~/bin/notes) currently fails with connection refused if server is down
  - On connection failure, client should attempt to start the server automatically
  - Start command: cd ~/py/gdata-server && GDBM_PATH=.agent_notes.gdbm uvicorn gdata_server:app --port=8021
  - Wait briefly and retry before giving up
  - May need a wrapper script or modification to notes_client.py
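A minimal sketch of the start-and-retry behaviour for the auto-start item above, not the actual notes_client.py code. The host 127.0.0.1, the retry count, the delay, and the function names (server_is_up, start_server, ensure_server) are assumptions; the working directory, environment variable, and port come from the start command above.

```python
import os
import socket
import subprocess
import time
from pathlib import Path

HOST, PORT = "127.0.0.1", 8021  # host is an assumption; port is from the start command
SERVER_DIR = Path.home() / "py" / "gdata-server"


def server_is_up(host=HOST, port=PORT, timeout=1.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # includes ConnectionRefusedError and timeouts
        return False


def start_server():
    """Launch uvicorn in the background, mirroring the documented start command."""
    env = dict(os.environ, GDBM_PATH=".agent_notes.gdbm")
    subprocess.Popen(
        ["uvicorn", "gdata_server:app", f"--port={PORT}"],
        cwd=SERVER_DIR,
        env=env,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,  # keep the server alive after the client exits
    )


def ensure_server(is_up=server_is_up, start=start_server,
                  retries=5, delay=0.5, sleep=time.sleep):
    """Start the server once if it is down, then poll until it answers or retries run out."""
    if is_up():
        return True
    start()
    for _ in range(retries):
        sleep(delay)
        if is_up():
            return True
    return False
```

The client would call ensure_server() when a request fails with connection refused, then repeat the request once if it returns True.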

☐ Remove hardcoded 20 result limit on email searches
  - Let LLM decide result limit via schema
  - Add ability to check message sizes before fetching
  - Prevents context overflow from large emails
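One possible shape for the size check, as a sketch only. IMAP can report a message's size without its body via the RFC822.SIZE fetch item, for example conn.fetch("1:50", "(RFC822.SIZE)") with imaplib. The helper names parse_sizes and select_within_budget and the budget parameters are hypothetical, not part of the current schema.

```python
import re

# Matches imaplib FETCH response items such as b"3 (RFC822.SIZE 2345)"
# or b"3 (UID 101 RFC822.SIZE 2345)".
SIZE_RE = re.compile(rb"^(\d+) \(.*?RFC822\.SIZE (\d+)", re.DOTALL)


def parse_sizes(fetch_data):
    """Map message sequence number -> size in bytes from a FETCH (RFC822.SIZE) response."""
    sizes = {}
    for item in fetch_data:
        if not isinstance(item, bytes):
            continue
        match = SIZE_RE.match(item)
        if match:
            sizes[match.group(1).decode()] = int(match.group(2))
    return sizes


def select_within_budget(sizes, max_results, max_total_bytes):
    """Pick messages in order, skipping any that would push the total over the budget."""
    chosen, total = [], 0
    for msg_num, size in sizes.items():
        if len(chosen) >= max_results:
            break
        if total + size > max_total_bytes:
            continue  # too large for the remaining budget; a later, smaller one may fit
        chosen.append(msg_num)
        total += size
    return chosen
```

Here max_results would be the limit the LLM supplies via the schema, and max_total_bytes an orchestrator-side guard against context overflow.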

☐ Progressive email summarization for large result sets
  - When many emails need processing, summarize in batches
  - Write intermediate summaries to notes
  - Final iteration reads summary notes instead of all emails
  - Optimizes context window usage for large email volumes
  - IMPORTANT: Summaries must include Message-IDs for reference back to originals
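A rough sketch of the batching step, assuming each email is a dict with a message_id key and that summarize and write_note are callables supplied by the orchestrator. All three names, the batch size, and the envoy/summaries/ note prefix are hypothetical. Each stored summary keeps its batch's Message-IDs, per the requirement above.

```python
def summarize_in_batches(emails, summarize, write_note, batch_size=10):
    """Summarize emails batch by batch and store each summary as a note.

    Returns the note keys, which the final iteration can load instead of
    the full email bodies.
    """
    keys = []
    for start in range(0, len(emails), batch_size):
        batch = emails[start:start + batch_size]
        key = f"envoy/summaries/batch-{start // batch_size + 1:03d}"
        write_note(key, {
            # Message-IDs travel with the summary so originals can be re-fetched.
            "message_ids": [email["message_id"] for email in batch],
            "summary": summarize(batch),
        })
        keys.append(key)
    return keys
```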

☐ Aggregate search results across multiple folders
  - Allow searching multiple folders in one request
  - Report per-folder counts (e.g. "Found 11 emails today: 4 in INBOX, 7 in Done")
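The per-folder report could be produced along these lines. This is a sketch: search stands in for a hypothetical per-folder search callable returning message IDs, and search_folders and format_counts are illustrative names.

```python
def search_folders(folders, search):
    """Run one search per folder and keep the results keyed by folder name."""
    return {folder: search(folder) for folder in folders}


def format_counts(results, label="today"):
    """Build a one-line report of the total and the non-empty per-folder counts."""
    counts = {folder: len(ids) for folder, ids in results.items()}
    total = sum(counts.values())
    if total == 0:
        return f"Found 0 emails {label}"
    noun = "email" if total == 1 else "emails"
    parts = ", ".join(f"{n} in {folder}" for folder, n in counts.items() if n)
    return f"Found {total} {noun} {label}: {parts}"
```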

Current Limitations (To Address)

⚠️ Hardcoded 20 result limit in search_emails_imap()
⚠️ Cannot check email sizes before fetching
⚠️ No strategy for processing 100+ emails efficiently
⚠️ Cannot search multiple folders in a single request
⚠️ Orchestrator does not return Message-ID of sent emails to the LLM; Envoy cannot track its own outgoing threads or correlate replies. Workaround: search Sent folder by subject/date/recipient.
⚠️ Current cron loop processes one email per run; no multi-message inbox loop yet

Completed

✅ Phase owns the model: removed next_model from LLM/schema; phase notes declare model alias; orchestrator resolves and uses it (2026-02-28)
✅ Anti-loop: idle_count + dynamic available-phases list; idle_count in continuation JSON, phase degradation/removal at thresholds, force_next override at IDLE_FORCE_THRESHOLD (2026-02-28)
✅ next_phases, force_next, one_shot added to all envoy/states/* phase notes (2026-02-28)
✅ attempted_searches tracking: persisted in continuation JSON, surfaced to LLM (2026-02-28)
✅ _parse_note_value(): repairs LLM double-encoded quotes before note write (2026-02-28)
✅ EMAIL_BODY_WARN_CHARS guard: warns on oversized email bodies (2026-02-28)
✅ Phase-Aware Processing State Machine: 9 phases, dispatch_immediate_actions, state notes, context bundles (2026-02-22)
✅ send_emails dispatches on all non-terminal phases (not just complete) (2026-02-22)
✅ Double-send prevention via sent_to_addresses tracking (2026-02-22)
✅ Duplicate notes dedup in context (already_loaded_keys) (2026-02-22)
✅ Multi-folder move-email search: INBOX then Active (2026-02-22)
✅ Sent IMAP copy for continuation emails; enables test resume workflow (2026-02-22)
✅ Test harness test_orchestrator.py: inject, resume, mark-unseen, list, run (2026-02-22)
✅ Support for multiple LLM models (nano/mini/full tiers): gpt-4.1-nano / gpt-4.1-mini / gpt-5.2-chat-latest (2026-02-20)
✅ Smart model selection: LLM hints via next_model field, orchestrator selects tier per iteration (2026-02-20)
✅ Continuation email support: JSON attachment, auto-resume, total iteration limit of 24 (2026-02-13)
✅ Multi-mailbox support: folder listing, per-ID folder hints, cross-folder fetch (2026-02-13)
✅ Add list_folders action to see available IMAP folders (2026-02-13)
✅ Add folder parameter to search_emails schema (2026-02-12)
✅ Allow LLM to search any folder (2026-02-12)
✅ Implement iterative processing loop (2026-02-12)
✅ Add CONTENTS note for discovery (2026-02-12)
✅ Add iteration tracking to audit reports (2026-02-12)
✅ Fix duplicate confirmation emails (2026-02-12)
✅ Add configurable context notes and mailbox selection (2026-02-12)

Future

Arbitrary Phase Names

Change status from enum to free string. Orchestrator accepts any phase name, looks up envoy/states/{name}, warns but does not crash if not found. Enables new phases (simple-summary, power-summary, fact-checking, translating) by creating a note; no code change required. complete and escalate remain the only hardcoded terminal states. See envoy/design-phases-v2 §7 (item 5).
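A minimal sketch of the lookup described above, assuming a read_note callable that returns the note text or None when the note does not exist. resolve_phase, read_note, the logger name, and the returned dict shape are all hypothetical.

```python
import logging

log = logging.getLogger("envoy")

# The only phase names the orchestrator treats specially.
TERMINAL_PHASES = {"complete", "escalate"}


def resolve_phase(name, read_note):
    """Resolve a free-string phase name to its state note, warning if it is missing."""
    if name in TERMINAL_PHASES:
        return {"name": name, "terminal": True, "instructions": None}
    instructions = read_note(f"envoy/states/{name}")
    if instructions is None:
        # Unknown phase: warn and carry on rather than crash.
        log.warning("no state note found for phase %r", name)
    return {"name": name, "terminal": False, "instructions": instructions}
```

With this shape, adding a phase such as fact-checking only requires creating the envoy/states/fact-checking note.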

External Specialist Agent Endpoints

Envoy can act as an orchestrator for specialist sub-agents implemented as email addresses. Examples: websearch@critchley.biz, summarise@critchley.biz, code@critchley.biz. Envoy sends a task email to the endpoint (starting a fresh Message-ID thread), the sub-agent processes it and replies with results, and Envoy incorporates the reply into its current working context. This is the email-native equivalent of a tool call: async, decoupled, and composable. Each sub-agent is itself an orchestrator instance running against a dedicated mailbox with specialist instructions.

Key requirements:
• Each sub-agent request starts a fresh Message-ID chain (not a reply to the originating user thread)
• Envoy must track which sub-agent replies belong to which parent task (via working_note)
• Sub-agent endpoints can themselves call further sub-agents, forming a pipeline
• Results are incorporated by Envoy fetching the reply email and continuing its gathering phase

Implementation: new mailboxes + dedicated envoy/start notes per endpoint. send_emails now dispatches on all non-terminal phases, so sub-agent emails can be sent during gathering.
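The request/reply correlation could be sketched as below, assuming working_note is a dict-like structure and that sub-agents reply with a standard In-Reply-To header. The pending_subagent_requests key, the function names, and the sender address envoy@critchley.biz used in the usage note are hypothetical.

```python
from email.message import EmailMessage
from email.utils import make_msgid


def build_subagent_request(endpoint, sender, subject, body, parent_task, working_note):
    """Create a task email on a fresh Message-ID thread and record its parent task."""
    msg_id = make_msgid(domain=sender.split("@")[-1])
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = endpoint
    msg["Subject"] = subject
    msg["Message-ID"] = msg_id  # no In-Reply-To or References: this starts a new chain
    msg.set_content(body)
    working_note.setdefault("pending_subagent_requests", {})[msg_id] = parent_task
    return msg


def match_reply(reply, working_note):
    """Return the parent task a sub-agent reply belongs to, or None if it is unknown."""
    pending = working_note.get("pending_subagent_requests", {})
    replied_to = str(reply.get("In-Reply-To") or "").strip()
    return pending.pop(replied_to, None)
```

For example, build_subagent_request("websearch@critchley.biz", "envoy@critchley.biz", ...) records the new Message-ID, and a later match_reply(reply, working_note) tells the gathering phase which parent task the results belong to.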

Cross-Thread Semantic Matching (Operational — Already Documented)

The matching and introduction behaviour is documented as current operational guidance in Cross-Thread Awareness. This is not a future item — Envoy should be doing it now using search_emails. Future work in this area: persistent 'open requests' notes to avoid searching from scratch on every email, and confidence scoring to avoid spurious introductions.

Design Principle: MUA Parity

Envoy should work the way a human works in their MUA (e.g. Thunderbird). Its action vocabulary should map directly to the actions available in a real mail client. This matters because:
  - It makes the mental model obvious: if a human can do it, Envoy should be able to do it
  - It enables natural secretary-style instructions: "star important messages", "review starred messages once a day and note that you have done so", "file newsletters into a digest folder"
  - Instructions do not need to reference technical IMAP concepts; they can be written as you would explain a task to a person

Current MUA action coverage:
✅ Read email body (fetch)
✅ Move to folder
✅ Delete
✅ Reply / send new email
✅ Search by sender, subject, date, flags
✅ Read star/flag status in search results
❌ Star / unstar a message (set or clear \Flagged)
❌ Mark as unread (clear \Seen on a message already read)
❌ Copy to folder (without removing from source)
❌ Labels / tags (IMAP custom flags, e.g. Thunderbird tags)
❌ Snooze / remind later (no native IMAP support; would need a notes-based reminder)
❌ Forward (possible via send_emails but no explicit schema action)

Most impactful gaps to close first: star/unstar and mark-unread, as these are the everyday triage actions.
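Both gaps map onto the IMAP STORE command. A sketch using imaplib's uid() method, assuming conn is an imaplib.IMAP4 connection with the folder already selected read-write and uid is the message's IMAP UID. The action names and apply_flag_action are hypothetical schema vocabulary, not existing Envoy actions.

```python
# MUA-style action name -> (STORE operation, flag list)
FLAG_ACTIONS = {
    "star": ("+FLAGS", r"(\Flagged)"),
    "unstar": ("-FLAGS", r"(\Flagged)"),
    "mark_unread": ("-FLAGS", r"(\Seen)"),
    "mark_read": ("+FLAGS", r"(\Seen)"),
}


def apply_flag_action(conn, uid, action):
    """Apply one triage action to a message by UID; return True if the server said OK."""
    operation, flags = FLAG_ACTIONS[action]
    status, _data = conn.uid("STORE", uid, operation, flags)
    return status == "OK"
```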

Multi-message inbox loop

Currently each cron run processes one email. The intended design is a loop that drains the inbox, with the LLM deciding which messages to consume together and which to leave for separate processing.

Pseudocode for one cron run:

  scan INBOX for UNSEEN messages
  while there are UNSEEN messages:
    pick oldest UNSEEN message (primary message)
    process it (LLM iterations)
    if processing needs other INBOX messages:
      fetch them (reading marks them \Seen automatically; this is correct default behaviour)
      if they need separate processing:
        mark them UNSEEN again
    move all SEEN messages out of INBOX
    go back to scanning

This mirrors the human workflow: opening an email marks it read; if you decide it needs separate attention you explicitly mark it unread again. No change to the fetch mechanism needed.

Key implementation changes:
  - Add mark_unseen: List[str] to the schema (Message-IDs the LLM wants left for separate processing)
  - Orchestrator applies STORE -FLAGS \Seen for each ID in mark_unseen after processing
  - Outer loop re-scans INBOX after each email is done and continues until no UNSEEN remain
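The outer loop could look roughly like this in Python. It is a sketch with the mailbox operations passed in as callables; list_unseen, process, mark_unseen, and move_seen_out are hypothetical names. The max_messages cap is an added assumption to stop a runaway loop if a message keeps being marked unseen; it is not part of the design above.

```python
def drain_inbox(list_unseen, process, mark_unseen, move_seen_out, max_messages=50):
    """Process UNSEEN messages oldest-first until the inbox has none left.

    process(message_id) runs the LLM iterations for the primary message and
    returns the IDs it read but wants left for separate processing.
    """
    handled = []
    while len(handled) < max_messages:
        unseen = list_unseen()  # expected oldest first
        if not unseen:
            break
        primary = unseen[0]
        for message_id in process(primary):
            mark_unseen(message_id)  # STORE -FLAGS \Seen
        move_seen_out()  # everything still \Seen leaves INBOX
        handled.append(primary)
    return handled
```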

Other future items

☐ Add web interface for viewing processing history
☐ Add cost tracking for LLM API calls

version: 12
updated: 2026-03-06