Thread Chain & Email Pool Design

Implementation: envoy/thread-chain-impl

Design for automatic thread chain building, in-memory email cache, unified Quick-ID pool, and 〶 section headers. Not yet implemented — approved design, ready to code.

Motivation

Each reply carries an In-Reply-To header (its direct parent) and, optionally, a References header (the full ancestor chain). Envoy currently shows In-Reply-To in context but does not walk the chain automatically, so the LLM must request ancestors explicitly via add_emails, and each request costs an iteration. This design builds the chain up front instead.

EmailCache (in-memory, per-session)

A dict keyed by Message-ID. Each entry stores:

• from, to, cc, bcc, reply_to — address fields

• date — parsed datetime (for sorting)

• subject

• message_id — the full Message-ID as defined in RFC 5322 (which obsoletes RFC 2822)

• in_reply_to, references — threading headers

• folder — current IMAP folder

• imap_id — IMAP numeric UID (for fast ops)

• body — None if headers-only, string if full body fetched

Cache is rebuilt from IMAP reads each session. On failed full-body fetch: remove entry. On move/delete: update folder or remove. No persistence — IMAP is authoritative. Staleness resolves itself on next startup.
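
A minimal sketch of what such a cache could look like, assuming a dataclass per entry. The names CachedEmail, from_, put, on_body_fetch_failed, on_moved and on_deleted are illustrative only and do not come from the existing code.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import Dict, List, Optional


@dataclass
class CachedEmail:
    """One cache entry; fields mirror the list above (names are assumptions)."""
    message_id: str
    subject: str = ""
    date: Optional[datetime] = None
    from_: str = ""  # "from" is a Python keyword, hence the underscore
    to: List[str] = field(default_factory=list)
    cc: List[str] = field(default_factory=list)
    bcc: List[str] = field(default_factory=list)
    reply_to: str = ""
    in_reply_to: Optional[str] = None
    references: List[str] = field(default_factory=list)
    folder: str = "INBOX"
    imap_id: Optional[int] = None
    body: Optional[str] = None  # None means headers-only


class EmailCache:
    """In-memory, per-session store keyed by Message-ID."""

    def __init__(self) -> None:
        self._entries: Dict[str, CachedEmail] = {}

    def get(self, message_id: str) -> Optional[CachedEmail]:
        return self._entries.get(message_id)

    def put(self, email: CachedEmail) -> None:
        self._entries[email.message_id] = email

    def on_body_fetch_failed(self, message_id: str) -> None:
        # A failed full-body fetch suggests the entry is stale: drop it.
        self._entries.pop(message_id, None)

    def on_moved(self, message_id: str, new_folder: str) -> None:
        entry = self._entries.get(message_id)
        if entry is not None:
            entry.folder = new_folder

    def on_deleted(self, message_id: str) -> None:
        self._entries.pop(message_id, None)
```

Nothing here is written to disk, which matches the rule that IMAP stays authoritative.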

Pool Building — order of operations

1. Fetch trigger email (full body) → primary, will be #1

2. Parse trigger's References header for full ancestor chain. Fall back to walking In-Reply-To if References absent/incomplete. Cap at ~15 ancestors.

3. Load bundle note (if one exists for this thread).

4. Scan bundle note text for Message-IDs (pattern: <...@...>). Add those to the candidate set.

5. For each candidate Message-ID (excluding trigger):

a. Check cache first

b. If not cached: HEADER search for the Message-ID, folder by folder, in this order: INBOX, Done, Sent, then any others

c. If found: fetch headers-only, add to cache

d. If not found: record as #NotAvailable

6. Upgrade the direct parent (the In-Reply-To target) to full body: fetch the body and update its cache entry.

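7. Sort the entire pool (trigger + all found ancestors) by date, newest first. The trigger stays at the top even if a candidate pulled in from the bundle note has a later date.

8. Assign Quick-IDs in that order: #1 = trigger, then #2, #3, and so on.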

9. Post-process bundle note text: replace raw Message-IDs (<...@...>) with their assigned Quick-IDs (#N). Unfindable ones become #NotAvailable.
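
Steps 2, 4 and the "excluding trigger" rule in step 5 can be sketched as follows. This is only an illustration: MESSAGE_ID_RE, ancestor_ids and candidate_ids are hypothetical names, the regex is one possible reading of the <...@...> pattern, and keeping the 15 ancestors nearest the trigger is an assumption about how the cap is applied. Walking In-Reply-To further back than the direct parent needs IMAP fetches and is left out.

```python
import re
from typing import List, Optional

# One possible reading of the "<...@...>" pattern (an assumption).
MESSAGE_ID_RE = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")
MAX_ANCESTORS = 15


def ancestor_ids(references: Optional[str], in_reply_to: Optional[str]) -> List[str]:
    """Ancestor Message-IDs, oldest first, as References lists them."""
    ids = MESSAGE_ID_RE.findall(references or "")
    parent = MESSAGE_ID_RE.search(in_reply_to or "")
    # References may be absent or incomplete; make sure the direct parent is in.
    if parent and parent.group(0) not in ids:
        ids.append(parent.group(0))
    # Keep the ancestors closest to the trigger when the chain is long.
    return ids[-MAX_ANCESTORS:]


def candidate_ids(trigger_id: str, references: Optional[str],
                  in_reply_to: Optional[str], bundle_note: Optional[str]) -> List[str]:
    """Ancestors plus IDs mentioned in the bundle note, de-duplicated, trigger excluded."""
    seen = {trigger_id}
    out: List[str] = []
    for mid in ancestor_ids(references, in_reply_to) + MESSAGE_ID_RE.findall(bundle_note or ""):
        if mid not in seen:
            seen.add(mid)
            out.append(mid)
    return out
```

Note that a bundle note containing an address written as Name <user@host> would also match this pattern; such false candidates would simply end up as #NotAvailable after the lookup.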

Gathered emails (fetched during the iteration loop) extend the numbering: next available #N.
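
The sort and numbering in steps 7 and 8, plus the extension for gathered emails, might look like this. PoolEntry, assign_quick_ids and extend_quick_ids are hypothetical names; the sketch assumes all dates are comparable (all timezone-aware or all naive) and pins the trigger to #1 regardless of date.

```python
from datetime import datetime
from typing import Dict, List, NamedTuple


class PoolEntry(NamedTuple):
    message_id: str
    date: datetime


def assign_quick_ids(trigger: PoolEntry, ancestors: List[PoolEntry]) -> Dict[str, int]:
    """Map Message-ID -> Quick-ID number: trigger first, then newest to oldest."""
    ordered = [trigger] + sorted(ancestors, key=lambda e: e.date, reverse=True)
    return {e.message_id: n for n, e in enumerate(ordered, start=1)}


def extend_quick_ids(quick_ids: Dict[str, int], message_id: str) -> int:
    """Give a gathered email the next free number, or return its existing one."""
    if message_id in quick_ids:
        return quick_ids[message_id]
    n = max(quick_ids.values(), default=0) + 1
    quick_ids[message_id] = n
    return n
```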

What the LLM sees

〶 Email #1
From: ...  To: ...  Date: ...  Subject: ...

[full body]

〶 Thread context
#2  From: ...  To: ...  Date: ...  Subject: ...
    [full body — In-Reply-To / direct parent]

#3  From: ...  To: ...  Date: ...  Subject: ...  [headers only]
#4  [not available]

The trigger is always #1. Ancestors follow in newest-first order. Section headers use 〶, not ===.
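
A rough sketch of a renderer that produces the layout above. Shown and render_pool are hypothetical names, and the header line is assumed to arrive pre-formatted; the real call_llm presentation code may be organised quite differently.

```python
from typing import List, NamedTuple, Optional


class Shown(NamedTuple):
    quick_id: int
    headers: str            # pre-formatted "From: ...  To: ..." line
    body: Optional[str]     # None means headers-only
    available: bool = True  # False means the email could not be found


def render_pool(entries: List[Shown]) -> str:
    """Render the trigger (first entry) and then the thread context."""
    trigger, rest = entries[0], entries[1:]
    lines = [f"〶 Email #{trigger.quick_id}", trigger.headers, "", trigger.body or ""]
    if rest:
        lines += ["", "〶 Thread context"]
    for e in rest:
        if not e.available:
            lines.append(f"#{e.quick_id}  [not available]")
        elif e.body is None:
            lines.append(f"#{e.quick_id}  {e.headers}  [headers only]")
        else:
            lines.append(f"#{e.quick_id}  {e.headers}")
            # Indent the body under its header line.
            lines += ["    " + ln for ln in e.body.splitlines()]
    return "\n".join(lines)
```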

add_emails with Quick-IDs

The LLM may set message_id to a Quick-ID (e.g. #3) in add_emails requests. The gather phase resolves it:

• If #3 is in the pool but headers-only: fetch full body, upgrade that cache entry in-place. Same Quick-ID, no duplicate.

• If #3 is not in pool: fetch as new email, append to gathered pool, assign next Quick-ID.

The existing resolve_email_ref already handles #N — minimal extension needed.
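
The two cases can be sketched like this. Pool, resolve_add_email and the fetch_body callable are hypothetical, not the existing resolve_email_ref API. The sketch also makes one interpretive assumption: a reference that is not in the pool arrives as a raw Message-ID, since an unknown #N carries nothing to fetch by, so that case returns None.

```python
import re
from typing import Callable, Dict, Optional

QUICK_ID_RE = re.compile(r"#(\d+)")


class Pool:
    """Quick-ID -> Message-ID mapping plus the bodies fetched so far."""

    def __init__(self) -> None:
        self.by_quick_id: Dict[int, str] = {}
        self.bodies: Dict[str, Optional[str]] = {}  # None means headers-only

    def add(self, message_id: str, body: Optional[str] = None) -> int:
        n = len(self.by_quick_id) + 1  # next available Quick-ID
        self.by_quick_id[n] = message_id
        self.bodies[message_id] = body
        return n


def resolve_add_email(ref: str, pool: Pool,
                      fetch_body: Callable[[str], str]) -> Optional[int]:
    """Return the Quick-ID for ref, fetching the full body if it is missing."""
    m = QUICK_ID_RE.fullmatch(ref.strip())
    if m:
        n = int(m.group(1))
        message_id = pool.by_quick_id.get(n)
        if message_id is None:
            return None  # unknown Quick-ID: nothing to fetch by
    else:
        message_id = ref.strip()
        n = next((k for k, v in pool.by_quick_id.items() if v == message_id), 0)
        if n == 0:
            # Not in the pool: fetch as a new email and append it.
            return pool.add(message_id, fetch_body(message_id))
    if pool.bodies[message_id] is None:
        # Headers-only entry: upgrade in place, same Quick-ID, no duplicate.
        pool.bodies[message_id] = fetch_body(message_id)
    return n
```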

Thread Summary (summarising phase)

The LLM is encouraged (via the summarising phase note) to write a thread summary into the bundle note. Storage format uses raw Message-IDs (stable across sessions), dates, and From/To — NOT Quick-IDs (which are ephemeral).

Example stored summary: 'On 2026-03-01, <msg1@host> (john@example.com) asked about X. On 2026-03-02, <msg2@host> (bob@example.com) replied with Y.'

On each new cycle, stored Message-IDs are pulled into the pool (step 4 above), assigned Quick-IDs, and the summary text is presented with Message-IDs replaced by Quick-IDs.

The summarising phase instructions must state: write Message-IDs verbatim in angle-bracket form so they can be parsed and re-linked next cycle.
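
Re-linking a stored summary could be as small as the sketch below. relink_summary and the regex are assumptions for illustration; the Quick-ID mapping is whatever the pool-building step produced for this cycle.

```python
import re
from typing import Dict

# One possible reading of the "<...@...>" pattern (an assumption).
MESSAGE_ID_RE = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")


def relink_summary(text: str, quick_ids: Dict[str, int]) -> str:
    """Replace each raw Message-ID with its Quick-ID, or #NotAvailable."""
    def repl(match: "re.Match[str]") -> str:
        n = quick_ids.get(match.group(0))
        return f"#{n}" if n is not None else "#NotAvailable"
    return MESSAGE_ID_RE.sub(repl, text)
```

Bare addresses such as john@example.com are left alone because they lack the angle brackets, which is one reason the verbatim angle-bracket form matters.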

Section headers — 〶

Replace all === SECTION === markers in call_llm with 〶 Section name on its own line. Examples:

• 〶 Email #1

• 〶 Thread context

• 〶 Gathered notes

• 〶 Phase: triage

• 〶 Results from previous iteration
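
As an illustration, a small helper could build the new headers, and a regex could convert any text still carrying the old markers. section_header and convert_headers are hypothetical, and the regex assumes the old markers sit alone on a line in the form === NAME ===.

```python
import re

# Old-style marker alone on its line, e.g. "=== Thread context ===".
OLD_HEADER_RE = re.compile(r"^=+[ \t]*([^=\n]+?)[ \t]*=+$", re.MULTILINE)


def section_header(name: str) -> str:
    """Build a new-style header line."""
    return f"〶 {name}"


def convert_headers(text: str) -> str:
    """Rewrite every old-style marker in text to the new style."""
    return OLD_HEADER_RE.sub(lambda m: section_header(m.group(1)), text)
```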

Code changes required

1. EmailCache class — new class in orchestrator.py (or email_cache.py)

2. fetch_thread_chain(client, trigger, cache, folder_list, bundle_note) — builds pool, returns sorted list

3. process_email_message() — build pool before iteration loop; pass to call_llm and execute_actions

4. call_llm() — updated presentation; 〶 headers; thread context section

5. execute_actions() — already handles #N Quick-IDs; no change if pool is passed correctly

6. Gather phase handler — upgrade-in-place logic for Quick-ID add_emails requests

7. Move/delete operations — update cache folder / remove entry

8. summarising phase note — update to require Message-ID format and encourage thread summary
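
For item 2, the cache-first lookup at the heart of fetch_thread_chain might be structured as below. This is a sketch under stated assumptions: lookup_candidates is a hypothetical helper, the cache is shown as a plain dict, and find_in_folder stands in for whatever IMAP wrapper performs the HEADER search and headers-only fetch.

```python
from typing import Callable, Dict, List, Optional

Headers = Dict[str, str]


def lookup_candidates(
    candidates: List[str],
    cache: Dict[str, Headers],
    folder_list: List[str],
    find_in_folder: Callable[[str, str], Optional[Headers]],
) -> Dict[str, Optional[Headers]]:
    """Resolve each candidate Message-ID; None marks it as not available."""
    found: Dict[str, Optional[Headers]] = {}
    for mid in candidates:
        if mid in cache:
            found[mid] = cache[mid]  # cache hit: no IMAP round trip
            continue
        hit: Optional[Headers] = None
        for folder in folder_list:  # searched in the given order
            hit = find_in_folder(folder, mid)
            if hit is not None:
                break  # stop at the first folder that has it
        if hit is not None:
            cache[mid] = hit
        found[mid] = hit
    return found
```

Passing the search in as a callable keeps the sketch independent of the IMAP client and makes the ordering logic easy to test with a fake.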

Not yet done

All of the above is designed but not implemented. Implement in a branch. The orchestrator.py changes are significant — read the full file before editing.

version: 2
created: 2026-03-07