PP Rewrite — Claude vs Gemini Today

Parent: pp-rewrite

Context Window

Gemini wins on paper: 1M–2M tokens vs Claude's 200K (1M in beta on some tiers). For a Perl codebase of non-trivial size, this matters: the whole codebase plus framework docs plus the target style guide can fit in one Gemini context.

Caveat rarely surfaced in procurement discussions: context utilisation degrades with length for every current model. Both Claude and Gemini show measurable quality drops past ~200K tokens on tasks requiring synthesis across the whole context (not retrieval). Gemini's nominal advantage is partly notional for translation tasks needing genuine reasoning over everything in context.
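One practical consequence is to budget context per task rather than filling the window. The sketch below is illustrative only: it assumes a rough 4-characters-per-token heuristic (real tokenizers differ, and Perl may tokenize worse than prose), treats the ~200K figure above as a soft synthesis budget, and uses hypothetical names (`estimate_tokens`, `plan_batches`) that belong to no vendor SDK.

```python
# Rough context budgeting for synthesis-heavy prompts.
# Assumption: ~4 characters per token. This is a heuristic, not a tokenizer.
CHARS_PER_TOKEN = 4
# Assumption: treat the ~200K degradation point as a soft per-prompt budget.
SYNTHESIS_BUDGET = 200_000


def estimate_tokens(text: str) -> int:
    """Very rough token estimate from character count."""
    return len(text) // CHARS_PER_TOKEN


def plan_batches(sizes: dict[str, int], budget: int = SYNTHESIS_BUDGET) -> list[list[str]]:
    """Greedily pack files (name -> estimated tokens) into batches under budget.

    Files are taken in name order so modules in the same directory stay together.
    A single file larger than the budget still gets its own batch.
    """
    batches: list[list[str]] = []
    current: list[str] = []
    used = 0
    for name, tokens in sorted(sizes.items()):
        if current and used + tokens > budget:
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += tokens
    if current:
        batches.append(current)
    return batches
```

In use, you would feed it per-file estimates from the Perl tree and send each batch as its own spec-extraction prompt, keeping shared context (style guide, framework notes) small enough to repeat in every batch.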

Code Quality

Honest assessment: Claude (Opus 4.7, Sonnet 4.6) tends to produce more idiomatic, more maintainable code, with a better instinct for "what would a competent Python developer write". Gemini 2.5 Pro is strong and has closed much of the gap, particularly on large-scale refactoring where the larger context is genuinely useful. Claude has the edge for the Python generation phase today, but it's not a chasm.

For the spec-writing phase, both are competent. Iteration speed and how easily the model accepts corrections matter more than raw capability.

Agentic Tooling

Claude Code is more mature than Gemini's equivalents (Jules, Gemini CLI). For an iterative test-driven porting workflow, this matters more than raw model capability — the difference between "AI suggests code" and "AI runs tests, sees failures, fixes them, commits" compounds over a long project.
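The loop being described can be sketched in a few lines. This is a schematic under stated assumptions, not how Claude Code or Gemini CLI is implemented: `run_tests`, `propose_fix`, and `commit` are hypothetical callables you would wire to your own test runner, model call, and version control.

```python
from typing import Callable


def port_until_green(
    run_tests: Callable[[], list[str]],        # hypothetical: returns names of failing tests
    propose_fix: Callable[[list[str]], None],  # hypothetical: asks the model to patch the code
    commit: Callable[[], None],                # hypothetical: records the passing state
    max_rounds: int = 5,
) -> bool:
    """Run tests, feed failures back to the model, and commit once everything passes.

    Returns True if a green run was committed, False if the round limit was hit
    and a human needs to step in.
    """
    for _ in range(max_rounds):
        failures = run_tests()
        if not failures:
            commit()
            return True
        propose_fix(failures)
    return False
```

The round limit is the important design choice: it bounds cost and makes "the agent is stuck" an explicit outcome that goes to a human instead of looping.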

Realistic Success Rates Per Phase

Spec extraction: 80–90% useful output. Plan to iterate on Mojolicious-specific idioms, especially helpers, plugins, and the rendering pipeline.

Spec → Python: 70–80% useful output if the spec is good. The remaining 20–30% is integration plumbing, async correctness, dependency choices, and "looked right but wasn't" bugs.

End-to-end without humans in the loop: still a bad idea. Not because AI cannot write the code, but because nobody can verify a 50KLOC port without reading it.

Practical Recommendation Given Gemini Mandate

Don't fight the procurement constraint. Use Claude (or whichever model iterates fastest) to draft the semantic spec; refine against the Perl source. Generate Python with Gemini per mandate. The spec is human-readable and model-agnostic — it's the durable artefact and the governance document.

If you re-do the generation phase in 12 months with a better model, the spec still applies.

Process quality (good specs, characterisation tests, strangler-fig rollout) is roughly 80% of the outcome; model choice is roughly 20%.
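As an illustration of the characterisation-test idea: record what the existing Perl code actually returns for a set of inputs, then require the Python port to reproduce those outputs exactly. The sketch below assumes the recorded outputs are already available as a plain mapping; `characterise` and its parameters are hypothetical names, and how you capture the Perl side (HTTP replay, a harness script) is left open.

```python
from typing import Callable


def characterise(
    golden: dict[str, str],     # recorded legacy behaviour: input -> observed output
    port: Callable[[str], str], # the new Python implementation under test
) -> dict[str, tuple[str, str]]:
    """Return {input: (expected, actual)} for every case where the port diverges.

    An empty result means the port reproduces all recorded behaviour,
    including any legacy quirks that were captured in the golden data.
    """
    mismatches: dict[str, tuple[str, str]] = {}
    for given, expected in golden.items():
        actual = port(given)
        if actual != expected:
            mismatches[given] = (expected, actual)
    return mismatches
```

Because the golden data describes what the old system does rather than what it should do, any intentional behaviour change in the port shows up as a mismatch and has to be approved explicitly.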

version 1  ·  created 2026-05-07  ·  updated 2026-05-07