Recently I completed a project that felt, at times, like performing open-heart surgery on a living organism — with an AI as my surgical partner.
The organism was pwsafe, the long-established open-source password manager originally designed by Bruce Schneier. It’s mature, stable, and deeply trusted. And like many long-lived C++ codebases, it’s careful, layered, and not something you casually ‘just modify’.
The goal was simple to describe and surprisingly complex to achieve: make it possible to open a remote password database over WebDAV just by typing a URL — no fragile OS-level filesystem mounts required.
Historically, if you wanted your password database stored remotely on a WebDAV server, you had to mount it at the operating system level. On Linux, for example, that might mean mounting something like /z → https://webdav.example.com/ at login.
If the mount wasn’t active, pwsafe simply failed. Locking was unreliable. The setup wasn’t portable across platforms. And from a user experience perspective, it felt like a workaround rather than a feature.
What I wanted instead was first-class support: File → Open URL…, type https://…, done.
The breakthrough came from noticing something deceptively simple: the entire codebase ultimately opens files through just two functions — pws_os::FOpen and FClose.
If I could intercept those two calls and detect when the ‘filename’ was actually a URL, I could transparently route the operation through a transport layer instead of the local filesystem.
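That choke-point routing can be sketched roughly as follows. This is a minimal illustration, assuming a scheme-prefix check; the names `IsTransportURL` and `FOpenRouted` are mine, not pwsafe's actual symbols:

```cpp
// Sketch: divert URL "filenames" to a transport layer, pass ordinary
// paths straight through to the C library. Illustrative names only.
#include <cassert>
#include <cstdio>
#include <string>

// True if the path looks like a transport URL (scheme://...) rather
// than a local filesystem path.
static bool IsTransportURL(const std::string &path) {
  auto pos = path.find("://");
  if (pos == std::string::npos || pos == 0) return false;
  for (size_t i = 0; i < pos; ++i) {
    char c = path[i];
    bool ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
              (c >= '0' && c <= '9') || c == '+' || c == '-' || c == '.';
    if (!ok) return false;        // a real path separator ends the scheme
  }
  return true;
}

// Wrapper in the spirit of pws_os::FOpen: URLs go to the plugin,
// everything else behaves exactly as before.
std::FILE *FOpenRouted(const std::string &path, const char *mode) {
  if (IsTransportURL(path)) {
    // A transport plugin would fetch the remote file into a temporary
    // stream here; omitted in this sketch.
    return nullptr;
  }
  return std::fopen(path.c_str(), mode);
}
```

The key property is that the local-path branch is byte-for-byte the old behaviour, so existing callers cannot observe the change.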
The result was a small, versioned plugin ABI and a dynamically loaded .so transport plugin. If the user provides a URL, the plugin handles fetching, storing, existence checks, and — critically — locking. If it’s a normal path, nothing changes.
The core and UI layers of pwsafe remain untouched. From their perspective, they are still just opening files.
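A minimal version of such a versioned plugin ABI might look like the sketch below. The struct layout, version constant, and entry-point name `pws_transport_get_v1` are my own illustrations, not the actual pwsafe ABI:

```cpp
// Sketch of a small versioned plugin ABI loaded via dlopen.
// All names and layouts here are illustrative assumptions.
#include <dlfcn.h>
#include <cstdint>

extern "C" {
// Plain C function pointers keep the ABI stable across compilers
// and C++ standard-library versions.
struct TransportPluginV1 {
  uint32_t abi_version;   // must equal TRANSPORT_ABI_V1
  int (*fetch)(const char *url, const char *local_tmp);
  int (*store)(const char *url, const char *local_tmp);
  int (*exists)(const char *url);
  int (*lock)(const char *url);
  int (*unlock)(const char *url);
};
}

constexpr uint32_t TRANSPORT_ABI_V1 = 1;

// Load a plugin and verify the ABI version before using any hook.
const TransportPluginV1 *LoadTransportPlugin(const char *so_path) {
  void *h = dlopen(so_path, RTLD_NOW | RTLD_LOCAL);
  if (!h) return nullptr;
  using GetFn = const TransportPluginV1 *(*)();
  auto get = reinterpret_cast<GetFn>(dlsym(h, "pws_transport_get_v1"));
  if (!get) { dlclose(h); return nullptr; }
  const TransportPluginV1 *p = get();
  if (!p || p->abi_version != TRANSPORT_ABI_V1) { dlclose(h); return nullptr; }
  return p;
}
```

Checking the version before touching any hook is what lets the host refuse an old or future plugin cleanly instead of crashing on a mismatched layout.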
WebDAV supports server-side locking. That sounds straightforward until you remember that unlocking may need to happen during application shutdown, signal handling, or even a crash.
libcurl (used for the WebDAV implementation) is not async-signal-safe. But releasing a lock during shutdown must be reliable.
The solution was to fork a dedicated child process — a lock daemon — the first time a lock is acquired. That child process owns the lock token and communicates with the parent over a Unix socket. The parent only ever performs async-signal-safe writes to that socket. If the parent crashes or is killed, the kernel closes the socket, the child detects EOF, and releases all locks.
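The pattern can be sketched as follows. The command dispatch and libcurl calls are elided, and every name is illustrative rather than pwsafe's real code; the point is the EOF-based cleanup:

```cpp
// Sketch of the lock-daemon pattern: fork a child that owns the lock
// token, talk to it over a socketpair, rely on EOF for crash cleanup.
#include <cassert>
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

// Returns the parent's end of the socket, or -1 on failure. The parent
// only ever write()s to this fd, which is async-signal-safe. When the
// parent exits or crashes, the kernel closes its end, the child's
// read() returns 0 (EOF), and the child releases all held locks.
int SpawnLockDaemon(pid_t *child_out) {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
  pid_t pid = fork();
  if (pid < 0) { close(sv[0]); close(sv[1]); return -1; }
  if (pid == 0) {                 // child: the lock daemon
    close(sv[0]);                 // keep only our end of the pair
    char cmd[64];
    for (;;) {
      ssize_t n = read(sv[1], cmd, sizeof cmd);
      if (n <= 0) break;          // EOF: parent is gone, release locks
      // ... dispatch LOCK/UNLOCK commands (e.g. via libcurl) here ...
    }
    _exit(0);                     // locks released, exit quietly
  }
  close(sv[1]);                   // parent keeps sv[0]
  *child_out = pid;
  return sv[0];
}
```

The division of labour matters: all the non-signal-safe work (libcurl, allocation, logging) lives in the child, while the parent's entire shutdown obligation is a `write()` or simply dying.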
It was the most subtle and fragile part of the design. And it’s where AI collaboration became genuinely interesting.
Every line of new C++ in this project was written collaboratively with Claude Code, Anthropic’s AI coding assistant.
This wasn’t autocomplete. It was architectural dialogue. We designed the plugin ABI together. We iterated on the lock daemon protocol. We reasoned through fork semantics, file descriptor inheritance, TOCTOU races, and RFC compliance.
There were frustrations. Sometimes the model would confidently propose something subtly wrong — especially around Unix process behaviour. You still have to think. In fact, you have to think harder, because the AI can sound convincing.
But the speed of iteration was extraordinary. I could explore design alternatives conversationally. Refactorings that might have taken an afternoon happened in minutes. It felt less like typing code and more like shaping a design in real time.
After the implementation was complete, I switched roles: AI as builder became AI as adversary.
Using my own Python tool, ‘ask’, I bundled the relevant source files and sent them to OpenAI models (o3 and gpt-5.2) with a security audit prompt. I asked them to be ruthless.
Between them, they found 35 issues before deduplication — including four Critical vulnerabilities.
One was a newline injection vulnerability in the lock daemon’s text-based IPC protocol. Another was a classic TOCTOU race in the plugin loader. A third was a libcurl configuration that allowed cross-protocol redirects, which a malicious server could have exploited to write to local files. None were theoretical. All were real.
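The newline-injection class of bug has a simple structural fix: validate every field before it enters a line-oriented protocol. A sketch, with `SafeProtocolField` as a hypothetical name:

```cpp
// Sketch: reject control bytes before a value is embedded in a
// newline-delimited IPC command. Illustrative, not pwsafe's code.
#include <cassert>
#include <string>

// False if the field contains any byte that could break line framing
// (CR, LF, NUL, other control characters, or DEL).
bool SafeProtocolField(const std::string &s) {
  for (unsigned char c : s)
    if (c < 0x20 || c == 0x7f) return false;
  return true;
}
```

Rejecting rather than escaping keeps the daemon's parser trivial; length-prefixed binary framing would eliminate the bug class entirely, at the cost of a less debuggable protocol.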
They would have been embarrassing — and potentially dangerous — to ship.
The audits were impressive. The models systematically scanned code that a human reviewer might skim. They explained attack vectors clearly and suggested concrete fixes.
But they weren’t perfect. Both models independently missed a subtle file descriptor inheritance issue: without SOCK_CLOEXEC, a spawned GUI child process could accidentally keep the daemon’s socket alive, preventing lock release. I found that in a manual review.
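The fix is small once you see it: create the socketpair with `SOCK_CLOEXEC` so the descriptors never survive an `exec()`. A sketch, with `MakeDaemonSocketpair` as an illustrative name:

```cpp
// Sketch of the fix the audits missed: a CLOEXEC socketpair means a
// spawned child process cannot inherit the daemon's socket and keep
// the EOF (and therefore lock release) from ever arriving.
#include <cassert>
#include <fcntl.h>
#include <sys/socket.h>
#include <unistd.h>

int MakeDaemonSocketpair(int sv[2]) {
#ifdef SOCK_CLOEXEC
  // Atomic: the fds are close-on-exec from the moment they exist.
  return socketpair(AF_UNIX, SOCK_STREAM | SOCK_CLOEXEC, 0, sv);
#else
  // Fallback: set FD_CLOEXEC afterwards. This leaves a small window
  // against a concurrent fork+exec, which is why SOCK_CLOEXEC exists.
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
  fcntl(sv[0], F_SETFD, FD_CLOEXEC);
  fcntl(sv[1], F_SETFD, FD_CLOEXEC);
  return 0;
#endif
}
```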
The lesson was clear: AI is a powerful first-pass reviewer, not a final authority.
To reduce regression risk, I built four separate test suites: unit tests for URL parsing and scheme extraction, standalone tests for the plugin loader, live WebDAV tests against both a remote server and a local wsgidav instance, and lock lifecycle tests that simulate crashes.
The goal was containment: if something broke, I wanted to know exactly which layer was responsible.
So, did AI do the engineering for me? No. But it changed the nature of the work.
Claude Code accelerated architecture and implementation. The OpenAI models provided an inexpensive, high-signal security review. But judgement — especially around threat models, Unix semantics, and deployment realities — still required a human.
What surprised me most wasn’t that AI could write C++. It was that it could meaningfully collaborate on design decisions inside a mature, layered, security-sensitive codebase.
Today, you can open a remote pwsafe database simply by typing a URL. No OS mount. No fragile sidecar lock files. Proper WebDAV locking held safely across crashes.
Under the hood is a transport plugin system that can support future protocols, a hardened lock daemon, and code that has survived three independent audits.
The most interesting part of this project wasn’t WebDAV. It was the workflow:
AI as design partner → AI as code generator → AI as security auditor → Human as final integrator and sceptic.
That combination feels powerful.
We’re entering a phase where experienced engineers can move faster — not because the machine replaces them, but because it expands the surface area of what they can safely attempt.
This project would have been possible without AI. But it would have taken longer, and I’m less certain it would have been as thoroughly audited.
That feels like real leverage.