every line, accountable
Git stops at "who committed it". PaellaDoc keeps the chain: spec → agent conversation → code → e2e proof. Open any line months later, refactor without fear, onboard without ceremony. The thread doesn't break.
Five branches. A Claude conversation about architecture from three days ago. Prompts in three providers. Decisions in Slack. Specs in Notion. Sensitive context that doesn't want to live on anyone's cloud. PaellaDoc orchestrates everything that happens before — and after — the code itself. From the AI conversation that produced the line to the Playwright e2e video that proved it works. On your machine.
Categories don't die when products disappear. They die when they answer the wrong question.
The IDE optimizes one metric: friction between intent and code in the repo. Cursor is the culmination of that story, not its rupture. If your question is "how do I write code faster?" — Cursor is the best answer.
That stopped being the question.
You don't open your machine to write code. You open it to a Claude conversation from three days ago about architecture. A ChatGPT session with prompts now worth gold and lost. Five active branches. Notion docs. Linear tasks. Slack decisions that change today's code. Three LLM providers with different policies. Context too sensitive for anyone's cloud.
The bottleneck moved. It's not typing. It's coherence — between intent, context, decisions and models.
Three things IDE + AI structurally cannot solve
These aren't missing features. They're architecturally external to the IDE concept. No IDE will solve them, no matter how much AI gets bolted on.
PaellaDoc is what comes after.
CI runs the tests the agent wrote. PaellaDoc runs Playwright e2e against your live product in Docker, with the agent's code applied, and writes screenshots + video + trace to disk. No evidence, no green light. The agent doesn't decide. The gate does.
Everything on your disk, in plain SQLite. No telemetry, no cloud, no vendor cage. Bring any model — Claude today, Codex tomorrow, a local Llama next week. The brain changes; your context doesn't.
A year goes by. You open a story you barely remember. You see exactly who wrote which line, why, what spec asked for it, what test proved it works, and what depends on it. No archaeology. No "who shipped this?" No abandoned PRs from agents nobody can debrief.
# your fleet
YOU (1 human)
├── orchestrator · paelladoc-ado · local
│   ├── project: acme/portfolio-v3
│   │   ├── worktree-01 → claude · story-014 · ✓
│   │   ├── worktree-02 → codex · story-015 · ↻
│   │   └── …16 more
│   ├── project: paelladoc/ado-core
│   │   ├── worktree-01 → codex · refactor · ✓
│   │   └── …23 more
│   └── project: cabin/local-llm-bench
│       └── worktree-01 → llama-local · ✓ (offline)
└── context_store · sqlite · ~/.paelladoc/
    ├── chats.db      2.4 GB
    ├── decisions.db  180 MB
    └── artifacts/    11 GB
Every agent runs in its own git worktree. Its own branch, its own filesystem view, its own conversation. Cleanup is automatic. Merge wars stop being a thing.
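In plain git terms, "one worktree per agent" amounts to something like the sketch below. This is illustrative only: the branch naming, directory layout, and cleanup policy are assumptions, not PaellaDoc's actual scheme.

```python
import os
import subprocess
import tempfile

def spawn_agent_worktree(repo: str, story: str) -> str:
    """Create an isolated git worktree + branch for one agent task.
    Branch/dir naming here is hypothetical, not PaellaDoc's scheme."""
    branch = f"agent/{story}"
    path = os.path.join(tempfile.mkdtemp(), story)
    subprocess.run(
        ["git", "-C", repo, "worktree", "add", "-b", branch, path],
        check=True,
    )
    return path  # the agent gets its own filesystem view and branch

def cleanup_worktree(repo: str, path: str) -> None:
    """Remove the worktree once the story merges or is abandoned."""
    subprocess.run(
        ["git", "-C", repo, "worktree", "remove", "--force", path],
        check=True,
    )
```

Because each agent edits files under its own worktree path on its own branch, two agents can touch the same file without ever seeing each other's half-finished state; conflicts surface only at merge time, per story.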
You are not the bottleneck. Queue 50 user stories before bed; wake up to 50 worktrees finished, replayed, and ready to review at your speed.
The orchestrator is on your machine. The context is on your machine. Each AI developer calls out to whichever brain you choose — or to a local model you self-host.
The real value of working with AI for months isn't the code it generates. It's everything you build along the way: your decisions, your edge cases, the way your domain actually works.
Most agentic tools quietly turn that into someone else's training data. Or they lock it inside a single frontier model — switch and you lose everything.
PaellaDoc ADO inverts it. Your context lives in plain SQLite, on your disk, in formats you can read. Point any model at it — Claude today, Codex tomorrow, a local Llama from a cabin next week. The brain changes; your knowledge doesn't.
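"Plain SQLite, in formats you can read" means any stock client can open the store. A minimal sketch, assuming only that the file is ordinary SQLite; the actual table names and schema inside `chats.db` are not specified here.

```python
import sqlite3

def inspect_context(db_path: str) -> list[str]:
    """List the tables in a PaellaDoc-style context store.
    Works on any ordinary SQLite file; no vendor SDK required."""
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]
    finally:
        con.close()
```

The point is portability: whatever model you point at the store next week reads the same rows the last one wrote.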
Most agent stacks let the agent declare its own work done. The PR lands. Three weeks later, your test suite catches the lie — if you're lucky. PaellaDoc inverts it: every acceptance criterion runs as a real Playwright e2e against your actual product, in Docker, with the agent's code applied. Screenshots, video, trace — saved to disk before status flips to done. No evidence, no green light.
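The "no evidence, no green light" rule reduces to a file check: status flips only when the e2e artifacts actually exist on disk. A minimal sketch; the artifact filenames below are hypothetical, since the source only says screenshots, video, and trace are saved.

```python
from pathlib import Path

# Hypothetical artifact names; the source only says
# "screenshots, video, trace" are written to disk.
REQUIRED_EVIDENCE = ("screenshot.png", "video.webm", "trace.zip")

def golden_gate(evidence_dir: str) -> str:
    """Flip a story to 'done' only if e2e evidence exists on disk.
    The agent's own claim of success is irrelevant; files decide."""
    root = Path(evidence_dir)
    missing = [f for f in REQUIRED_EVIDENCE if not (root / f).is_file()]
    if missing:
        return f"blocked: missing {missing}"
    return "done"
```

The inversion is structural: the agent can write "done" into its transcript all it likes, but the gate reads the filesystem, not the transcript.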
done when evidence exists and passes

Air-gapped clients. On-prem mandates. Regulated industries that forbid frontier APIs. Or just a cabin off-grid. PaellaDoc ADO keeps the product methodology intact (PRD, epics, stories, acceptance criteria, golden-gate validation) and routes each story to whatever brain your context allows. Same hierarchy. Same evidence trail. Same shipping bar.
Day-0 frontier MoE, running locally. The orchestrator drives it like any other backend — same PRD, same golden gate, same audit trail.
Most teams default to frontier-everything because picking the right model is friction. PaellaDoc inverts it: every acceptance criterion declares its bar — and a per-AC router auto-picks the cheapest model that still passes the golden gate. You stop being the bottleneck. You stop being the wallet.
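The routing logic described above can be sketched in a few lines: pick the cheapest model that still clears the acceptance criterion's bar. The numbers and the single "capability" scalar are assumptions for illustration; the real router presumably scores against golden-gate pass history, not a hand-set integer.

```python
from dataclasses import dataclass

@dataclass
class Model:
    name: str
    cost_per_story: float  # illustrative units
    capability: int        # higher = stronger (assumed scalar)

def route(ac_bar: int, fleet: list[Model]) -> Model:
    """Cheapest model whose capability still clears this AC's bar."""
    eligible = [m for m in fleet if m.capability >= ac_bar]
    if not eligible:
        raise ValueError("no model clears this bar; escalate to frontier")
    return min(eligible, key=lambda m: m.cost_per_story)
```

Routing per acceptance criterion, rather than per project, is what makes "frontier-everything" the exception instead of the default: a boilerplate AC goes to a cheap local model, a gnarly one to the expensive brain.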
~/.paelladoc/chats.db · $ paelladoc cost --report
PaellaDoc is open-core. The desktop orchestrator, context graph, memory ranking, router and execution engine are the product core.
The extension surface is open: plugin SDK, manifest schema, CLI adapters, MCP packs, validators and examples.
Build adapters for your favorite agents. Keep your context local. Don't wait for us to support your stack.
Wire any agent CLI — Claude Code, Codex, Gemini, Cursor, your own — into the orchestrator. The router calls them. The fleet runs them.
Bundle Model Context Protocol servers as plugins. Vector DBs, custom tools, internal APIs — exposed to the fleet, scoped per project.
Custom golden-gate checks. Type purity, license audit, perf budget — your bar, not ours.
Opinionated bootstraps. Next.js + Postgres, Rails + Sidekiq, your house template — not ours.
Reusable skill recipes. PR review, migration scaffolding, test-pyramid generation — packaged, versioned, shareable.
Boundary by design. Plugins cannot access raw chats, KG internals, embeddings, router scoring or secrets unless PaellaDoc grants an explicit local permission.
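At its simplest, a CLI adapter is a thin wrapper that feeds a story prompt to any agent binary and hands the output back to the orchestrator. A sketch of that general shape only; the actual plugin SDK, manifest schema, and adapter interface are not specified in this text.

```python
import subprocess

def run_agent_cli(argv: list[str], prompt: str, timeout: int = 600) -> str:
    """Minimal adapter shape: pipe a prompt to an agent CLI on stdin
    and capture stdout for the orchestrator. Illustrative only; the
    real SDK's interface may differ."""
    result = subprocess.run(
        argv, input=prompt, capture_output=True, text=True, timeout=timeout
    )
    if result.returncode != 0:
        raise RuntimeError(f"{argv[0]} failed: {result.stderr.strip()}")
    return result.stdout
```

Because the contract is just "prompt in, text out, nonzero exit means failure", swapping Claude Code for Codex, or for your own in-house binary, is an `argv` change, not a rewrite.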
Six side-products. One human. The orchestrator runs each as its own fleet, and the context never bleeds between them.
100 stories across a legacy migration. Queue them, assign brains, watch the tree turn green. The PM sees what every agent decided, not just what it shipped.
Owns the model weights. Owns the context. Codes from a cabin. PaellaDoc just routes the work.
Client code never leaves the laptop. Frontier APIs are opt-in per project. Audit trail in plain SQLite.
Same task, three brains, side-by-side diff. Keep the best output. Stop guessing which model is good at what.
“I'm one person. I run several products. I don't want to be a manager of AI developers — I want to be the architect, and let a fleet run.
I also don't want my entire mental model — every conversation, every decision — to live inside a vendor's cloud where it disappears the day I switch tools.
PaellaDoc was built for the new paradigm: context- and spec-driven, above the code itself. The architect writes the spec; the system ships the code. The thesis: 5× productivity. 277★ in April 2025, when nobody was talking about this yet.
Now I'm back. PaellaDoc ADO — chasing 100× for one human.”
free for personal use, forever · 100% local · no account · macOS (Apple Silicon)
team or enterprise? — DM @jlcases on X