Locally correct, globally incoherent: the real problem with AI-generated code

There’s a reading of AI-assisted development that’s become consensus, and it’s fatalistic: AI doesn’t understand architecture, it doesn’t “feel” software, it doesn’t make long-term decisions. And the argument stops there, as if this were an intrinsic limit of the model, something time and parameters will eventually fix. Or won’t.

That reading mistakes the symptom for the cause.

The model optimizes the next step because the next step is all we hand it

A model optimizes the next step because the next step is all we give it. Architecture is the exact opposite: deciding today against constraints that don’t exist yet, against the cost of maintaining something for years, against the conceptual coherence of a system that isn’t fully written. It’s reasoning about the future. And the future almost never travels inside the context we give the machine.

Every session starts blank. The decisions you made yesterday (why this layer exists, what invariant this module protects, what was tried and dropped) live in your head, not in the material the model sees when it touches the code again. We ask for a local step, without the blueprint of the building, and we’re surprised the result is only locally correct.

It’s literally what we asked for.

The incoherence isn’t inside the model

Global incoherence doesn’t emerge because the model is dumb. It emerges because nobody is managing what the model sees each time it decides. The problem isn’t inside the model. It’s the missing layer between our intentions and its context.

Without the layer, an intention turns into a series of local steps, and the code comes out locally correct but globally incoherent. With the layer, the architecture, the spec, and the definition of correct travel with every change and get verified.

Coherence was never a property of intelligence. Think about it. You know brilliant, incoherent people: the genius who finishes nothing, the brilliant founder who pivots every Monday. Locally correct, globally incoherent.

It’s a property of process. You don’t get a coherent system by waiting for the model to feel the architecture. You get it by making the architectural decision, the specification, and the definition of what “correct” means travel with every change, and verifying each step against it. Coherence is imposed, not invoked.
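To make “travel with every change” concrete, here is a minimal sketch of a change request that carries its global context with it. Every name in it (`Decision`, `ChangeRequest`, `build_context`) and the sample task are hypothetical, invented for illustration; this is not how any particular tool does it.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Decision:
    """One architectural decision, with the reason it was made."""
    title: str
    rationale: str


@dataclass
class ChangeRequest:
    """A local step plus the global context it must respect (hypothetical shape)."""
    task: str
    decisions: list[Decision] = field(default_factory=list)
    invariants: list[str] = field(default_factory=list)
    rejected: list[str] = field(default_factory=list)


def build_context(change: ChangeRequest) -> str:
    """Render the text the model would see for this change."""
    lines = [f"TASK: {change.task}", "", "DECISIONS ALREADY MADE:"]
    lines += [f"- {d.title} (why: {d.rationale})" for d in change.decisions]
    lines += ["", "INVARIANTS TO PRESERVE:"]
    lines += [f"- {inv}" for inv in change.invariants]
    lines += ["", "TRIED AND DROPPED:"]
    lines += [f"- {r}" for r in change.rejected]
    return "\n".join(lines)


# Illustrative data only: the task and the decisions are made up.
change = ChangeRequest(
    task="Add retry to the payment client",
    decisions=[Decision("Domain layer has no I/O", "keeps business rules testable")],
    invariants=["An order is never charged twice"],
    rejected=["Retry inside the domain layer"],
)
context = build_context(change)
print(context)
```

The point of the shape is that the task never travels alone: the decisions, the invariants, and the dead ends are part of the request, not part of someone’s memory.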

The industry’s incentive points at more, not better

Worth naming the bias. Cloud AI monetizes volume, not elegance. Nobody bills for deleting 20,000 unnecessary lines; that’s still an almost artisanal craft. But there’s a whole industry billing for generating them. The entire incentive points at more generation, more iteration, more tokens, more consumption. Vibe coding isn’t popular because it works better; it’s popular because it’s perfectly aligned with who gets paid.

Worth remembering when someone sells you that the machine already handles the architecture.

Junior vs. senior is a tool multiplier, not a person multiplier

A junior produces more noise when the constraints don’t live in the system, when nothing forces them to respect the decisions already made. A senior produces more value because they carry those constraints in their head and apply them without thinking. But that difference is exactly what can be partly externalized: when you make the architecture a first-class artifact, and I promise you can, suddenly the junior, or the agent, also produces coherence.

Not by talent. By scaffolding.
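One way to picture architecture as a first-class artifact is a layering rule written down as data and checked against every piece of code, whoever wrote it. The sketch below uses Python’s standard `ast` module; the layer names and the `FORBIDDEN_IMPORTS` table are assumptions made up for the example, not a real project’s rules.

```python
import ast

# Hypothetical rule: top-level packages each layer must NOT import.
FORBIDDEN_IMPORTS = {
    "domain": {"infrastructure", "api"},
    "infrastructure": {"api"},
}


def imported_packages(source: str) -> set[str]:
    """Return the top-level package names imported by a piece of source."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 means an absolute import; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return found


def layer_violations(layer: str, source: str) -> list[str]:
    """List the forbidden packages that code in `layer` tries to import."""
    forbidden = FORBIDDEN_IMPORTS.get(layer, set())
    return sorted(imported_packages(source) & forbidden)


# A generated change that is locally fine but breaks the layering.
generated = "from infrastructure.db import Session\nimport json\n"
print(layer_violations("domain", generated))  # prints ['infrastructure']
```

The senior applies this rule from memory. Once it lives in the system, the junior and the agent hit the same wall.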

There is a part, yes, that doesn’t externalize: software is felt. Anyone who has built engines or gameplay for years knows there are decisions that don’t come from the verbal logic of a prompt, but from intuition accumulated over thousands of hours watching systems fail. Intuition is compressed experience. Part of that compression can become explicit rules, constraints, criteria. Part can’t, and that’s why the human stays at the center of the loop.

Denying it is the snake-oil seller’s error. But surrendering to it, saying that because you can’t capture everything you can’t capture anything, is the opposite error, and just as wrong.

What actually worries me

What worries me isn’t that AI writes code. It’s that fewer and fewer people learn to understand complex systems. When these projects, full of functions that exist and nobody knows why and abstractions nobody dares to touch, need serious maintenance, we’ll need people who can read code, judge coherence, simplify, and decide. And fewer and fewer of those people are being trained, because everyone is learning to write prompts instead of learning to understand systems.

The answer isn’t banning AI. It isn’t praying for someone who can read the 100,000 lines either. It’s making the system legible by design: every decision leaves evidence, every criterion is verifiable, understanding stops depending on a rare expert and becomes forced by the very process of building.

The missing layer

I’ve spent more than a year building exactly that layer: PaellaDoc. It forces the blueprint to travel with every change, and lets nothing reach “done” without proving it against the criteria set before a line was written. That’s the move from spec-driven to spec-gated: the specification isn’t a document that rots in the repo; it’s the gate that decides.

This isn’t an article of faith. When you measure it, a green build is not a correct feature: even the best model, used raw, ships a real bug one time in three on anything non-trivial, and not deterministically. The spec up front and the execution gate close that gap. And one level up, building a product means installing a behavior, not accumulating surface area: the same idea, that coherence is imposed, applied to what to build instead of how to build it.
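As a toy illustration of the spec-gated idea (not PaellaDoc’s actual implementation), the sketch below writes the acceptance criteria before any implementation exists and lets a candidate reach “done” only when every criterion passes. The discount feature, the `SPEC` list, and the `gate` function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# A candidate implementation: (price_cents, percent) -> discounted price.
Candidate = Callable[[int, int], int]


@dataclass(frozen=True)
class Criterion:
    name: str
    check: Callable[[Candidate], bool]


# Written BEFORE any implementation exists (hypothetical feature).
SPEC = [
    Criterion("0% leaves the price unchanged", lambda f: f(1000, 0) == 1000),
    Criterion("100% brings the price to zero", lambda f: f(1000, 100) == 0),
    Criterion("price never goes negative", lambda f: f(1000, 150) >= 0),
]


def gate(candidate: Candidate, spec: list[Criterion]) -> list[str]:
    """Return the names of failed criteria; an empty list means 'done'."""
    failed = []
    for criterion in spec:
        try:
            ok = criterion.check(candidate)
        except Exception:
            ok = False  # a crash counts as a failed criterion
        if not ok:
            failed.append(criterion.name)
    return failed


def naive_discount(price_cents: int, percent: int) -> int:
    # Runs and looks right, but never clamps the percentage.
    return price_cents - price_cents * percent // 100


def clamped_discount(price_cents: int, percent: int) -> int:
    percent = max(0, min(percent, 100))
    return price_cents - price_cents * percent // 100


print(gate(naive_discount, SPEC))    # prints ['price never goes negative']
print(gate(clamped_discount, SPEC))  # prints []
```

Both versions run without errors. Only the gate, checked against criteria fixed in advance, tells them apart.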

The human stays at the center. What changes is that they stop holding coherence together with memory and start imposing it with process.

Frequently asked questions

What does “locally correct, globally incoherent” mean?

It describes code where each individual change is right on its own terms but the system as a whole stops making sense. Every function does what it was asked to do, yet the pieces contradict each other’s assumptions, duplicate concepts, and drift from the architecture. The model optimized the next step, which is exactly what it was handed, so it produced a locally correct step with no view of the whole it belongs to.

Why does AI write code that doesn’t fit the rest of the system?

Because the rest of the system almost never travels inside its context. Every session starts blank. The decisions you made yesterday (why a layer exists, what invariant a module protects) live in your head, not in the material the model sees when it touches the code again. You ask for a local step without the blueprint of the building, and you get a locally correct step. It is literally what you asked for.

Can a bigger or smarter model fix incoherent AI code?

No, because the incoherence is not inside the model. Coherence was never a property of intelligence; brilliant, incoherent people exist. It is a property of process. You get a coherent system by making the architecture, the specification, and the definition of correct travel with every change and verifying each step against them. More parameters do not supply the missing layer between your intentions and the model’s context.

How do you get coherent code out of AI agents?

Coherence is imposed, not invoked. Make the architectural decision, the spec, and the definition of “correct” first-class artifacts that travel with each change, and gate every change against them so nothing reaches “done” without proving it. This is the move from spec-driven to spec-gated: the specification is not a document rotting in the repo; it is the gate that decides. Scaffolding, not talent, is what makes even a junior or an agent produce coherence.