Deterministic Scaffolding for an Agentic World
Pure deterministic software is brittle. Pure agentic is unreliable. The fix isn't a smarter model — it's better scaffolding. Here's what that looks like in practice.
The first agent I deployed in production made a brilliant decision and then immediately forgot it made it.
That’s the fundamental problem with pure agentic systems. A large language model — an AI trained to generate responses from patterns in data — is extraordinarily capable in a single turn. String a few turns together, and you’re fighting entropy. Context fades, decisions evaporate, state drift sets in. It’s not a model quality problem. It’s an architecture problem.
Here’s the thing. The answer isn’t to wait for a smarter model. The answer is better scaffolding around the model you have.
What Scaffolding Actually Looks Like
I run a multi-agent workspace — MAW — where each project gets its own agent. Each agent has a stable identity that persists across restarts, renames, and hardware changes. Agents address each other using that identifier, not by knowing where the other agent is running at any given moment.
That stability matters. A naming convention seems like a small thing. But when your system depends on agents coordinating across dozens of sessions, small things become load-bearing infrastructure.
Every agent wakes up with its conventions already in its context — not as something it learned and might misremember, but as something the system delivers fresh every time. That’s a deliberately deterministic choice inside an otherwise agentic architecture. The agent doesn’t remember the protocol. The protocol is given to the agent.
State lives in a canonical store. Tasks don’t get deleted — they transition. Messages don’t disappear — they archive. Every change writes a record. When something goes wrong (and it will), you have a complete audit trail. Debugging a multi-agent system without that trail is guesswork.
Nightly review cycles close the loop. Findings from the previous day become tasks. Tasks in progress get flagged. The system checks its own work on a schedule, not on faith.
The Same Shape, Every Industry
I consult across a range of sectors — insurance, retail, professional services, operations. The pattern is always the same.
The teams winning with AI didn’t find a better model. They built boring infrastructure well. They picked a system of record and made every agent write to it. They defined what “done” means for a task before the agent starts. They kept the deterministic parts — routing rules, data validation, output format — deterministic, and let agents handle the parts where flexible reasoning actually matters.
The teams struggling did the opposite. They pointed a capable model at a problem and hoped. When outputs drifted, they blamed the model. When coordination broke down, they added more prompting. When the whole thing fell apart, they concluded AI wasn’t ready for their use case.
It was ready. The scaffolding wasn’t.
In insurance, claims routing is mostly rule-bound — coverage is covered or it isn’t. But drafting an adjuster’s summary letter? That’s where the agent earns its keep. The mistake is letting the agent touch both without a clear boundary between them.
In retail, inventory decisions have hard rules: reorder points, supplier lead times, contractual minimums. An agentic system that ignores those constraints will eventually violate them — not because it’s reckless, but because rules that aren’t encoded in the scaffolding eventually get forgotten. State drift is patient.
Where to Start
If you’re standing up an AI system and wondering where to begin, here’s the order:
Protocols before models. Decide how agents will communicate, how state will be stored, and how errors will surface — before you write a single prompt. These decisions are the hardest to retrofit.
Pick a system of record. Every piece of state that needs to survive a session restart goes there. Not in a variable, not in conversation history. In the store.
Close the feedback loop. Build review cycles in from the start. Agents need external correction. The question is whether that correction is structured or accidental.
Keep the deterministic parts deterministic. If a rule can be expressed as code, express it as code. Don’t ask an agent to reliably apply a rule it has to infer. That’s a recipe for drift.
The goal isn’t to constrain the agent. It’s to make the agent’s flexibility an asset rather than a liability.
Smart models in bad scaffolding will consistently underperform. Capable models in good scaffolding will surprise you.