OODA makes an agent faster, not smarter
Boyd's loop was built for fighter pilots. Observe, Orient, Decide, Act, repeat: whoever cycles faster controls the engagement. It is the right model for a human, because a human learns between cycles without thinking about it.
An agent does not. Run OODA on an agent and it makes faster decisions, not better ones. It repeats the same mistake at machine speed, because nothing in the loop carries the lesson forward to the next run.
Add the second A: Adjust
After acting, score what happened and update how you decide next time. That one phase is the whole idea. OODA gets faster; OODAA gets smarter. The learning is a single line that folds each outcome back into memory, so the next time a situation shows up, the agent answers from memory instead of guessing.
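The Adjust step can be sketched in a few lines. This is a minimal illustration, not the project's actual code: the names `memory`, `adjust`, `decide` and the learning rate `rate` are assumptions, and the update shown is a plain running average toward the latest score.

```python
from collections import defaultdict

# Hypothetical memory: (situation, action) -> running value estimate.
memory = defaultdict(float)

def adjust(situation, action, score, rate=0.5):
    """Fold one scored outcome back into memory (illustrative update rule)."""
    key = (situation, action)
    memory[key] += rate * (score - memory[key])  # the one line that learns
    return memory[key]

def decide(situation, actions):
    """Pick the action memory currently values most for this situation."""
    return max(actions, key=lambda a: memory[(situation, a)])

# Two good outcomes for "left" pull its value toward 1.0.
adjust("corner", "left", 1.0)
adjust("corner", "left", 1.0)
print(decide("corner", ["right", "left"]))
```

Everything outside `adjust` only reads memory. Remove that one update and the sketch is plain OODA again.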
In the offline demo, a 4x4 grid agent that starts knowing nothing wanders for 69 steps on episode one. By the end it walks the shortest path, 6 steps. Nothing in the loop is clever. The only thing that changed is what Adjust wrote down.
One screen, five blocks
The loop is domain-neutral. A Task supplies the world, the loop supplies the cycle. The same machinery drives a toy grid and a stream of operational incidents.
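One way to picture the split between Task and loop is the sketch below. The `Task` interface, its method names, `run_episode` and the toy `PickB` task are all hypothetical, chosen for illustration; the project's real interface may look different.

```python
import random
from collections import defaultdict
from typing import Hashable, Protocol, Sequence

class Task(Protocol):
    """Hypothetical interface: the Task supplies the world."""
    def observe(self) -> Hashable: ...
    def actions(self) -> Sequence[str]: ...
    def act(self, action: str) -> float: ...  # returns a score for the outcome
    def done(self) -> bool: ...

def run_episode(task, memory, rng, rate=0.5, explore=0.1, max_steps=1000):
    """The loop supplies the cycle and knows nothing about the domain."""
    steps = 0
    while not task.done() and steps < max_steps:
        situation = task.observe()                    # Observe
        options = list(task.actions())                # Orient
        if rng.random() < explore:                    # Decide: explore ...
            action = rng.choice(options)
        else:                                         # ... or trust memory
            action = max(options, key=lambda a: memory[(situation, a)])
        score = task.act(action)                      # Act
        key = (situation, action)
        memory[key] += rate * (score - memory[key])   # Adjust
        steps += 1
    return steps

class PickB:
    """Toy one-step task (assumed for illustration): 'b' is right, 'a' is wrong."""
    def __init__(self):
        self.finished = False
    def observe(self):
        return "start"
    def actions(self):
        return ["a", "b"]
    def act(self, action):
        self.finished = True
        return 1.0 if action == "b" else -1.0
    def done(self):
        return self.finished

memory = defaultdict(float)
rng = random.Random(0)
for episode in range(3):
    run_episode(PickB(), memory, rng, explore=0.0)
print(dict(memory))
```

Swapping `PickB` for a grid or an incident stream would change only the Task. `run_episode` stays as it is.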
Small enough to read, real enough to build on
The core is the Python standard library. No Redis, no database, no service. Clone it and run it. The whole loop fits on one screen.
The loop never imports a provider. Wrap any LLM in one completion function and drop it in. A failed or illegal model call degrades to random exploration rather than taking the loop down.
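A sketch of what that wrapper could look like, under stated assumptions: the `Completion` signature (prompt in, text out), the prompt wording and the function name `decide_with_model` are illustrative, not the project's API. Any provider call would live inside the function you pass in.

```python
import random
from typing import Callable, Sequence

# Assumed shape of a completion function: prompt string in, reply string out.
Completion = Callable[[str], str]

def decide_with_model(complete: Completion, situation: str,
                      actions: Sequence[str], rng: random.Random) -> str:
    """Ask the model for an action; fall back to random exploration on any problem."""
    prompt = f"Situation: {situation}\nChoose one of: {', '.join(actions)}"
    try:
        choice = complete(prompt).strip()
    except Exception:
        return rng.choice(list(actions))  # failed call: explore instead of crashing
    if choice not in actions:
        return rng.choice(list(actions))  # illegal answer: explore instead of crashing
    return choice

def broken_model(prompt: str) -> str:
    """Stand-in for a provider that is down."""
    raise TimeoutError("provider unavailable")

rng = random.Random(0)
print(decide_with_model(broken_model, "invoice blocked", ["approve", "escalate"], rng))
print(decide_with_model(lambda p: " escalate\n", "invoice blocked", ["approve", "escalate"], rng))
```

The loop only ever sees a legal action come back, so a provider outage costs it a guess, not a run.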
Memory is a running value per situation-action pair, plus hypotheses you can attach. Print it and read what the loop believes. No black box.
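As a rough sketch of a memory you can print, assuming hypotheses are free-text notes attached to a situation (the source does not say how they are stored), something like the following would do. The class name `Memory`, its methods and the example incident are hypothetical.

```python
from dataclasses import dataclass, field

@dataclass
class Memory:
    # (situation, action) -> running value
    values: dict = field(default_factory=dict)
    # situation -> list of free-text hypotheses (assumed representation)
    hypotheses: dict = field(default_factory=dict)

    def update(self, situation, action, score, rate=0.5):
        """Move the running value toward the latest score."""
        key = (situation, action)
        old = self.values.get(key, 0.0)
        self.values[key] = old + rate * (score - old)

    def attach(self, situation, note):
        """Attach a human-readable hypothesis to a situation."""
        self.hypotheses.setdefault(situation, []).append(note)

    def report(self):
        """Render everything the loop believes as plain text."""
        lines = []
        for (situation, action), value in sorted(self.values.items()):
            lines.append(f"{situation} / {action}: {value:+.2f}")
        for situation, notes in sorted(self.hypotheses.items()):
            for note in notes:
                lines.append(f"{situation} ? {note}")
        return "\n".join(lines)

m = Memory()
m.update("late shipment", "notify carrier", 1.0)
m.attach("late shipment", "carrier delays cluster on Mondays")
print(m.report())
```

Plain dictionaries keep the state inspectable with nothing beyond the standard library.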
Everything else is scaffolding around the one phase that learns. Strip it back and you can see exactly where an agent gets smarter across runs.
From a toy grid to operational triage
A 4x4 grid and an agent that starts knowing nothing. No API key. Watch the step count fall from 69 to the shortest path as Adjust keeps score.
The same loop over a stream of incidents: a shipment past its SLA in the TMS, an invoice blocked on a price variance in the ERP, a count gone negative in the WMS. First responses go from 35% right to 100% as memory carries what worked.
The production-grade version of the same idea, with utility scoring, replay, background self-review and gated self-modification, where OODAA is the only loop the whole system runs.
This project is the companion to the essay "Boyd's Loop Was Built for Pilots, Agentic AI Needs OODAA", made runnable.