How AI agents work: a control flow breakdown

The standard AI agent explanation gives you a bag of parts:

  • model
  • tools
  • memory
  • reasoning
  • human-in-the-loop

That is not useless. But it hides the most important thing: the control flow.

If I tell you a car has an engine, wheels, brakes, and a steering wheel, I have named the parts. I still have not explained how the car moves.

Most agent writing has the same problem. It names parts. It does not show the engine.

The clearest way to understand an AI agent is not as a bag of parts. It is as a loop inside a harness. Call it the two-loop model.

Start with the inner loop

The inner loop is where the work happens. Here is the inner loop diagram:

A simple way to read that diagram is:

  1. build context
  2. call the model
  3. if the model asks for a tool, run it
  4. add the tool result back into context
  5. call the model again
  6. stop when there is a final answer

That is the engine.

This is the move that turns a one-shot prompt into a working system. A one-shot prompt can only answer from what is already in context. A loop can do more than that. It can look at the task, decide it needs something, use a tool, see the result, update context, and continue.
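The six steps can be sketched as a short loop. Everything here is a hypothetical stand-in — `call_model`, `run_tool`, and the message dicts are illustrative shapes, not any particular SDK:

```python
def agent_loop(task, tools, call_model, run_tool, max_turns=10):
    """Minimal inner loop: build context, call the model, run tools, repeat."""
    context = [{"role": "user", "content": task}]           # 1. build context
    for _ in range(max_turns):
        reply = call_model(context, tools)                  # 2. call the model
        if reply.get("tool_call") is None:                  # 6. final answer: stop
            return reply["content"]
        result = run_tool(reply["tool_call"])               # 3. run the requested tool
        context.append({"role": "assistant", "content": reply["content"],
                        "tool_call": reply["tool_call"]})
        context.append({"role": "tool", "content": result}) # 4. result back into context
        # 5. next iteration calls the model again with the updated context
    raise RuntimeError("no final answer within max_turns")
```

The `max_turns` cap is the one addition beyond the six steps: real systems bound the loop so a confused model cannot spin forever.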

What the inner-loop nodes actually do

The smaller nodes in the diagram matter too.

Build Context is not one mysterious step. It usually means gathering the things the model can see for this turn. In the diagram, those inputs are:

  • conversation history
  • retrieved info or memory
  • available tools

Those then feed into Assemble Prompt. That is the step where the system turns those inputs into the actual context window the model sees.
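A sketch of that assembly step, under the assumption of a chat-style message format (the real shape depends on the model API):

```python
def assemble_prompt(system, history, retrieved, tools):
    """Turn the gathered inputs into the context window the model sees."""
    messages = [{"role": "system", "content": system}]
    if retrieved:  # retrieved info or memory, if any
        messages.append({"role": "system",
                         "content": "Relevant info:\n" + "\n".join(retrieved)})
    messages.extend(history)  # conversation history
    # available tools usually travel alongside the messages, not inside them
    return {"messages": messages, "tools": tools}
```

The point of the sketch is that nothing mysterious happens here: it is plain data assembly, and every choice about what to include is a choice about what the model can see this turn.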

Then the model is called. In the inner loop diagram, that node is labeled Call LLM Token Generator. That phrasing is trying to be mechanically honest. From the system’s point of view, the model takes context in and generates the next tokens out.

After that comes the decision point: Need Tool Call? If the answer is no, the loop ends and returns a response. If the answer is yes, the system executes the tool, captures the result, feeds that result back into context, and loops again.

That feedback path is the important part. It is what lets the system do work instead of only answering from a static prompt.

A useful shorthand is:

  • the inner loop is the core AI agent
  • the outer loop — the harness — is what makes it autonomous

What actually lives inside the loop

People often draw tools, memory, and reasoning as if they are separate floating blocks around the agent. They are better understood as roles inside the loop.

Context

Context is what the model can see right now. That includes things like:

  • chat history
  • retrieved information
  • tool definitions
  • previous tool results
  • system instructions

If something is not in context, the model cannot use it.

That simple rule clears up a lot.

Tools

Tools are how the loop reaches outside itself. The model asks for a tool. The system runs it. The result goes back into context. Then the loop continues.

Anthropic makes this concrete. In its post on building effective agents, it describes the basic building block of agentic systems as:

“an LLM enhanced with augmentations such as retrieval, tools, and memory.”

That framing tells you tools and memory are not decorative extras. They are part of the working system.
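One common way to wire this up is a name-to-function registry: the system looks up the tool the model asked for, runs it, and returns a string for context. The registry and tool names below are illustrative, not a real API:

```python
TOOLS = {
    "get_weather": lambda city: f"{city}: 18C, cloudy",  # stand-in for a real API call
    "calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def execute_tool(tool_call):
    """Run the tool the model asked for; always return a string for context."""
    fn = TOOLS.get(tool_call["name"])
    if fn is None:
        return f"error: unknown tool {tool_call['name']!r}"
    try:
        return fn(tool_call["args"])
    except Exception as exc:
        # errors also go back into context, so the model can see and react to them
        return f"error: {exc}"
```

Note that failures are returned, not raised: a tool error is just another result the loop feeds back into context.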

Memory

Memory is saved information the loop can use later. In many systems, memory is not a special runtime primitive. It is just data the loop can read and write through tools.

That means memory is important. But it is not mystical. It is state.
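Under that view, memory can be as plain as a key-value store exposed to the loop as two tools. A minimal sketch, assuming an in-process dict (a real system would persist to a file, database, or vector store):

```python
MEMORY = {}  # stand-in for durable storage

def memory_write(key, value):
    """Tool: save a fact so a later turn or run can use it."""
    MEMORY[key] = value
    return "saved"

def memory_read(key):
    """Tool: recall a fact; the result goes back into context like any tool result."""
    return MEMORY.get(key, "nothing stored under that key")
```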

Reasoning

Reasoning is not a separate floating box either. It shows up in how the model uses the current context, decides what to do next, and keeps going until it can answer. Concretely: it is the model deciding whether to call a tool, ask a question, or return an answer.

So the clean mental model is this:

tools, memory, and reasoning live inside the loop.

A quick note on tokens

The model itself is worth pinning down precisely.

Before text goes into the model, a tokenizer splits it into tokens and maps those tokens to integer IDs. The model works on those IDs and predicts the next ones.
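A toy illustration of that mapping, with a made-up vocabulary (real tokenizers use learned subword pieces, not whole words):

```python
# Toy whitespace tokenizer: real tokenizers split into subwords via schemes like BPE.
VOCAB = {"the": 0, "loop": 1, "calls": 2, "model": 3, "<unk>": 4}

def encode(text):
    """Map text to a list of integer token IDs."""
    return [VOCAB.get(word, VOCAB["<unk>"]) for word in text.split()]

encode("the loop calls the model")  # -> [0, 1, 2, 0, 3]
```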

This matters for one reason in this article: better models do not change the loop. They just produce better tokens inside the same loop.

That is a useful thing to keep in mind because a lot of agent discourse mixes up two separate questions:

  • how good is the model?
  • how is the system structured?

Those are related, but they are not the same question.

Then add the outer loop

Once you see the inner loop, the second layer is obvious: everything around the loop.

Here is the outer loop diagram:

The key move in this diagram is that the inner loop is one node. That is the whole point. The harness is everything around it.

What the outer-loop nodes actually do

The outer loop starts with a Trigger. The system does not wake itself up out of nowhere. Something starts it:

  • a user message
  • a schedule
  • an event
  • a webhook

Then comes Permissions & rate limits. That is a reminder that the system may have rules outside the inner loop. Some runs are allowed. Some are blocked. If a run is blocked, the system may Reject / Notify instead of entering the loop at all.

If the run is allowed, the system enters the Inner Loop.

From there, the harness can still get involved in a few ways.

If a tool call needs approval, the system can go to Ask Human permission before proceeding. If the model needs more information, that is a different path: Ask Human clarification.

When the loop reaches a final answer, the work is still not necessarily over. The system may Monitor + Log the result. It may ask whether Human review is required. If review is required, the output may be approved, rejected, retried, or escalated. If review is not required, the system can go straight to Deliver Output.

After delivery, there may still be one more decision: should the system Schedule next run? If yes, the outer loop can re-enter later. If not, the run is done.

That is what the harness is for. It decides:

  • when the inner loop runs
  • what it is allowed to do
  • whether a human needs to approve something
  • what happens with the result
  • whether the whole thing runs again later

So if the inner loop is the engine, the harness is the system around it that makes it usable in the real world.
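That outer path can be sketched as straight-line code around a single `inner_loop` call. Every name here is a hypothetical stand-in for the corresponding node in the diagram, with the policy object standing in for whatever enforces the rules:

```python
def run_agent(trigger, inner_loop, policy):
    """Harness: gate the run, invoke the inner loop, review, deliver, reschedule."""
    if not policy.allows(trigger):              # Permissions & rate limits
        policy.notify_rejection(trigger)        # Reject / Notify: never enter the loop
        return None
    result = inner_loop(trigger.task)           # the entire inner loop is one node here
    policy.log(trigger, result)                 # Monitor + Log
    if policy.needs_review(result):             # Human review required?
        result = policy.request_review(result)  # approve / reject / retry / escalate
    policy.deliver(result)                      # Deliver Output
    if policy.schedule_next(trigger):           # Schedule next run?
        policy.enqueue(trigger)
    return result
```

Notice how little of this touches the model. The harness is ordinary control flow, which is exactly why it can be tested and audited like ordinary code.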

You can see the same shape in how leading labs describe real systems.

OpenAI’s agent docs talk about building agent workflows by connecting pieces like:

  • tools
  • memory / knowledge
  • control-flow logic
  • orchestration
  • monitoring

Same shape.

Human-in-the-loop shows up in two different ways

People say “human in the loop” like it is one thing. In practice, it usually shows up in at least two different ways.

1. Permission gates

The harness pauses a tool call and asks for approval. For example:

  • sending a message
  • making a purchase
  • editing production data
  • running a risky command

That is not the model asking for missing knowledge. That is the surrounding system enforcing a rule.

2. Clarification requests

The model itself may decide it needs more information before it can continue. So it asks the human a question. The answer goes back into context, and the loop keeps going.

From the outside, both cases can look similar. A human got involved. But they are not the same mechanism.

One comes from the harness. The other comes from the loop.

That distinction matters.
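The two mechanisms land in different places in code. In a sketch with hypothetical names, a permission gate is a check the harness performs before running a tool, while a clarification is just another tool the model can choose to call:

```python
RISKY_TOOLS = {"send_message", "make_purchase", "run_command"}

def guarded_execute(tool_call, run_tool, ask_human):
    """Harness-side permission gate: the rule lives outside the model entirely."""
    if tool_call["name"] in RISKY_TOOLS and not ask_human(f"Allow {tool_call['name']}?"):
        return "denied by human"
    return run_tool(tool_call)

def ask_user(question):
    """Model-side clarification: exposed as an ordinary tool the model may call.
    The human's answer flows back into context like any other tool result."""
    return input(question)  # in a real system this pauses the run and waits
```

Same human, two mechanisms: `guarded_execute` fires whether or not the model wants it to, while `ask_user` only runs because the model decided to call it.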

Why the workflow vs. agent debate never dies

A lot of public confusion about agents is really confusion about layers.

Anthropic makes an explicit distinction between workflows and agents:

“Workflows are systems where LLMs and tools are orchestrated through predefined code paths.”

“Agents, on the other hand, are systems where LLMs dynamically direct their own processes and tool usage, maintaining control over how they accomplish tasks.”

Whether you agree with every edge case in that definition is less important than the bigger point: the distinction exists because people are collapsing different kinds of systems into one vague word.

That is why the debate never ends. People say “agent” and mean different things:

  • a tool-using loop driven by a model
  • a workflow with some LLM calls in it
  • a production system with retries, approvals, and monitoring
  • a multi-agent orchestration layer

Those are related ideas. But they are not the same layer.
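The distinction is visible in code. In a sketch with a stand-in `call_model`, a workflow hard-codes the path through the model calls, while an agent lets the model pick the next step each turn:

```python
def summarize_then_translate(doc, call_model):
    """Workflow: the code path is fixed; the model only fills in the steps."""
    summary = call_model(f"Summarize: {doc}")
    return call_model(f"Translate to French: {summary}")

def agent(task, call_model, run_tool, max_turns=10):
    """Agent: the model decides each turn whether to act or to answer."""
    context = [task]
    for _ in range(max_turns):
        reply = call_model(context)  # the model directs its own process
        if reply.get("tool_call") is None:
            return reply["content"]
        context.append(run_tool(reply["tool_call"]))
    raise RuntimeError("no final answer within max_turns")
```

In the first function, you can read the control flow off the page. In the second, the control flow is decided at runtime by the model. That is the whole distinction Anthropic is drawing.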

If you do not separate the inner loop from the outer loop, the whole topic starts to blur together. Then you get the endless arguments:

  • is this an agent or just a workflow?
  • is memory part of the agent or part of the system?
  • are approvals part of the agent?
  • what makes something agentic at all?

The loop-inside-a-harness framing does not solve every naming debate. But it does make the system much easier to reason about.

Why this model helps

This framing is better for a few simple reasons.

More mechanically true

It shows the control flow. It shows what repeats. It shows what belongs inside the core runtime and what belongs outside it.

Easier to teach

People can remember:

  • the inner loop does the work
  • the outer loop manages the world around it

That is a much better teaching handle than a loose bag of parts.

Easier to debug

If something fails, you can ask:

  • did the model fail inside the loop?
  • was the context bad?
  • was the tool bad?
  • did the harness block the step?
  • did orchestration or delivery fail outside the loop?

That is a much more useful debugging lens.

Easier to build from

Once you separate the loop from the harness, design choices get cleaner. You can decide:

  • does this use case need a workflow or an agent?
  • does this need memory?
  • where should approval gates live?
  • where should retries live?
  • what should be handled by the core runtime vs the production wrapper?

Less fake magic

Most agent writing sounds mystical because it jumps straight from model to autonomy without spending enough time on system structure. This framing brings the topic back down to earth.

The goal is not to make agents sound impressive. The goal is to make them understandable and buildable.

What this repo is

The small repo behind this article focuses on the inner loop. That is intentional.

It is not pretending to be a full production framework. It is there to make the core idea easy to inspect.

That matters because writing can drift. Code is harder to bluff. A small executable reference can keep the explanation honest.

So the repo is the truth anchor. The article is the explanation layer built on top of it.

Closing

Stop starting with a bag of parts. Start with the loop. Then draw the harness around it.

That gives you a much cleaner picture:

  • the inner loop does the work
  • the outer loop manages the world around it

Tools, memory, and reasoning belong inside the loop. Triggers, permissions, approvals, retries, monitoring, review, delivery, and scheduling belong outside it.

Once you see that split, AI agents stop looking muddy and start looking buildable.

That is the two-loop model.
