Execution Environments

Core Questions

  • Where do humans run code?
  • Where do agents run code?
  • When is execution local vs remote?

The execution environment is where code actually runs. It's where dependencies get installed, where tests execute, where the dev server spins up. When agents enter the picture, the question of where that environment lives becomes a new and important one.

Where do humans run code?

For most developers, the best execution environment is still local. The laptop delivers the tightest possible feedback loop: save a file, hot reload fires, you see the change. No network hop. No cold start. The inner loop is measured in milliseconds.

Remote development environments emerged to solve different problems: onboarding complexity, dependency drift, and machine variability. Instead of every engineer curating a fragile local setup, teams run code in shared or managed environments provisioned in the cloud. Tools like Coder, GitHub Codespaces, and Telepresence made this viable by narrowing the latency gap and improving reproducibility.

Remote dev was optimized for human collaboration and environment parity, not high-frequency context switching. When a single developer spends most of the day in one workspace, a small amount of startup latency is tolerable.

Agents change the constraint. In an agentic workflow, developers aren’t just working in one environment; they’re supervising many. The cost of switching environments stops being an occasional annoyance and becomes a core productivity variable.

The context-switching problem

In an agentic workflow, you're not just running your own code. You're checking in on agents. You might have a dozen agents working on different tasks, each in its own environment. You should be able to:

  • Switch to an agent's environment to see what it's doing
  • Pull its branch, spin up the dev server, verify its work
  • Make a quick fix, push, switch back to your own work
  • Repeat across multiple agents
  • Inspect running services, logs, and state (without restarting everything)

The old inner loop was save → run. The new inner loop includes switch environment → verify agent work → switch back. If switching environments takes 2 minutes of pulling images and installing dependencies, you'll spend your day waiting. We need to make this "agentic dev loop" a lot quicker.

This is the new version of the xkcd "compiling" comic. Instead of sword-fighting while waiting for the build, developers are scrolling social media while waiting for environments to spin up or agents to finish.

The solution isn't to accept slow switching. We can make switching instant if we design and plan for it.

  • Local container caching. Don't pull from a remote registry every time. Keep warm images locally.
  • Pre-warmed environments. If you know you'll need to check on Agent A's work, have its environment ready before you need it.
  • Shared base layers. All your environments should share a common base so switching is just the diff.
  • Instant snapshots. Pause an environment and resume it later without cold start.
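
As a concrete illustration of the caching and pre-warming ideas above, here is a minimal sketch, assuming agent environments are plain Docker images and using the standard docker CLI. The image names and the agent-to-image mapping are hypothetical; in practice they would come from your agent orchestrator.

```python
import subprocess

# Hypothetical mapping of active agents to the container images their
# environments are built from.
ACTIVE_AGENT_IMAGES = {
    "agent-a": "registry.example.com/app-dev:sha-1a2b3c",
    "agent-b": "registry.example.com/app-dev:sha-4d5e6f",
}

def image_is_cached(image: str) -> bool:
    """Return True if the image already exists in the local Docker cache."""
    result = subprocess.run(
        ["docker", "image", "inspect", image],
        capture_output=True,
    )
    return result.returncode == 0

def prewarm_environments() -> None:
    """Pull any missing images ahead of time so switching is just `docker run`."""
    for agent, image in ACTIVE_AGENT_IMAGES.items():
        if image_is_cached(image):
            print(f"{agent}: {image} already cached")
        else:
            print(f"{agent}: pulling {image}")
            subprocess.run(["docker", "pull", image], check=True)

if __name__ == "__main__":
    prewarm_environments()
```

Run something like this on a timer, or trigger it whenever an agent picks up a task, so the image layers are already on disk by the time you want to step in.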

The goal: switching to any agent's context should take seconds, not minutes. If it doesn't, agents become a coordination tax instead of a productivity multiplier.

Where do agents run code?

Agents don't have laptops. They need an execution environment provisioned for them — typically a cloud VM, a container, or an ephemeral runtime. The agent connects to this environment, runs commands, and reads output.

This introduces a fundamental asymmetry: the agent is remote to its execution environment. There's a network hop between the agent's "brain" (the LLM) and the machine where code runs. This has implications:

  • Latency matters. Every command the agent runs requires a round-trip. Fast local SSDs become slow network file systems. The inner loop stretches.
  • State is less visible. The agent can't "see" the dev server in a browser tab. It needs explicit verification — screenshots, API calls, test results.
  • Environments must be self-describing. The agent can't ask "hey, what version of Node is this?" and interpret a puzzled look. Everything must be introspectable via commands.
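
One way to make an environment self-describing is a small script the agent can run to get a machine-readable report of the toolchain. A minimal sketch, assuming the listed tools support their usual version flags:

```python
import json
import shutil
import subprocess

# Tools and the version flags they are assumed to support.
TOOLS = {
    "node": ["node", "--version"],
    "python": ["python3", "--version"],
    "git": ["git", "--version"],
    "docker": ["docker", "--version"],
}

def describe_environment() -> dict:
    """Collect tool versions into a structure an agent can parse reliably."""
    report = {}
    for name, cmd in TOOLS.items():
        if shutil.which(cmd[0]) is None:
            report[name] = None  # tool not installed in this environment
            continue
        result = subprocess.run(cmd, capture_output=True, text=True)
        report[name] = (result.stdout or result.stderr).strip()
    return report

if __name__ == "__main__":
    print(json.dumps(describe_environment(), indent=2))
```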

When is execution local vs. remote?

This isn't a binary choice. Most teams end up with a spectrum:

Decision Framework

  • Human in active development (interactive coding, debugging, exploring) → Local
  • Agent working autonomously (executing a well-defined task without human presence) → Remote
  • Human + agent pairing (human directing, agent executing, rapid iteration) → Local or low-latency remote
  • CI / automated verification (reproducible, isolated, no human involvement) → Remote

The key insight: the operator determines the location, not the task. If a human needs to see what's happening in real time, you want low latency. If an agent is grinding through a task list overnight, latency matters less than isolation and reproducibility.

Bridging human and agent environments

If agents run remotely but humans value local speed, you have a coordination problem: how does a developer step into an agent’s environment without friction?

There are two viable architectural approaches. Both work. What matters is that you choose one deliberately and optimize it.

If you want developers to keep their normal local debugging ergonomics, a strong default is to pull the agent context local. If you want the simplest possible handoff and maximum parity, treat the laptop as a thin client.

Default: Pull the agent context local (recommended)

In this model, agents work remotely by default. When a developer needs to intervene, they rehydrate the agent’s context locally and resume work with their normal inner loop.

The key design move is that you don’t “download a VM.” You pull an environment identity: the minimum set of pins needed to make the local environment behave like the agent’s environment. Then local caches do the heavy lifting.

  • Repo ref: the branch/commit SHA
  • Pinned toolchain: image digest (or equivalent)
  • Task snapshot ID: small, semantic state (workspace diff + key artifacts)
  • Cache keys: lockfile/input hashes
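
To make the "environment identity" concrete, here is a minimal sketch: a small manifest carrying the pins listed above, plus a rehydrate step that leans on git and docker. The field names and the snapshot handling are illustrative assumptions, not a defined format.

```python
import subprocess
from dataclasses import dataclass

@dataclass
class EnvironmentIdentity:
    """The minimum pins needed to reproduce an agent's environment locally."""
    repo_ref: str          # branch name or commit SHA
    image_digest: str      # e.g. "registry.example.com/app-dev@sha256:..."
    snapshot_id: str       # ID of the task snapshot (workspace diff + artifacts)
    cache_keys: list[str]  # lockfile / input hashes used to look up warm caches

def rehydrate(identity: EnvironmentIdentity) -> None:
    """Make the local checkout and toolchain match the agent's environment."""
    # 1. Check out the exact ref the agent was working on.
    subprocess.run(["git", "fetch", "origin", identity.repo_ref], check=True)
    subprocess.run(["git", "checkout", identity.repo_ref], check=True)

    # 2. Pull the pinned toolchain image by digest (a cache hit if pre-warmed).
    subprocess.run(["docker", "pull", identity.image_digest], check=True)

    # 3. Apply the task snapshot (workspace diff). How snapshots are stored is
    #    platform-specific; here we assume a patch file fetched separately.
    subprocess.run(["git", "apply", f"{identity.snapshot_id}.patch"], check=True)
```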

When finished, they push changes back to the remote runtime and the agent continues.

  • Developers retain local-speed inner loops.
  • Environment switching must be fast and deterministic.
  • Images and dependencies must be cached aggressively.

The risk is slow transfer. If pulling the environment takes minutes, developers will avoid stepping in. The system must make this transition nearly instant (ideally seconds).

This model is covered in more detail in Ephemeral Runtimes, including reset semantics and an “enter context” checklist.

Alternative: The laptop as a thin client

In this model, execution always happens remotely. The laptop runs the editor, but compilation, tests, and servers run on a remote machine. The developer connects over SSH or a remote IDE protocol. Typically this is a long-lived workspace per developer (or per branch), not per task.

The agent connects to that same remote environment. Human and agent operate on the same filesystem, the same running processes, the same state.

  • No environment transfer is required.
  • Handoff is just a connection change.
  • CI parity is easier to maintain.

The tradeoff is latency. Every command, file read, and terminal interaction crosses the network. For some teams, that’s acceptable. For others, it’s not.

The critical point is not which model you choose. The critical point is that switching into an agent’s context must feel cheap.

Agents increase parallel work. Parallel work increases supervision overhead. If stepping into an agent environment takes 90 seconds, developers will do it less often. Bugs accumulate. Trust erodes.

If stepping in takes 5 seconds, supervision becomes natural. Developers check progress casually. They fix small issues early. Agents become collaborators instead of background processes.

Design target: 10 seconds or less to enter any agent context.

You must design for this transition. If you don’t, the friction of environment switching will quietly cap the effectiveness of your entire agent system.

Environment parity with CI

Here's the uncomfortable truth: if your dev environment doesn't match CI, agents will fail in ways that are hard to debug.

When a human hits a "works on my machine" problem, they can investigate. They notice the error, google it, realize their Node version is wrong, fix it. An agent can't do that kind of open-ended debugging — at least not reliably. It will thrash, retry, and eventually give up or produce garbage.

The solution is deterministic environments. Same container image, same lockfiles, same versions — everywhere. This is covered in depth in Guide 03: Reproducible Toolchains, but the principle applies here: execution environments must be identical, whether the operator is a human on their laptop, an agent in the cloud, or a CI runner in GitHub Actions.
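
A lightweight way to catch drift before it bites an agent is to compare the live environment against the versions CI pins. A minimal sketch, assuming the pins live in a hypothetical toolchain.json checked into the repo:

```python
import json
import subprocess
import sys

# Hypothetical pin file shared by laptops, agent runtimes, and CI, e.g.:
# {"node": "v20.11.1", "python": "Python 3.12.2"}
PIN_FILE = "toolchain.json"

VERSION_COMMANDS = {
    "node": ["node", "--version"],
    "python": ["python3", "--version"],
}

def check_parity() -> int:
    """Return the number of tools whose versions diverge from the pins."""
    with open(PIN_FILE) as f:
        pinned = json.load(f)

    failures = 0
    for tool, expected in pinned.items():
        result = subprocess.run(
            VERSION_COMMANDS[tool], capture_output=True, text=True
        )
        actual = (result.stdout or result.stderr).strip()
        if actual != expected:
            print(f"DRIFT: {tool} is {actual}, CI expects {expected}")
            failures += 1
    return failures

if __name__ == "__main__":
    sys.exit(1 if check_parity() else 0)
```

Running the same check on laptops, in agent runtimes, and in CI turns "works on my machine" from a debugging session into a one-line failure.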

Tradeoffs

Local-first

  • Pros: fastest inner loop (no network latency), works offline, familiar tooling
  • Cons: environment drift from CI, agent handoff requires environment transfer, "works on my machine" problems

Remote-first

  • Pros: environment parity with CI, seamless agent handoff, reproducible by default
  • Cons: network latency on every operation, requires connectivity, cost of running remote machines

Architectural patterns & tools

1. Managed ephemeral environments

On-demand development environments provisioned via API. Agents (and humans) can spin up isolated runtimes per task, discard them after use, and rely on reproducible base images.

This pattern optimizes for isolation, fast spin-up, and automation-friendly workflows.

Examples: exe.dev, sprites.dev
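
Provider APIs differ; purely as an illustration of the shape of this pattern, here is a sketch against a hypothetical REST endpoint (not the actual exe.dev or sprites.dev API):

```python
import requests

# Hypothetical provider endpoint and token; real providers differ.
API_URL = "https://environments.example.com/v1"
TOKEN = "..."  # provider API token

def run_task_in_ephemeral_env(repo: str, ref: str, command: str) -> str:
    """Create an isolated environment, run one command, then throw it away."""
    headers = {"Authorization": f"Bearer {TOKEN}"}

    # 1. Provision an environment from a reproducible base image.
    env = requests.post(
        f"{API_URL}/environments",
        json={"repo": repo, "ref": ref},
        headers=headers,
    ).json()

    try:
        # 2. Execute the task and capture its output.
        result = requests.post(
            f"{API_URL}/environments/{env['id']}/exec",
            json={"command": command},
            headers=headers,
        ).json()
        return result["output"]
    finally:
        # 3. Discard the environment; nothing persists between tasks.
        requests.delete(f"{API_URL}/environments/{env['id']}", headers=headers)
```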

2. Cloud VMs with custom automation

Traditional infrastructure providers where you provision virtual machines directly and build your own environment orchestration layer.

This offers maximum flexibility, but you own lifecycle, image management, isolation, and teardown logic.

Examples: AWS, GCP, Hetzner

3. Local containerized runtimes

Run the same container images locally that you run in CI or production. This supports a local-first workflow while maintaining parity with remote execution.

This pattern works well when paired with aggressive caching and fast environment switching.

Examples: Colima, Docker, Podman
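
The core move in this pattern is to run the exact image CI uses rather than a locally assembled toolchain. A minimal sketch using the docker CLI; the image name and test command are assumptions:

```python
import os
import subprocess

# The same pinned image your CI workflow runs (hypothetical name).
CI_IMAGE = "registry.example.com/app-ci:2024-06-01"

def run_tests_in_ci_image() -> int:
    """Run the test suite locally inside the pinned CI image."""
    return subprocess.run(
        [
            "docker", "run", "--rm",
            "-v", f"{os.getcwd()}:/workspace",  # mount the current checkout
            "-w", "/workspace",                 # run from the repo root
            CI_IMAGE,
            "npm", "test",
        ]
    ).returncode

if __name__ == "__main__":
    raise SystemExit(run_tests_in_ci_image())
```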

4. MicroVM-based runtimes

Build your own ephemeral runtime platform on top of lightweight virtual machines for stronger isolation and near-container boot times.

Higher engineering investment, but enables fine-grained control over security, startup latency, and multi-tenant agent systems.

Example: Firecracker

5. Remote IDE / thin client tooling

Treat the laptop as a thin client. The editor runs locally for responsiveness, but execution happens entirely on a remote machine. Developers connect via SSH or a remote IDE protocol and operate directly inside the agent’s runtime.

This pattern eliminates environment transfer. Human and agent share the same filesystem, processes, and state — supervision becomes a connection change, not a migration step.

It optimizes for seamless handoff and CI parity, but introduces network latency into the inner loop. Works best when remote environments are provisioned close to the developer and aggressively optimized.

This is the most common way teams implement the “thin client” approach described above.

Examples: VS Code Remote, JetBrains Gateway, GitHub Codespaces

What goes wrong

Environment drift

The agent's environment slowly diverges from what humans test on. Tests pass locally, fail for the agent. Debugging is a nightmare because you can't reproduce the agent's environment.

Latency death spiral

The agent runs every command over SSH, and each round trip adds 200ms or more. Multiplied across hundreds of commands, retries, and file reads, a task that takes seconds locally stretches into minutes. The agent times out or runs up massive costs.

Invisible state

The agent starts the dev server but can't verify it's working. It proceeds to the next step assuming success. Three steps later, everything fails and the agent has no idea why.
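
The antidote is explicit verification the agent can run itself. A minimal sketch, assuming the dev server exposes an HTTP endpoint on a known local port:

```python
import time
import requests

def wait_until_healthy(url: str = "http://localhost:3000/", timeout: float = 30.0) -> bool:
    """Poll the dev server until it responds, instead of assuming it started."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            response = requests.get(url, timeout=2)
            if response.status_code < 500:
                return True
        except requests.ConnectionError:
            pass  # server not listening yet
        time.sleep(1)
    return False

if __name__ == "__main__":
    if not wait_until_healthy():
        raise SystemExit("dev server never became reachable; stop and report")
```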

Summary

  • Humans run code locally for speed. Agents run code remotely by necessity.
  • The operator (human or agent) determines whether local or remote makes sense.
  • Treating the laptop as a thin client makes agent handoff easier.
  • Environment parity with CI is non-negotiable. Drift will break agents.
