Building with the Claude Agent SDK
Core Questions
- How do you use the Claude Agent SDK to wire up agentic workflows?
- What are the key patterns, gotchas, and practical examples?
- How does the SDK fit into the rest of your agentic infrastructure?
The Claude Agent SDK provides the primitives for building agentic systems: tool use, multi-turn conversations, structured outputs, and computer use. But the SDK is just the wire. The real work is designing what's on either end — the tools you expose, the constraints you enforce, and the feedback loops you build.
SDK fundamentals
The SDK handles the conversation with Claude. You provide context, tools, and constraints. Claude reasons, calls tools, and produces outputs. Your code orchestrates the loop.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// Basic agentic loop
async function runAgent(task: string) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: task }
  ];
  while (true) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      system: "You are a coding assistant. Use tools to complete tasks.",
      tools: getTools(),
      messages,
    });
    // Add assistant response to history
    messages.push({ role: "assistant", content: response.content });
    // Check if we're done
    if (response.stop_reason === "end_turn") {
      return extractResult(response);
    }
    // Process tool calls
    if (response.stop_reason === "tool_use") {
      const toolResults = await processToolCalls(response.content);
      messages.push({ role: "user", content: toolResults });
      continue;
    }
    // max_tokens or another unexpected stop reason: fail fast instead of looping
    throw new Error(`Unexpected stop reason: ${response.stop_reason}`);
  }
}
Core Loop Components
Messages array
Accumulates conversation history. Each turn adds assistant response and tool results. Context grows until task completes.
Tool definitions
JSON schema describing available tools. Claude decides which to call based on the task. You implement the actual tool logic.
Stop conditions
end_turn means Claude is done. tool_use means execute the requested tools and continue the loop. max_tokens means the response was truncated before completion.
Tool results
Execute tools, return results as user message. Claude sees what happened and decides next step.
Designing effective tools
Tools are how Claude interacts with the world. Well-designed tools are focused, predictable, and give useful feedback. Poorly designed tools lead to confusion and errors.
// Good tool design
const tools: Anthropic.Tool[] = [
  {
    name: "read_file",
    description: "Read the contents of a file. Returns the file content as a string.",
    input_schema: {
      type: "object",
      properties: {
        path: {
          type: "string",
          description: "Path to the file, relative to project root"
        }
      },
      required: ["path"]
    }
  },
  {
    name: "write_file",
    description: "Write content to a file. Creates the file if it doesn't exist, overwrites if it does.",
    input_schema: {
      type: "object",
      properties: {
        path: {
          type: "string",
          description: "Path to the file, relative to project root"
        },
        content: {
          type: "string",
          description: "Content to write to the file"
        }
      },
      required: ["path", "content"]
    }
  },
  {
    name: "run_command",
    description: "Execute a shell command. Returns stdout, stderr, and exit code.",
    input_schema: {
      type: "object",
      properties: {
        command: {
          type: "string",
          description: "The command to execute"
        },
        cwd: {
          type: "string",
          description: "Working directory (optional, defaults to project root)"
        }
      },
      required: ["command"]
    }
  }
];
Tool Design Principles
Single responsibility
One tool, one action. read_file reads. write_file writes. Don't combine into read_or_write_file.
Clear descriptions
Claude uses descriptions to decide when to call tools. Be specific about what the tool does and what it returns.
Predictable outputs
Return structured data when possible. JSON over free text. Include success/failure status. Don't surprise the model.
Useful errors
When tools fail, explain why. "File not found: path/to/file.ts" is actionable. "Error" is not.
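Putting the last two principles together: here is a minimal sketch of what a `readFileSafe` helper (referenced in the tool dispatcher below, but hypothetical in its details) might look like. It confines reads to the project root and fails with an actionable message; `PROJECT_ROOT` and the exact error wording are assumptions.

```typescript
import { readFile } from "node:fs/promises";
import { resolve, relative, isAbsolute } from "node:path";

// Assumption: the agent process runs from the project root.
const PROJECT_ROOT = process.cwd();

// Resolve a path and reject anything that escapes the project root.
function resolveInsideRoot(path: string): string {
  const full = resolve(PROJECT_ROOT, path);
  const rel = relative(PROJECT_ROOT, full);
  if (rel.startsWith("..") || isAbsolute(rel)) {
    throw new Error(`Path escapes project root: ${path}`);
  }
  return full;
}

async function readFileSafe(path: string): Promise<{ content: string }> {
  const full = resolveInsideRoot(path);
  try {
    return { content: await readFile(full, "utf8") };
  } catch {
    // Actionable error the model can recover from, not a bare "Error"
    throw new Error(`File not found or unreadable: ${path}`);
  }
}
```

The structured `{ content }` return and the specific error text both feed the model something it can act on in the next turn.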
Processing tool calls
Claude returns tool calls as structured content. Your code executes them and returns results in the expected format.
async function processToolCalls(
  content: Anthropic.ContentBlock[]
): Promise<Anthropic.ToolResultBlockParam[]> {
  const results: Anthropic.ToolResultBlockParam[] = [];
  for (const block of content) {
    if (block.type !== "tool_use") continue;
    const { id, name, input } = block;
    try {
      const output = await executeToolSafe(name, input);
      results.push({
        type: "tool_result",
        tool_use_id: id,
        content: JSON.stringify(output),
      });
    } catch (error) {
      results.push({
        type: "tool_result",
        tool_use_id: id,
        content: `Error: ${error instanceof Error ? error.message : String(error)}`,
        is_error: true,
      });
    }
  }
  return results;
}

async function executeToolSafe(name: string, input: unknown) {
  // Validate and sanitize inputs before execution.
  // Tool inputs arrive untyped from the API — cast only after validation.
  const args = input as Record<string, string>;
  switch (name) {
    case "read_file":
      return await readFileSafe(args.path);
    case "write_file":
      return await writeFileSafe(args.path, args.content);
    case "run_command":
      return await runCommandSafe(args.command, args.cwd);
    default:
      throw new Error(`Unknown tool: ${name}`);
  }
}
System prompts for agents
The system prompt defines agent behavior. It sets the role, constraints, and working style. A good system prompt prevents common mistakes before they happen.
const systemPrompt = `You are a senior software engineer working on a TypeScript codebase.
## Your Role
- Implement features and fix bugs based on task descriptions
- Write clean, tested, production-ready code
- Follow existing patterns in the codebase
## Working Style
1. Read relevant files first to understand context
2. Plan your approach before writing code
3. Make incremental changes, testing as you go
4. Commit logical units of work
## Constraints
- Do not modify files outside the task scope
- Do not add new dependencies without explicit approval
- Do not delete tests unless the tested code is removed
- Ask for clarification if requirements are ambiguous
## Code Standards
- Use TypeScript strict mode
- Follow existing naming conventions
- Write tests for new functionality
- Keep functions under 50 lines
## When Stuck
If you cannot complete a task:
1. Explain what you tried
2. Explain what blocked you
3. Suggest alternative approaches
`;
Principle
Constraints in the system prompt are suggestions, not enforcement
Claude tries to follow system prompt constraints, but it's not guaranteed. Critical constraints need enforcement in code — input validation, tool restrictions, output filtering.
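As a sketch of what enforcement in code can look like: a hypothetical policy check that rejects writes to protected paths before the tool runs, regardless of what the system prompt says. The pattern list and error wording are illustrative assumptions, not part of the SDK.

```typescript
// Hypothetical enforcement layer: paths the agent may never write to.
// Checked in the write_file tool implementation, not just in the prompt.
const PROTECTED_PATTERNS: RegExp[] = [
  /^src\/auth\//,   // auth code is off-limits
  /^\.env/,         // secrets
  /^package\.json$/ // no dependency changes
];

function assertWritable(path: string): void {
  for (const pattern of PROTECTED_PATTERNS) {
    if (pattern.test(path)) {
      throw new Error(`Write rejected by policy: ${path} is protected`);
    }
  }
}
```

Calling `assertWritable` at the top of the `write_file` implementation turns a soft prompt constraint into a hard guarantee, and the thrown error doubles as feedback the agent can route around.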
Error handling and retries
Agents fail. API rate limits, tool errors, model mistakes. Robust error handling is essential for production agent systems.
async function runAgentWithRetry(
  task: string,
  maxRetries = 3,
  maxTurns = 20
) {
  let lastError: Error | null = null;
  for (let retry = 0; retry < maxRetries; retry++) {
    try {
      return await runAgentLoop(task, maxTurns);
    } catch (error) {
      lastError = error instanceof Error ? error : new Error(String(error));
      if (error instanceof Anthropic.RateLimitError) {
        // Back off exponentially and retry
        await sleep(Math.pow(2, retry) * 1000);
        continue;
      }
      if (error instanceof Anthropic.APIError && (error.status ?? 0) >= 500) {
        // Server error, retry
        await sleep(1000);
        continue;
      }
      // Non-retryable error
      throw error;
    }
  }
  throw new Error(`Failed after ${maxRetries} retries: ${lastError?.message}`);
}

async function runAgentLoop(task: string, maxTurns: number) {
  const messages: Anthropic.MessageParam[] = [
    { role: "user", content: task }
  ];
  for (let turn = 0; turn < maxTurns; turn++) {
    const response = await client.messages.create({
      model: "claude-sonnet-4-20250514",
      max_tokens: 4096,
      system: systemPrompt,
      tools: getTools(),
      messages,
    });
    messages.push({ role: "assistant", content: response.content });
    if (response.stop_reason === "end_turn") {
      return extractResult(response);
    }
    if (response.stop_reason === "tool_use") {
      const results = await processToolCalls(response.content);
      messages.push({ role: "user", content: results });
    }
  }
  throw new Error("Max turns exceeded");
}
Structured outputs
When you need specific output formats, use structured generation. Define a schema and Claude will conform to it.
// Using tool_choice to force structured output
const response = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Analyze this code for bugs..." }],
  tools: [{
    name: "report_analysis",
    description: "Report the code analysis results",
    input_schema: {
      type: "object",
      properties: {
        bugs: {
          type: "array",
          items: {
            type: "object",
            properties: {
              file: { type: "string" },
              line: { type: "number" },
              severity: { enum: ["low", "medium", "high", "critical"] },
              description: { type: "string" },
              suggestion: { type: "string" }
            },
            required: ["file", "line", "severity", "description"]
          }
        },
        summary: { type: "string" },
        overall_quality: { enum: ["good", "needs_work", "critical_issues"] }
      },
      required: ["bugs", "summary", "overall_quality"]
    }
  }],
  tool_choice: { type: "tool", name: "report_analysis" }
});

// Extract structured result
const toolUse = response.content.find(b => b.type === "tool_use");
if (!toolUse || toolUse.type !== "tool_use") {
  throw new Error("Expected a tool_use block in the response");
}
const analysis = toolUse.input; // Conforms to the schema above
Context management
As conversations grow, you'll hit context limits. Manage context proactively — summarize, truncate, or reset when needed.
Context Management Strategies
Token counting
Track token usage. When approaching limits, take action before hitting them. Use tiktoken or similar for accurate counts.
Sliding window
Keep recent messages, drop old ones. Maintains recency at cost of long-term context. Good for long-running tasks.
Summarization
Periodically summarize conversation history into a compact form. Replace old messages with summary. Preserves key decisions.
Checkpoint and reset
Save state externally, start fresh conversation with state summary. Useful for very long workflows.
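The sliding-window strategy can be sketched in a few lines. This is an assumed helper, not an SDK feature: it keeps the first message (the original task) plus the most recent N messages. Note that in a real tool-using conversation you must drop assistant `tool_use` blocks together with their matching `tool_result` messages, or the API will reject the history; this sketch ignores that pairing for brevity.

```typescript
// Minimal sliding-window sketch over a simplified message shape.
type Message = { role: "user" | "assistant"; content: unknown };

function slideWindow(messages: Message[], keepRecent: number): Message[] {
  // Nothing to drop yet
  if (messages.length <= keepRecent + 1) return messages;
  // Keep the original task plus the most recent keepRecent messages
  return [messages[0], ...messages.slice(-keepRecent)];
}
```

Run this before each `messages.create` call once token usage (reported in `response.usage`) approaches your budget.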
Integration with CI/CD
Agents that run in CI need to fit the pipeline model: triggered by events, run to completion, produce artifacts, exit with status codes.
# .github/workflows/agent-task.yml
name: Agent Task

on:
  issues:
    types: [labeled]

jobs:
  run-agent:
    if: github.event.label.name == 'agent-task'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Node
        uses: actions/setup-node@v4
        with:
          node-version: '20'
      - name: Install dependencies
        run: npm ci
      - name: Run agent
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          ISSUE_BODY: ${{ github.event.issue.body }}
        run: npm run agent:execute
      - name: Upload artifacts
        uses: actions/upload-artifact@v4
        with:
          name: agent-output
          path: artifacts/
      - name: Create PR if changes
        if: success()
        run: |
          git config user.name "Agent Bot"
          git config user.email "[email protected]"
          git checkout -b agent/issue-${{ github.event.issue.number }}
          git add -A
          git commit -m "Agent: ${{ github.event.issue.title }}"
          git push origin agent/issue-${{ github.event.issue.number }}
          gh pr create --fill
What goes wrong
Infinite loops
Agent keeps calling tools without making progress. Token costs explode. Task never completes.
Fix: Set max turns. Detect repetitive tool calls. Add convergence detection — if output isn't changing, stop.
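Repetitive-call detection can be a small stateful check inside the agent loop. This is a hypothetical sketch: `makeLoopDetector` and the repeat threshold are illustrative names and defaults, not SDK features.

```typescript
// Hypothetical convergence check: signal a stop when the same tool call
// (name + serialized arguments) repeats several times in a row.
function makeLoopDetector(maxRepeats = 3) {
  let lastCall = "";
  let repeats = 0;
  return (name: string, input: unknown): boolean => {
    const key = name + JSON.stringify(input);
    repeats = key === lastCall ? repeats + 1 : 1;
    lastCall = key;
    return repeats >= maxRepeats; // true => abort the agent loop
  };
}
```

Call the returned function for each `tool_use` block; when it returns true, break out of the loop and surface a "no progress" error instead of burning tokens.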
Tool misuse
Agent calls wrong tool or passes invalid arguments. Tool fails or does unexpected things.
Fix: Better tool descriptions. Input validation in tool implementation. Return helpful errors so agent can self-correct.
Context window exhaustion
Long task fills context. Model starts forgetting early instructions or files it read. Output quality degrades.
Fix: Monitor token usage. Implement context management. For long tasks, checkpoint and restart with summarized state.
Ignoring constraints
System prompt says don't modify auth code. Agent modifies auth code anyway. Constraint was suggestion, not enforcement.
Fix: Enforce constraints in tool code, not just prompts. Validate tool inputs against allowlists. Reject operations that violate rules.
Summary
- The SDK provides the agentic loop — messages, tools, stop conditions. You provide the logic.
- Design tools with single responsibility, clear descriptions, and useful error messages
- System prompts guide behavior but don't enforce it — enforce critical constraints in code
- Handle errors gracefully: retries for transient failures, useful messages for permanent ones
- Manage context proactively — summarize, truncate, or checkpoint before hitting limits