Guide 14 of 25
Writing Specs & Prompts
Core Questions
- How do you write specs agents can execute against?
- When is a prompt enough vs. a structured charter?
- How do you version and review specs before code?
The quality of agent output is bounded by the quality of your input. Vague prompts produce vague results. Precise specs produce precise results. This guide is about writing specs that agents can actually execute against — not just understand, but verify their own work against.
Prompts vs. specs vs. charters
These terms get used interchangeably, but they're different tools for different jobs:
Prompt
A natural language instruction. Ephemeral — typed into a chat, used once. Good for exploration, quick tasks, one-off questions. Not versioned, not reviewed.
"Fix the null pointer exception in the user service"
Spec
A structured description of what to build. Includes requirements, constraints, acceptance criteria. Can be a doc, an issue, a markdown file. Reviewed before implementation starts.
"Implement rate limiting for the /api/users endpoint. Max 100 requests per minute per API key. Return 429 with Retry-After header when exceeded."
Charter
A comprehensive spec that lives in the repo. Includes context, constraints, success criteria, and often verification steps. Versioned with git. The contract between humans and agents for a piece of work.
A markdown file with front matter, background, requirements, non-goals, and acceptance tests.
The progression: prompts → specs → charters as the stakes increase. A quick fix? A prompt is fine. A feature that will ship to customers? Write a spec. A major system that an agent will build autonomously over days? Charter it.
When is a prompt enough?
Prompts work when the task is small, the risk is low, and the human is in the loop. You're watching the agent work. You can course-correct in real time. The feedback loop is tight.
Prompts work well for:
- ✓ Quick bug fixes where you know the issue
- ✓ Refactoring with clear scope
- ✓ Adding tests for existing code
- ✓ Documentation updates
- ✓ Exploratory coding / prototyping
- ✓ Code review assistance
- ✓ Debugging with human oversight
- ✓ One-off scripts and tools
The test: if the agent produces garbage, how bad is it? If you'll catch it immediately and it's easy to redo, a prompt is fine. If it'll ship to production before anyone notices, write a spec.
Anatomy of a good spec
A spec that agents can execute against has specific components. Not all are required for every task, but the more complex the work, the more you need.
Spec Components
Context
Why does this work need to happen? What problem are we solving? What's the user impact? Agents perform better when they understand the "why" — it helps them make judgment calls.
Requirements
What must be true when this is done? Be specific. "Fast" is not a requirement. "p99 latency under 200ms" is a requirement.
Non-goals
What are we explicitly NOT doing? This prevents scope creep. Agents will often try to be helpful by doing more — non-goals tell them to stop.
Constraints
Technical boundaries. "Must use the existing database schema." "Cannot add new dependencies." "Must be backwards compatible with v1 API."
Acceptance criteria
How do we know it's done? Concrete, testable statements. The agent should be able to verify these itself before declaring completion.
Verification steps
How to test the implementation. Commands to run, endpoints to hit, scenarios to check. The more explicit, the better.
Example: Rate limiting spec
## Context
Users are hitting our API with high frequency, causing performance issues
for other users. We need to add rate limiting to protect the service.
## Requirements
- Implement rate limiting on all /api/* endpoints
- Limit: 100 requests per minute per API key
- Return HTTP 429 when limit exceeded
- Include Retry-After header with seconds until reset
- Rate limit state must survive server restarts (use Redis)
## Non-goals
- Per-endpoint rate limits (all endpoints share the same limit)
- Rate limiting by IP (only by API key)
- Burst allowance / token bucket (simple fixed window is fine)
## Constraints
- Must use existing Redis instance at REDIS_URL
- Cannot modify the API key validation logic
- Must be backwards compatible (existing clients shouldn't break)
## Acceptance criteria
- [ ] Requests within limit return normally
- [ ] Request 101 in a minute window returns 429
- [ ] 429 response includes Retry-After header
- [ ] Rate limit persists across server restart
- [ ] Requests with invalid API key return 401, not 429
## Verification
```bash
# Test normal operation
for i in {1..100}; do curl -H "X-API-Key: test" http://localhost:3000/api/users; done
# Should all return 200
# Test rate limit
curl -H "X-API-Key: test" http://localhost:3000/api/users
# Should return 429 with Retry-After header
# Test persistence
docker restart api-server
curl -H "X-API-Key: test" http://localhost:3000/api/users
# Should still return 429 (rate limit persisted in Redis)
```
Charter files
For larger efforts, specs live in the repo as charter files. These are versioned, reviewed, and become the authoritative source for what the agent should build.
Charter file structure
```markdown
---
# charters/rate-limiting.md
title: API Rate Limiting
status: in-progress   # draft | approved | in-progress | complete
owner: [email protected]
agent: claude
created: 2025-01-15
updated: 2025-01-18
issue: https://github.com/org/repo/issues/42
---

# API Rate Limiting

## Context
[Why we're doing this...]

## Requirements
[What must be true...]

## Non-goals
[What we're NOT doing...]

## Technical approach
[How we expect this to be built — optional but helpful for complex work.
The agent can deviate if it has good reason.]

## Acceptance criteria
[Testable statements...]

## Verification
[Commands and tests...]

## Progress
- [x] Initial implementation
- [x] Unit tests
- [ ] Integration tests
- [ ] Documentation
- [ ] Performance testing
```
The front matter (YAML between the --- lines) makes charters machine-parseable. You can build tooling to:
- List all in-progress charters
- Link charters to issues and PRs
- Track charter status across repos
- Enforce that PRs reference a charter
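A minimal sketch of that kind of tooling, assuming charters live in charters/*.md, use the front-matter field names from the example above, and that PyYAML is installed:

```python
#!/usr/bin/env python3
"""Minimal sketch: list in-progress charters by parsing their front matter.

Assumes charters live in charters/*.md with the field names from the example
above, and that PyYAML is installed (pip install pyyaml).
"""
from pathlib import Path

import yaml  # PyYAML


def front_matter(path: Path) -> dict:
    """Return the YAML block between the leading '---' markers, or {}."""
    text = path.read_text()
    if not text.startswith("---"):
        return {}
    parts = text.split("---", 2)  # "", front matter, rest of the file
    if len(parts) < 3:
        return {}
    return yaml.safe_load(parts[1]) or {}


for charter in sorted(Path("charters").glob("*.md")):
    meta = front_matter(charter)
    if meta.get("status") == "in-progress":
        print(f"{charter}: {meta.get('title')} (owner: {meta.get('owner')})")
```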
Charter Workflow
Draft
Human writes the charter. Gets review from stakeholders. May iterate multiple times before approval.
Approve
Charter is reviewed and approved (via PR). This is the go-ahead for implementation. Status changes to "approved."
Implement
Agent (or human) implements against the charter. PRs reference the charter file. Status is "in-progress."
Verify
All acceptance criteria are checked. Verification steps pass. Status changes to "complete." Charter stays in repo as documentation.
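The workflow can also be backed by automation, for example a CI gate that refuses pull requests that don't reference a charter. A hedged sketch, assuming GitHub Actions (where GITHUB_EVENT_PATH points at the webhook payload for the triggering event); other CI systems need a different lookup:

```python
#!/usr/bin/env python3
"""Hypothetical CI gate: fail if a pull request doesn't reference a charter.

Assumes GitHub Actions, where GITHUB_EVENT_PATH is the path to the webhook
payload JSON for the event that triggered the run.
"""
import json
import os
import re
import sys

event_path = os.environ.get("GITHUB_EVENT_PATH")
if not event_path:
    sys.exit("GITHUB_EVENT_PATH is not set; run this inside CI.")

with open(event_path) as f:
    event = json.load(f)

# The PR description is where we expect a link like charters/rate-limiting.md.
body = (event.get("pull_request") or {}).get("body") or ""

if re.search(r"charters/[\w.-]+\.md", body):
    print("Charter reference found.")
else:
    sys.exit("PR description must reference a charter file (charters/<name>.md).")
```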
Hard vs. soft constraints
Not all requirements are equal. Some are absolute — violate them and the work is wrong. Others are preferences — follow them unless there's good reason not to. Make this distinction explicit.
Hard constraints (MUST)
- Must not break existing tests
- Must maintain backwards compatibility
- Must use existing auth system
- Must not add new dependencies
- Must handle the specified error cases
Violations are blockers. The work is incomplete until fixed.
Soft constraints (SHOULD)
- Should follow existing code style
- Should use helper functions from utils/
- Should add logging for debugging
- Should include inline comments
- Should optimize for readability
Deviations are acceptable with justification.
In your specs, use explicit language: MUST, MUST NOT, SHOULD, SHOULD NOT, MAY. This is borrowed from RFC 2119 and agents understand it well.
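For example, a constraints section written with these keywords might read like this (the specific items are illustrative, not taken from the spec above):

```markdown
## Constraints
- The limiter MUST return HTTP 429 when the per-key limit is exceeded.
- The implementation MUST NOT add new runtime dependencies.
- Error responses SHOULD reuse the existing error-formatting helpers.
- The limiter MAY log rejected requests to aid debugging.
```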
Writing for agents vs. writing for humans
Agents and humans read differently. Specs that work for one may not work for the other. Some adjustments help:
Agent-Friendly Spec Patterns
Be explicit about file paths
Humans infer from context. Agents work better with explicit paths.
Include example inputs and outputs
Show what you want, don't just describe it.
Specify the testing approach
Agents don't always know your testing conventions.
List what NOT to change
Agents can be overly helpful. Explicit exclusions help.
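Put together, a spec fragment that applies these patterns might look like the following; the file paths, commands, and values here are illustrative, not prescriptive:

```markdown
## Scope
- Add the limiter as middleware in src/middleware/rate-limit.ts
- Register it in src/app.ts after the auth middleware
- Do NOT modify src/middleware/auth.ts or the API key validation logic

## Example behavior
Request:  GET /api/users with header "X-API-Key: test" (101st request this minute)
Response: 429 with body {"error": "rate_limited"} and header "Retry-After: 37"

## Testing
- Add unit tests in tests/middleware/rate-limit.test.ts using the existing test setup
- Run with: npm test
```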
Reproduce first, then implement
One of the most common agent failure modes: the agent "fixes" a bug it never actually reproduced. It reads the bug report, guesses what's wrong, writes a fix, and declares victory. But the fix doesn't work — because the agent was never looking at the real problem in the first place.
The rule: don't implement until you've reproduced the issue. This applies to humans too, but it's critical for agents because they're more prone to hallucinating their way through ambiguity.
Reproduce → Implement → Demo
This three-step pattern — reproduce, implement, demo — dramatically reduces agent hallucination. The agent is grounded in observable behavior at both ends of the implementation.
Include these steps in your specs:
```markdown
## Reproduction steps
1. Start the dev server: npm run dev
2. Navigate to /users/123
3. Click "Edit Profile"
4. Clear the email field and click Save
5. EXPECTED: Validation error appears
6. ACTUAL: Page crashes with null pointer exception

## Demo steps (after fix)
1. Run the same reproduction steps above
2. EXPECTED: Validation error "Email is required" appears
3. The form does not submit; no crash occurs

## Verification
- [ ] Agent confirmed reproduction before implementing
- [ ] Agent demonstrated fix with demo steps
- [ ] Unit test added for empty email validation
```
For more depth on these patterns, see Guide 21: Reproduce Steps and Guide 22: Expect & Demo Steps.
Iterating on specs
Specs aren't static. As the agent works (or as you review its output), you'll find gaps, ambiguities, and things you forgot. Update the spec.
This is especially important for charters. The charter should evolve to reflect learnings. When the work is complete, the charter becomes documentation of what was actually built — not just what was originally planned.
Spec iteration patterns
What goes wrong
Spec is too vague
"Make it better" is not a spec. The agent guesses what "better" means. It guesses wrong. You waste cycles on rework. Be specific about what"better" looks like.
Spec is too detailed
You prescribe every line of code. The agent follows robotically. It doesn't use judgment when it should. The result is technically correct but misses the point. Leave room for agent intelligence.
Acceptance criteria can't be tested
"Should be fast" can't be verified. "Should feel intuitive" can't be verified. The agent declares victory; you disagree. Write criteria the agent can actually check.
Spec diverges from reality
The spec says one thing; the code does another. Neither is updated. Six months later, nobody knows which is right. Keep them in sync or delete the spec.
Summary
- Use prompts for quick tasks, specs for features, charters for major work.
- Good specs include context, requirements, non-goals, constraints, and verifiable acceptance criteria.
- Distinguish hard constraints (MUST) from soft constraints (SHOULD).
- Write for agents: explicit paths, example I/O, clear scope boundaries.