Spec-driven development (SDD) is a software engineering methodology in which structured specification documents serve as the primary source of truth for a project, with code treated as a generated or verified output of those specifications rather than the artifact that defines intent. The approach gained wide attention from mid-2025 onward as AI coding agents became capable of executing multi-step development tasks autonomously, making the quality and precision of instructions a critical bottleneck. Tools such as Amazon's Kiro IDE, GitHub Spec Kit, and JetBrains Junie turned the methodology from an academic idea into a practical workflow used by tens of thousands of software teams.
The central claim of spec-driven development is that AI agents perform better when they operate against explicit, structured intent rather than against informal prompts. A specification in this context is not a traditional product requirements document written to inform human readers. It is a machine-readable, version-controlled artifact that contains user stories, acceptance criteria, architectural constraints, task breakdowns, and verification criteria, all written before any code generation begins.
The idea of writing a precise description of a system before building it predates modern software engineering. Daniel McCracken's 1957 book "Digital Computer Programming" encouraged engineers to reason about program behavior before writing code. NASA engineers used test-first techniques on the Mercury project in the 1960s. Formal methods as a discipline emerged in the 1970s, with Edsger Dijkstra's work on program correctness and Tony Hoare's Communicating Sequential Processes providing mathematical frameworks for specifying program behavior. By the 1980s and 1990s, languages such as Z notation, VDM, and Alloy allowed engineers to write machine-checkable specifications for safety-critical systems.
These techniques were powerful but expensive to apply. Writing a formal specification for a commercial software system required specialized training and could take longer than writing the code itself. Outside of aerospace, defense, and medical devices, formal methods saw limited adoption.
A more pragmatic variant appeared in 2004, when Jonathan Ostroff, David Makalsky, and Richard Paige introduced Agile Specification-Driven Development, blending test-driven development (TDD) with design by contract (DbC). Their proposal treated tests and contracts as "different types of specifications that are useful and complementary," allowing teams to capture intent in a lighter-weight form without full formal notation.
Behavior-driven development (BDD), developed by Dan North around 2003 and popularized through frameworks such as Cucumber, took a similar approach. BDD used structured natural language (Given-When-Then scenarios) to express requirements in a form that both humans and automated test harnesses could read. This brought specification closer to the development loop without demanding formal proofs.
When large language models capable of generating substantial code appeared around 2022 and 2023, a new practice emerged that critics later called vibe coding. The term, attributed to AI researcher Andrej Karpathy in early 2025, describes the pattern of giving an AI agent a rough natural language description of desired behavior and iterating through the output until something acceptable appears. Vibe coding proved effective for prototyping and for developers exploring unfamiliar libraries, but it exhibited consistent failure modes when applied to production systems: non-deterministic output (the same prompt producing different architectures on different runs), circular debugging, scope creep, and loss of context when a session ended.
By 2025, the accumulation of these problems produced a reaction from the software engineering community. Thoughtworks listed spec-driven development in its 2025 Technology Radar as a practice worth adopting. DeepLearning.AI launched a short course on spec-driven development with coding agents, taught by JetBrains developer advocate Paul Everitt. Amazon released the Kiro IDE in July 2025 explicitly as an answer to vibe coding. The term entered wide circulation in developer media across Medium, Dev.to, InfoWorld, and InfoQ.
The shift toward agentic coding tools amplified the problem that spec-driven development addresses. Earlier tools such as GitHub Copilot worked as autocomplete engines, suggesting single lines or functions. Agents introduced in 2024 and 2025, including Claude Code and the agent mode in Cursor, could read an entire codebase, write files across multiple directories, run tests, and iterate. This longer autonomous run required much more precise upfront instruction. A vague prompt that produced acceptable output when a human reviewed each suggestion became a source of compounding errors when an agent executed hundreds of operations unattended.
Spec-driven development emerged as the discipline layer for agentic coding: a way to make intent explicit enough that agents could operate over long task horizons without diverging from what the developer actually wanted.
Spec-driven development separates the planning phase from the implementation phase. Before any code is generated, a developer or team produces a set of specification documents describing what the system should do, how it should be architected, and what constraints apply. These documents are version-controlled alongside code. When implementation begins, the AI agent reads the specification and generates code against it, treating the spec as the authoritative description of intent.
Augment Code defines a complete specification as containing six elements:
| Element | Description |
|---|---|
| Outcomes | Concrete deliverables, such as "user can sign up, receive verification email, and log in" |
| Scope boundaries | What is explicitly in scope and what is out of scope |
| Constraints and assumptions | Technology stack decisions, API rate limits, performance requirements |
| Prior decisions | Already-selected databases, encryption libraries, and architectural choices |
| Task breakdown | Discrete sub-tasks that allow parallel or sequential agent execution |
| Verification criteria | Acceptance tests, edge case handling, and conditions for considering a task complete |
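A minimal specification containing these six elements might look like the following sketch. The feature, stack choices, and criteria are illustrative, not drawn from any particular tool's template:

```markdown
# Spec: User sign-up

## Outcomes
- User can sign up, receive a verification email, and log in

## Scope boundaries
- In scope: email/password registration
- Out of scope: social login, password reset

## Constraints and assumptions
- Stack: Node.js + PostgreSQL; email provider rate limit of 100 messages/min

## Prior decisions
- Passwords hashed with bcrypt (already selected)

## Task breakdown
1. Create users table and migration
2. Implement the /signup endpoint
3. Send verification email on signup

## Verification criteria
- Signing up with a valid email returns 201 and sends exactly one email
- Signing up with an already-registered email returns 409
```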
The methodology inverts the traditional relationship between specifications and code. In conventional software development, a requirements document guides human developers who make their own interpretations and architectural decisions. In spec-driven development, the specification is the source of truth from which code is derived. When there is a conflict between the running code and the spec, the spec wins.
This inversion matters because AI agents fill in ambiguity differently than humans do. A human developer reading an incomplete requirements document will usually ask a clarifying question or make a reasonable assumption based on domain knowledge. An AI agent will make a statistically likely completion that may be wrong and that may be repeated across dozens of generated functions before the error is caught. Precise specifications reduce the surface area for such errors.
Not every project requires the same degree of formality. A 2026 academic paper by Mia Hoffmann published on arXiv (arXiv:2602.00180) identified three levels of specification rigor that teams apply in practice:
| Level | Characteristics | Best suited for |
|---|---|---|
| Spec-first | Spec guides and constrains output but does not automatically generate it | Teams beginning SDD adoption, exploratory work |
| Spec-anchored | Spec governs with checkpoints; constitutional constraints enforced at CI/CD boundaries | Enterprise teams requiring audit trails and compliance documentation |
| Spec-as-source | Specifications become the literal source from which code is generated or validated | API-first domains with mature tooling and stable interfaces |
The most widely adopted version of spec-driven development organizes work into three sequential phases: requirements, design, and tasks. This structure is implemented directly in Kiro and reflected in GitHub Spec Kit's command sequence.
In the requirements phase, a developer provides a natural language description of what they want to build. The AI tool transforms this description into a structured requirements document, typically stored as a markdown file named requirements.md. The document contains user stories written in a standardized format, with each story linked to acceptance criteria.
Kiro uses EARS notation to write these acceptance criteria. GitHub Spec Kit's /specify and /clarify commands serve a similar purpose: they take a high-level feature description and produce explicit requirements that the developer can review, edit, and approve before any architecture work begins.
The requirements document becomes a stable reference that persists across agent sessions. When a developer returns to a project after a break, or when a new team member joins, the requirements document explains what the system is supposed to do without requiring them to read the full codebase or the chat history.
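A requirements.md of this shape might contain entries like the following. The feature and criteria are illustrative, not Kiro's exact template, and the acceptance criteria use the EARS keyword patterns described later in this article:

```markdown
# Requirements: Account registration

## User story 1
As a visitor, I want to create an account so that I can access the service.

### Acceptance criteria
- When a user submits the registration form, the system shall create an
  account and send a verification email within 30 seconds.
- If the email address is already registered, then the system shall display
  an error message without revealing whether the account exists.
```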
With requirements approved, the AI tool analyzes the existing codebase (or, for new projects, the stated technology constraints) and produces a technical design document, typically stored as design.md, covering the planned architecture, component interfaces, and data models.
The design phase is where architectural decisions are locked in. By making these decisions explicit and versioned, spec-driven development prevents the common pattern where an AI agent makes an architectural choice on the first run that contradicts a choice made on the second run, producing an incoherent codebase.
In GitHub Spec Kit, the /plan command handles design, translating the spec into concrete architecture, components, and dependencies.
With design approved, the tool generates a task list, typically stored as tasks.md. Each task is a discrete, bounded unit of work: small enough to be executed in a single agent session, specific enough to have clear completion criteria, and linked back to one or more requirements in the requirements document.
Task lists serve two purposes. They make it possible for an agent to work through a complex feature incrementally without losing context between sessions, since each task is self-contained. They also give human reviewers a granular checkpoint for reviewing agent output: rather than reviewing an entire feature at once, a developer can approve each task before the next one begins.
Kiro generates tasks with dependency ordering, ensuring that a task that depends on a database schema is not executed before the schema task is complete. GitHub Spec Kit's /tasks command similarly decomposes work into manageable units, and the /analyze command can validate that the task list is consistent with both the spec and the plan before implementation starts.
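Dependency ordering of this kind amounts to a topological sort over the task list. The following sketch shows the idea using Python's standard-library `graphlib`; the task names and dictionary structure are illustrative, not Kiro's internal representation:

```python
from graphlib import TopologicalSorter

# Hypothetical task list: each task maps to the tasks it depends on,
# mirroring how a tasks.md might link work items to prerequisites.
tasks = {
    "create-user-schema": [],
    "implement-signup-endpoint": ["create-user-schema"],
    "send-verification-email": ["implement-signup-endpoint"],
    "write-integration-tests": ["implement-signup-endpoint",
                                "send-verification-email"],
}

def execution_order(task_graph):
    """Return an order in which no task runs before its dependencies."""
    return list(TopologicalSorter(task_graph).static_order())

order = execution_order(tasks)
print(order)
# → ['create-user-schema', 'implement-signup-endpoint',
#    'send-verification-email', 'write-integration-tests']
```

An agent coordinator can walk this order sequentially, or dispatch tasks in parallel whenever several become ready at once.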
EARS (Easy Approach to Requirements Syntax) is a structured notation for writing natural language requirements developed by Alistair Mavin at Rolls-Royce. It was originally designed for safety-critical systems in aerospace and automotive engineering, where ambiguous requirements had serious consequences. Kiro adopted EARS as the format for its requirements phase, which brought the notation to a mainstream software engineering audience.
EARS defines five requirement patterns, each using a specific keyword structure:
| Pattern | Keyword | Example |
|---|---|---|
| Ubiquitous | (none) | "The system shall encrypt all stored passwords using bcrypt." |
| State-driven | While | "While a user is logged in, the system shall display a session timeout warning after 25 minutes of inactivity." |
| Event-driven | When | "When a user submits the registration form, the system shall send a verification email within 30 seconds." |
| Optional feature | Where | "Where the user has enabled two-factor authentication, the system shall require a one-time code at login." |
| Unwanted behavior | If-then | "If the payment gateway returns an error, then the system shall display a human-readable message and preserve the cart." |
By forcing requirements into one of these patterns, EARS eliminates common sources of ambiguity in natural language requirements: passive constructions that omit who performs an action, conditions that are implied rather than stated, and triggers that are vague about timing. The patterns make requirements easier for AI agents to decompose into implementation tasks because each requirement explicitly states the precondition, the trigger, and the expected system response.
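The keyword structure also makes EARS requirements easy to process mechanically. The sketch below (illustrative, not part of any EARS tooling) assigns a requirement sentence to one of the five patterns by its leading keyword:

```python
import re

# Leading-keyword rules for the five EARS patterns. "If ... then" marks
# unwanted behavior; a requirement with no keyword is ubiquitous.
EARS_PATTERNS = [
    (re.compile(r"^while\b", re.I), "state-driven"),
    (re.compile(r"^when\b", re.I), "event-driven"),
    (re.compile(r"^where\b", re.I), "optional feature"),
    (re.compile(r"^if\b.*\bthen\b", re.I | re.S), "unwanted behavior"),
]

def classify(requirement: str) -> str:
    """Classify a requirement sentence into an EARS pattern."""
    text = requirement.strip()
    for pattern, name in EARS_PATTERNS:
        if pattern.search(text):
            return name
    return "ubiquitous"

print(classify("When a user submits the form, the system shall send an email."))
# → event-driven
print(classify("The system shall encrypt all stored passwords using bcrypt."))
# → ubiquitous
```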
The DeepLearning.AI course on spec-driven development and the Thoughtworks analysis of SDD both note that while EARS is not mandatory, using a structured notation for acceptance criteria substantially reduces the rate at which AI agents misinterpret edge cases.
Kiro is an agentic IDE released by Amazon Web Services on July 14, 2025. It is built on Code OSS (the open-source base of Visual Studio Code) and uses Amazon Bedrock for model access. Kiro was designed from the ground up to implement spec-driven development, treating the three-phase workflow (requirements, design, tasks) as the primary mode of working rather than an optional add-on.
When a developer opens a new project and describes a feature, Kiro generates a .kiro/specs/ directory containing three markdown files: requirements.md, design.md, and tasks.md. These files are editable, version-controlled, and synchronized with the evolving codebase. If the codebase drifts from the spec, Kiro can detect the discrepancy and flag it.
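Based on that description, the resulting layout would look roughly like this (exact nesting may vary by Kiro version and feature):

```
.kiro/
└── specs/
    ├── requirements.md
    ├── design.md
    └── tasks.md
```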
Kiro also includes two features that complement spec-driven development:
Agent hooks are event-driven automations triggered by file system events such as saves, creations, or deletions. A hook can automatically run tests when a component file is saved, regenerate type definitions when a schema changes, or validate code against a linting rule when a file is created. Hooks replace the ad-hoc shell scripts and GitHub Actions that teams typically assemble to enforce consistency, making automation a first-class feature of the IDE.
Steering files are markdown documents that provide persistent context to Kiro agents across sessions. A steering file can contain coding standards, preferred libraries, architectural principles, or project-specific constraints. Where requirements and design files document what a specific feature should do, steering files document the general rules that apply to all features in a project.
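A steering file might look like the following sketch; the conventions and the `httpClient` wrapper named here are hypothetical examples, not content from Kiro's documentation:

```markdown
# Steering: Engineering conventions

- All API handlers use async/await; no callback-style code.
- Prefer the project's existing `httpClient` wrapper over raw fetch calls.
- Database access goes through the repository layer only.
- Error messages shown to users must never include stack traces.
```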
Kiro Powers (launched at AWS re:Invent 2025) bundle Model Context Protocol (MCP) files, steering files, and rules into unified, shareable capability packages. A team can publish a Power that encodes their entire development context, and other developers can activate it with a single command.
Kiro launched with a free tier offering 50 interactions per month, with paid tiers at $20/month (Pro), $40/month (Pro+), and $200/month (Power). New accounts receive 500 bonus credits valid for 30 days regardless of tier. AWS reported 1,500 engineers actively using Kiro within the first months of availability.
GitHub Spec Kit is an open-source toolkit released by GitHub that provides a CLI and template system for spec-driven development. Unlike Kiro, which is a full IDE, Spec Kit is a lightweight layer that works on top of existing AI coding tools including GitHub Copilot, Claude Code, Gemini CLI, and others. This tool-agnostic design lets teams adopt the methodology without switching their primary development environment.
Spec Kit has two main components:
Specify CLI is a command-line tool installed via uv (the Python package manager) with the command uv tool install specify-cli. Once installed, it provides seven slash commands that guide a developer through the spec-driven workflow:
| Command | Purpose |
|---|---|
| /constitution | Creates and populates a constitution.md file with non-negotiable project principles |
| /specify | Generates a spec.md describing features, pages, and user flows |
| /clarify | Refines and resolves ambiguities in the spec |
| /plan | Translates the spec into concrete architecture, components, and dependencies |
| /tasks | Breaks the plan into manageable, testable work units |
| /analyze | Validates consistency across the spec, plan, and task list |
| /implement | Passes the finalized spec and tasks to the AI agent for implementation |
Templates are pre-built markdown files for constitution.md, spec.md, and associated planning documents. The templates establish a consistent structure across projects and teams, reducing the cognitive overhead of starting a new spec from scratch.
The constitution.md file is a distinctive feature of GitHub Spec Kit. It captures the non-negotiable technical decisions for a project: the frontend framework, the database choice, the styling library, the API client. These are the decisions that, if violated by an AI agent, would require rework of large parts of the codebase. By stating them in a dedicated file that every agent session reads, Spec Kit prevents agents from re-litigating these choices on each run.
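A constitution.md capturing such decisions might read as follows; the specific technology choices are illustrative, not taken from Spec Kit's shipped template:

```markdown
# Constitution

## Non-negotiable decisions
- Frontend: React with TypeScript
- Database: PostgreSQL
- Styling: Tailwind CSS
- API client: generated from the OpenAPI spec; never hand-written
```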
GitHub Spec Kit is compatible with greenfield projects (starting from scratch), creative exploration (testing parallel implementations across different technology stacks), and brownfield development (adding features to existing systems or modernizing legacy code).
JetBrains Junie, the agentic coding tool in JetBrains IDEs, supports a spec-driven workflow through a project constitution and feature specification approach. JetBrains partnered with DeepLearning.AI in late 2025 to produce the "Spec-Driven Development with Coding Agents" course taught by Paul Everitt. The course uses a plan-implement-validate loop as its core workflow and demonstrates spec-driven development within IntelliJ IDEA.
Claude Code, Anthropic's CLI-based coding agent, supports spec-driven development through CLAUDE.md files (project-level instruction files) and through its ability to read and execute against specification markdown files. Because Claude Code does not provide its own templating or workflow scaffolding, developers using it for spec-driven development typically adopt GitHub Spec Kit templates or write their own specification structure. Claude Code is one of the 28 AI platforms supported by GitHub Spec Kit.
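A minimal CLAUDE.md wired to a hand-rolled specification structure might look like this sketch; the file paths and test command are illustrative assumptions, not a standard layout:

```markdown
# Project instructions

- Read specs/requirements.md and specs/design.md before making any change.
- Work through specs/tasks.md one task at a time; stop after each for review.
- Run the test suite after each task and fix failures before continuing.
- Do not modify files under specs/ without explicit approval.
```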
Cursor supports spec-driven workflows through its Rules for AI feature, which allows teams to write persistent instructions that apply to all agent sessions in a project. Developers using Cursor for spec-driven development typically store their requirements, design, and task documents in the project repository and reference them in their Rules configuration. GitHub Spec Kit's tool-agnostic design means its templates work with Cursor's agent mode.
Vibe coding and spec-driven development both use AI agents to generate code, but they differ in where they locate control and documentation.
| Dimension | Vibe coding | Spec-driven development |
|---|---|---|
| Planning phase | None; requirements emerge from iteration | Explicit; requirements, design, and tasks are written before coding |
| Source of truth | Chat history and the running code | Version-controlled specification documents |
| Determinism | Low; same prompt can produce different architectures on different runs | Higher; agent operates against a fixed specification |
| Context persistence | Lost when session ends | Preserved in specification files across sessions |
| Team collaboration | Difficult; intent is in the developer's head | Accessible; specifications are readable by all team members |
| Reviewability | Hard; no document to review against | Clearer; reviewers can check code against spec |
| Overhead | Low initially | Higher upfront; lower over the lifetime of a project |
| Best suited for | Prototypes, MVPs, proof-of-concept work | Production systems, multi-session features, cross-team projects |
| Failure modes | Non-deterministic output, circular debugging, scope creep | Spec drift, documentation overhead, rigidity when requirements change |
Many practitioners treat the two approaches as complementary rather than competing. A common pattern uses vibe coding for initial exploration and prototyping, then transitions to spec-driven development when a concept needs to be hardened for production. As one practitioner put it in the Tessl blog: "Vibe coding optimizes for the first iteration. Spec-driven development optimizes for the next hundred."
Nearform engineer Cian Clarke described the adoption arc as a maturity ladder: copilot-style line-by-line assistance, then agent mode spanning multiple files, then spec-driven development as the practice that removes the productivity ceiling of agent mode by providing the structure agents need to operate reliably over long task horizons.
Spec-driven development superficially resembles waterfall software development in that it involves writing documents before writing code. The resemblance is misleading in two ways.
First, waterfall specifications were written for human readers and human developers. They described requirements in prose, often with deliberate ambiguity left for developers to interpret. SDD specifications are written to be consumed by AI agents that have no tolerance for ambiguity and no domain knowledge to fill gaps. The precision required is closer to formal methods than to a traditional PRD.
Second, waterfall specifications were written once and were expensive to revise, creating long feedback cycles. SDD specifications are living documents, edited alongside code, and regenerated as requirements change. The Thoughtworks analysis of SDD explicitly distinguishes the methodology from waterfall on these grounds: SDD enables shorter, more effective feedback loops precisely because specifications are cheap to revise, whereas waterfall's long cycles were a consequence of heavyweight change control processes, not of specification-first thinking per se.
Reduced agent errors. When agents operate against explicit specifications, they make fewer architectural decisions that contradict each other across sessions. Security vulnerabilities introduced by ambiguous prompts are reduced because the specification encodes constraints that would otherwise be left to the agent's statistical completion.
Persistent project context. Specification files preserve intent across sessions, eliminating the problem of re-explaining a project to an agent that has no memory of previous conversations. This is especially valuable for long-running projects and for teams where multiple developers work with the same agent.
Team alignment. Specification documents give non-engineering stakeholders (product managers, designers, QA engineers) a readable artifact they can review and contribute to. Requirements written in EARS notation are precise enough to derive test cases from directly.
Reduced onboarding time. New team members can read the specification files to understand what the system is supposed to do without reverse-engineering the codebase. Enterprise teams adopting SDD have reported reducing onboarding from months to days in some cases.
Audit trails. For regulated industries, version-controlled specification documents provide a record of intent that compliance teams can review. The spec-anchored pattern explicitly encodes this use case, with specifications serving as contracts that all parties can reference during audits.
Faster cycle times. Teams at companies using spec-driven development have reported API change cycle time reductions of up to 75 percent because incompatibilities are caught at the specification review stage rather than in production. Some enterprise teams have reported compressing six-month feature cycles to six weeks.
Parallel agent execution. Task-level specifications enable safe concurrent agent work. A coordinator agent can dispatch multiple implementor agents to work on independent tasks simultaneously, with a verifier agent checking each output against the specification.
Upfront overhead. Writing a specification before writing code requires time and discipline. For small tasks, one-off scripts, or exploratory work, the overhead of producing a requirements document, a design document, and a task list can easily exceed the time saved. Most practitioners recommend skipping the full spec workflow for changes that can be reviewed in under 15 minutes.
Specification drift. Over time, code and specifications diverge. Updating the code is operationally easier than updating the spec first, and developers under time pressure routinely skip the specification update. The Thoughtworks analysis identifies spec drift as the primary failure mode of SDD in practice. Tools like Kiro attempt to mitigate this by detecting when the codebase diverges from the spec, but no tool eliminates the drift problem entirely.
Rigidity when requirements change. If a project's requirements change significantly mid-implementation, a detailed specification can become an obstacle rather than a guide, requiring substantial rework of all three document layers before implementation can resume.
Exploratory work is harder. Spec-driven development works poorly for research spikes, technical experiments, and work where the requirements genuinely cannot be known in advance. Forcing a specification onto inherently exploratory work produces documents that are either too vague to be useful or are revised so frequently that they create overhead without providing stability.
Tooling immaturity. The current generation of SDD tools is maturing but has rough edges. The arXiv paper by Hoffmann notes that most tools keep specifications co-located with code in a single repository, which works for monolithic projects but breaks down across microservice architectures where a feature spans multiple repositories. Richer cross-repository specification tooling was emerging in 2025 and 2026 but had not yet reached maturity.
Learning curve. A 2025-2026 analysis of enterprise SDD adoption found that 67 percent of teams experienced a productivity dip during the initial adoption period due to the learning curve of writing effective specifications. The typical ROI timeline was three to six months before teams saw measurable productivity gains over their previous workflows.
By mid-2025, spec-driven development had moved from an academic concept to a named, tooled practice. The AWS re:Invent 2025 session on Kiro (DVT209) attracted substantial attention and was summarized widely in developer media. DeepLearning.AI's short course introduced the methodology to thousands of developers through structured instruction. GitHub Spec Kit, which the Augment Code guide described as having approximately 88,000 GitHub stars by early 2026, became a reference implementation for teams that wanted methodology without full tool lock-in.
Enterprise adoption followed a recognizable pattern. Teams at large organizations tended to start with spec-anchored workflows, treating specification documents as audit artifacts for compliance while allowing agents to work with some flexibility. Smaller teams with fewer compliance constraints moved faster toward spec-as-source patterns, where the specification directly generated the code.
The DORA (DevOps Research and Assessment) 2025 report noted 90 percent developer adoption of AI coding tools, with over 80 percent reporting measurable benefits, and identified structured, spec-anchored AI development as the workflow associated with the highest self-reported productivity gains. The DX (Developer Experience) research firm measured 3.6 hours per week saved for developers using structured AI workflows compared to unstructured prompting.
Adoption was concentrated in particular domains. API development, enterprise systems, and embedded software saw the earliest serious SDD adoption because these domains had the longest histories of formal specification work and the clearest existing standards for what a specification should contain. Consumer web applications and mobile apps saw later adoption, in part because their requirements change more rapidly and in part because the tooling was initially more oriented toward backend systems.
By early 2026, job postings at companies with mature AI development practices began listing specification writing and spec review as expected skills, alongside traditional requirements like code review and architectural design experience. Analysts at InfoQ and Thoughtworks projected that SDD would become the default methodology at companies committed to AI-native development by the end of 2026.