Vibe engineering
Last reviewed: May 24, 2026
Sources: No citations yet
Review status: Needs citations
Revision: v1 · 4,310 words
Vibe engineering is a term for the disciplined practice of building production software with the help of large language model coding agents, in contrast to the looser vibe coding style of accepting AI-generated code without close review. The phrase was popularised by British software engineer Simon Willison in an essay published on his personal site on October 7, 2025, where he argued that experienced engineers who pair LLMs with rigorous testing, planning, documentation, version control, and code review can produce maintainable systems while remaining accountable for the result.[1] Willison framed vibe engineering as the opposite end of a spectrum from vibe coding, the term Andrej Karpathy coined in February 2025 for prompt-driven prototyping in which the developer will "fully give in to the vibes," and reserved the latter for situations in which the human never reads the code.[1][2] The essay drew 654 points and 719 comments on Hacker News and, within a few weeks of publication, was widely cited in trade press, industry analyst notes, and an ICSE 2026 technical briefing.[3][4][5] By the first half of 2026 the broader practice it described was just as often called "agentic engineering" (Karpathy's preferred label) or grouped under context engineering, reflecting an unsettled vocabulary for what most observers agreed had become a distinct mode of software work.[6][7]
The phrase "vibe coding" entered the software lexicon at 4:17 PM on February 2, 2025, when Andrej Karpathy, a co-founder of OpenAI and former director of AI at Tesla, posted on X: "There's a new kind of coding I call 'vibe coding', where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It's possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good."[2][8] The post described a workflow in which the developer dictates intent to a coding assistant, accepts diffs without reading them, and pastes any error messages back into the model until the program runs. The post accumulated more than 4.5 million views, and "vibe coding" was named Collins Dictionary's word of the year for 2025.[8][9]
The reception split quickly. Karpathy himself characterised the original message as "a shower of thoughts throwaway tweet that I just fired off."[10] Practitioners adopted it both literally (for rapid prototyping) and dismissively (for code shipped to production without verification). Willison, who runs a widely read blog on language models and is co-creator of the Django web framework, used his Mastodon account in October 2025 to argue that "vibe coding is irresponsibly building software through dice rolls, not caring what code is produced," and asked what the field should call the responsible counterpart.[11] He proposed "vibe engineering" the same day in a thread on X and then in a long-form essay on his site.[12][1]
The essay landed during a period of rapid change in coding agents. Anthropic had released Claude Code in February 2025, OpenAI's Codex CLI followed in April, and Google's Gemini CLI in June; by mid-2025 these tools could iterate on code, run tests, and self-correct rather than simply emit suggestions.[1] Surveys reported that 84 to 92 percent of developers in major markets used AI coding tools at least weekly by late 2025, and a December 2025 analysis of 470 GitHub pull requests by CodeRabbit found that AI-co-authored code contained roughly 1.7 times as many "major" issues as human-only code, including 2.74 times more security vulnerabilities.[13][14] The combination of widespread adoption and a measurable quality gap created an opening for a vocabulary that distinguished careful production work from the original vibe coding ethos.
Willison's October 7, 2025 essay opens by asking what to call "the activity where experienced software engineers use AI coding agents, like Claude Code and Codex CLI, to accelerate the process of building real software." His proposed answer, "vibe engineering," is offered with self-conscious irony: he acknowledges that the "vibes" prefix "feels a little tired" and writes "Is this a stupid name? Yeah, probably."[1] The substantive claim is that "iterating with coding agents to produce production-quality code that I'm confident I can maintain in the future feels like a different process entirely" from the throwaway prototyping that vibe coding describes.[1]
The essay defines vibe engineering as the practice where "seasoned professionals accelerate their work with LLMs while staying proudly and confidently accountable for the software they produce."[1] Two ideas anchor the definition. First, accountability: a vibe engineer signs off on every change, can explain how the system works, and will respond when it breaks. Second, amplification: "AI tools amplify existing expertise. The more skills and experience you have as a software engineer the faster and better the results you can get from working with LLMs and coding agents."[1] Together these claims position vibe engineering as the inverse of the original vibe coding spirit, in which the developer's deliberate stance is to neither read nor own the generated code.
Willison's framing built on positions he had already taken in public. In April 2025 he had told Ars Technica that "vibe coding your way to a production codebase is clearly risky" and that "if an LLM wrote every line of your code, but you've reviewed, tested, and understood it all, that's not vibe coding."[15] The October essay names the alternative.
The bulk of the essay enumerates engineering disciplines that, in Willison's reading, become more valuable when coding agents are part of the workflow rather than less. He presents twelve.[1]
The list is descriptive rather than prescriptive: Willison frames each item as a practice that "good engineers were doing anyway" but which carries an additional payoff when an agent is in the loop. The recurring theme is that the agent's output gets better the more legible the surrounding system is.[1]
The clearest way to read the term is against the practice it negates. Karpathy's February 2025 description treats reading the code as optional and embraces velocity over comprehension: the developer accepts diffs, ignores error traces beyond pasting them back, and "forgets that the code even exists."[2] Willison's vibe engineering insists on the opposite. The engineer reads every change, signs off on every commit, maintains a test suite the agent must pass, and treats the model as a peer whose output requires the same scepticism as a human pull request.[1]
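The contrast can be made concrete as a merge gate. The following is a minimal sketch, not anything taken from Willison's essay: the `ProposedChange` record, its field names, and the `merge_blockers` function are all hypothetical, and a real team would wire the same checks into its CI system and review tooling rather than a Python dataclass.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProposedChange:
    """Hypothetical record of one agent-authored change awaiting merge."""
    description: str
    tests_passed: bool        # outcome of running the project's test suite
    reviewed_by_human: bool   # an engineer has read the full diff
    approver: Optional[str]   # the engineer who signs off and stays accountable


def merge_blockers(change: ProposedChange) -> List[str]:
    """Return the reasons a change may not merge; an empty list means it may."""
    blockers = []
    if not change.tests_passed:
        blockers.append("test suite is failing")
    if not change.reviewed_by_human:
        blockers.append("diff has not been read by an engineer")
    if not change.approver:
        blockers.append("no named engineer has signed off")
    return blockers


# Vibe coding: the program runs, but nobody has read or owns the code.
vibe_coded = ProposedChange("add login form", tests_passed=True,
                            reviewed_by_human=False, approver=None)

# Vibe engineering: same change, but read, tested, and signed off.
# "alice" is a placeholder name.
vibe_engineered = ProposedChange("add login form", tests_passed=True,
                                 reviewed_by_human=True, approver="alice")

print(merge_blockers(vibe_coded))
print(merge_blockers(vibe_engineered))
```

Under these assumptions the first change is blocked for lack of review and sign-off even though its tests pass, which is the distinction the two terms draw: working output alone is not sufficient.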
A useful comparison runs across several axes. Vibe coding optimises for first-working-output; vibe engineering optimises for long-term maintainability. Vibe coding suits prototypes, throwaway scripts, and personal hobby work; vibe engineering suits systems that will be on call, audited, or relied on by other engineers. Vibe coding tolerates whatever code emerges so long as it runs; vibe engineering retains the engineer's authority over architecture, naming, and structure. Vibe coding can be done by non-programmers, which is part of why Collins highlighted it as a term that crossed into popular usage; vibe engineering presupposes substantial software engineering experience, because the engineer's expertise is the variable the model amplifies.[9][1]
Willison's own one-line summary captures the symmetry: "Vibe coding is irresponsibly building software through dice rolls, not caring what code is produced. What about when engineers at the top of their game use AI tools responsibly to accelerate their work? I propose 'vibe engineering'!"[12]
Vibe engineering overlaps with spec-driven development but is not the same. Spec-driven development, as practised in workflows associated with GitHub Spec Kit and Amazon's Kiro IDE in 2025, prescribes writing a structured specification (often a markdown document broken into requirements, architecture, and tasks) before generation, and asking the agent to produce code that implements the spec.[16] It is a method with named artifacts and a defined ordering.
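As an illustration of "named artifacts and a defined ordering," the sketch below models a specification with the three sections mentioned above and refuses to treat it as ready until each is filled in. It is an assumption-laden toy: the `Spec` class, its methods, and the markdown layout are hypothetical and do not reproduce the file formats used by Spec Kit or Kiro.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Spec:
    """Hypothetical in-memory spec; not the format of any real tool."""
    title: str
    requirements: List[str] = field(default_factory=list)
    architecture: List[str] = field(default_factory=list)
    tasks: List[str] = field(default_factory=list)

    def missing_sections(self) -> List[str]:
        """Name every section that is still empty."""
        sections = {
            "requirements": self.requirements,
            "architecture": self.architecture,
            "tasks": self.tasks,
        }
        return [name for name, body in sections.items() if not body]

    def to_markdown(self) -> str:
        """Render the spec as a markdown document to hand to the agent."""
        lines = [f"# {self.title}"]
        for heading, body in (("Requirements", self.requirements),
                              ("Architecture", self.architecture),
                              ("Tasks", self.tasks)):
            lines.append(f"## {heading}")
            lines.extend(f"- {entry}" for entry in body)
        return "\n".join(lines)


spec = Spec(
    title="Rate limiter",
    requirements=["Reject more than 100 requests per minute per client"],
    architecture=["Token bucket held in memory"],
)
print(spec.missing_sections())   # the tasks section has not been written yet

spec.tasks.append("Implement the bucket and its tests")
if not spec.missing_sections():
    print(spec.to_markdown())    # only now is the spec passed to the agent
```

The ordering is the point of the example: generation is gated on a complete specification, whereas the conversational workflows described elsewhere in this article impose no such precondition.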
Vibe engineering is broader and methodologically agnostic. The Willison essay treats advance planning as one of twelve practices, not the entire framework, and does not require the plan to be expressed in any particular document format. A vibe engineer might use a spec-driven approach for a green-field service and a more conversational, test-first approach for a bug fix in a legacy codebase; both fall under the same label. Where spec-driven development describes a recipe, vibe engineering describes a disposition.
The two practices share an underlying commitment to keeping the human accountable for the result and to using the agent's compute budget on the parts of the work that benefit most from it. Several practitioners in late 2025 treated spec-driven workflows as one expression of vibe engineering, rather than a competitor to it.[16][4]
Within weeks of Willison's essay, a competing label appeared. Karpathy began using "agentic engineering" in talks and interviews through the autumn and winter of 2025, and by his March 2026 Sequoia AI Ascent presentation was framing it as the canonical term for the same practice.[7][17] In December 2025 he stated publicly that roughly 80 percent of the code he wrote was produced by agents rather than by hand, and reported having largely stopped typing code himself, instead orchestrating "fleets" of as many as twenty parallel agents.[7][17] Karpathy's gloss on the new term was that "agentic" captured the shift to orchestrating agents rather than typing code, while "engineering" was meant to emphasise that "there is an art and science and expertise to it."[7]
Willison addressed the competition explicitly. In an October 2025 update at the top of his essay he wrote "It looks like the term 'Agentic Engineering' is coming out on top for this now," and in a follow-up post on his Substack a few weeks later he used "agentic engineering" to title a collection of working patterns.[1][18] Both terms refer to the same underlying practice; the difference is rhetorical, with "agentic" emphasising the role of autonomous agents and "vibe" foregrounding the contrast with the original Karpathy coinage.
A separate but related term is context engineering, the discipline of curating the prompts, files, retrieval results, and tool outputs that an LLM sees in a given turn. Context engineering is more narrowly technical: it covers prompt construction, retrieval strategies, the layout of files like AGENTS.md or CLAUDE.md, and the use of standards such as the Model Context Protocol to feed external data into an agent. A November 2025 MIT Technology Review piece by Thoughtworks technologist Ken Mugrage described the year as a shift "from vibe coding to context engineering," casting context engineering as the technical substrate that makes responsible AI-assisted development possible.[6] Vibe engineering and context engineering are complements rather than rivals: a vibe engineer practices context engineering as part of the broader workflow.
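One way to picture the curation step is as a selection problem under a fixed token budget. The sketch below is illustrative only: `ContextItem`, `assemble_context`, the priority scheme, and the four-characters-per-token estimate are all assumptions made for the example, and production agents use real tokenizers and considerably more elaborate ranking.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ContextItem:
    """Hypothetical unit of context competing for space in the prompt."""
    source: str    # e.g. "AGENTS.md", "retrieval", "tool output"
    text: str
    priority: int  # lower number means more important


def estimate_tokens(text: str) -> int:
    """Crude stand-in for a tokenizer: assume about four characters per token."""
    return max(1, len(text) // 4)


def assemble_context(items: List[ContextItem], budget: int) -> List[ContextItem]:
    """Greedily keep the most important items that still fit in the budget."""
    chosen = []
    used = 0
    for item in sorted(items, key=lambda i: i.priority):
        cost = estimate_tokens(item.text)
        if used + cost <= budget:
            chosen.append(item)
            used += cost
    return chosen


items = [
    ContextItem("AGENTS.md", "Run pytest before committing.", priority=0),
    ContextItem("tool output", "FAILED test_login: expected 200, got 500", priority=1),
    ContextItem("retrieval", "lorem ipsum " * 40, priority=2),
]
selected = assemble_context(items, budget=30)
print([item.source for item in selected])
```

With a budget of 30 estimated tokens, the project rules and the failing test output are kept and the long retrieved passage is dropped, reflecting the idea that what the model sees in a given turn is a deliberate choice rather than everything available.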
The vibe engineering label is tool-agnostic, but the practice is bound up with a specific generation of coding agents that emerged through 2025. Willison's essay names Claude Code (Anthropic, February 2025), Codex CLI (OpenAI, April 2025), and Gemini CLI (Google, June 2025) as the immediate trigger.[1] Each runs as a terminal-based agent that can read and write files, execute shell commands, run tests, and iterate on its own output. Anthropic's Claude Code reached an annualised revenue run rate of approximately one billion US dollars within six months of launch and by early 2026 scored 80.9 percent on SWE-bench Verified, the leading public benchmark for resolving real GitHub issues.[14][19]
Vibe engineering also intersects with agentic IDEs. Cursor (the AI-first editor that hit a roughly nine billion dollar valuation in early 2026 and crossed a two billion dollar annualised revenue run rate by March 2026) and Windsurf are the most cited; both ship multi-file editing, agent loops, and rules files (cursor.md, .cursorrules) that let teams encode project conventions for the agent to follow.[14][20] GitHub Copilot (Microsoft/GitHub) remained the largest AI coding tool by paid users, passing 4.7 million paid subscribers in January 2026.[14] Open-source agents such as Cline and Aider, along with research systems such as Magentic-One, populate the rest of the ecosystem.
Practitioners describe a set of recurring techniques that operationalise Willison's twelve practices.
The role of the human in these flows is consistently described in management terms. Willison calls it "a very weird form of management"; Karpathy describes himself as a "director" of agents; the IBM definition of agentic engineering centres "orchestration."[1][17]
Willison's essay generated 654 points and 719 comments on Hacker News on October 7, 2025, making it one of the most-discussed posts on the site that week.[3] The discussion clustered into four positions.
Supporters embraced the framing. A 68-year-old translator described building a multi-LLM translation system in which models critique each other's work; another commenter outlined a workflow of writing specs, enforcing linting and tests, and reminding the agent of best practices each session, summarising it as "quality in, quality out."[3]
Sceptics objected to the vocabulary. Several commenters argued that "vibe" trivialised serious work and proposed "agentic coding" or simply "AI-assisted software engineering" as alternatives. A 2025 blog post on serce.me, also discussed on Hacker News, was titled "There is no Vibe Engineering" and argued that genuine engineering required tighter specifications than current LLMs could reliably meet.[21]
A third group focused on professional identity. A widely upvoted comment from a developer named deanCommie argued that the shift transformed coding from "craftsmanship" into "technical middle management," removing the flow state that drew many engineers into the field. Other commenters described fatigue from managing multiple parallel agents and questioned whether the productivity gains held up over a full workday.[3]
A fourth group disputed the claim of accountability itself. A commenter using the handle pron contended that outputs which still require constant verification eliminate the automation benefit; another, benterix, described abandoning agents entirely after accumulating review fatigue. Willison himself participated in the thread, restating his core claim that "AI tools amplify existing expertise" and that the gap between novice and experienced users widens, not narrows, in agentic workflows.[3]
Outside Hacker News, the trade press picked up the term within weeks. The New Stack ran multiple pieces on the shift from vibe coding to agentic engineering through late 2025 and early 2026.[17] MIT Technology Review's November 5, 2025 piece by Ken Mugrage described the year as a transition "from vibe coding to context engineering," citing both Willison's essay and the broader retreat from informal prompting.[6] Forrester's 2026 predictions wrote that "vibe coding will transform into vibe engineering by the end of 2026," casting the discipline as a structured replacement for prompt-driven development.[4][22]
Academic uptake followed. The 2026 International Conference on Software Engineering (ICSE) accepted a technical briefing titled "Vibe Engineering: Software Engineering for Software Makers," led by Keheliya Gallaba (Huawei Canada) and colleagues including researchers from Microsoft and Queen's University, and scheduled for April 17, 2026 in Rio de Janeiro.[5] The briefing introduces a framework called SE4SM (Software Engineering for Software Makers) that splits the activity into "intent engineering" (multi-agent conversations to refine requirements) and "realization engineering" (an autonomous multi-agent execution environment).[5] Separately, an ICSE 2026 Software Engineering in Practice paper on vibe coding conducted a grey-literature review of motivations, challenges, and outlooks, cementing the academic recognition of the surrounding discipline.[23]
By the first half of 2026, AI-assisted coding had moved from a feature to a default. Industry data placed daily AI tool usage among US developers at roughly 92 percent and put the share of code written with AI assistance at around 41 percent across surveyed teams.[14] Top engineering teams reported daily AI-assistant usage rates of 85 to 90 percent by the end of 2025.[24] Claude Code reached approximately one billion dollars in annualised revenue within six months of launch, Cursor crossed two billion dollars in annualised revenue by March 2026, and GitHub Copilot passed 4.7 million paid subscribers in January 2026 (a 75 percent year-over-year increase).[14]
The qualitative shift was equally visible. Karpathy reported in December 2025 that 80 percent of his code was produced by agents, and by March 2026 that he had not typed a line of code by hand since the previous December.[7][17] He framed the change as personally costly ("it's a bit hard on the ego") but operationally irreversible.[7] Sequoia Capital's AI Ascent conference in March 2026 hosted his talk introducing the framing of "Software 3.0," in which the context window becomes "the lever" and the LLM the interpreter.[17]
Quality data tempered the celebration. The December 2025 CodeRabbit analysis of 470 open-source pull requests found that AI-co-authored code contained roughly 1.7 times as many "major" issues as human-authored code, with 75 percent higher rates of misconfiguration and 2.74 times more security vulnerabilities.[14] A Veracode GenAI Code Security Report published the same year reported that nearly half of all AI-generated code introduced known security flaws.[25] These numbers were widely cited as evidence that vibe engineering's emphasis on testing, code review, and accountability addressed a real problem rather than a hypothetical one.
The term and the practice both attracted criticism through 2025 and 2026.
Some critics argued the label was unserious. The "vibe" stem, which originated in Karpathy's deliberately casual tweet, struck several commentators as inappropriate for the disciplined activity Willison was describing; Karpathy's later adoption of "agentic engineering" was read as a deliberate retreat from the original brand.[7] Willison himself conceded the vocabulary was awkward.[1]
Others questioned whether the practice was distinct enough to merit a name. A position represented in the March 2025 Hacker News thread "There is no Vibe Engineering" argued that vibe engineering amounted to "doing software engineering with a faster autocomplete," and that calling careful engineering a new discipline distracted from existing methods.[21]
A third strand of criticism focused on the change in engineering work. Commenters described the loss of "flow state" and immersion that came with hands-on coding, and several worried that career paths into the field would be reshaped if entry-level work shifted toward agent supervision.[3] A widely circulated metaphor compared the change to introducing tractors to a profession of gardeners. The translator who described moving to multi-LLM workflows also wrote that younger entrants to the field might exit it rather than retrain.[3]
A fourth strand concerned safety and quality. The CodeRabbit and Veracode figures suggested that even disciplined teams could not eliminate the elevated defect rate associated with AI-generated code, and a series of incidents through 2025 (including the discovery of a malicious VS Code extension flagged as "vibe-coded" in October 2025) reinforced concerns about provenance and trust in agent output.[26][14]
Several adjacent terms coexisted with vibe engineering in late 2025 and 2026. "Agentic coding" was sometimes used as a near-synonym, sometimes as a narrower label restricted to running coding agents in a terminal. "Software 3.0" appeared in Karpathy's March 2026 Sequoia talk as a generalisation: software 1.0 is hand-written code, software 2.0 is neural network weights, and software 3.0 is programming through prompts.[17] "Agentic workflow" denotes any LLM-orchestrated workflow, including those in non-coding domains.
The phrase "AI engineering" was sometimes used loosely to cover the same territory but more often referred specifically to the discipline of building applications on top of LLMs (prompt design, evaluation, retrieval, agent orchestration), as treated in Chip Huyen's 2024 book of that title. Vibe engineering is narrower: it concerns the engineer's own workflow when shipping software, not the construction of LLM-powered products. The two practices intersect because most modern engineers are doing both.