# Devin (AI software engineer)

> Source: https://aiwiki.ai/wiki/devin
> Updated: 2026-06-21
> Categories: AI Agents, AI Companies, AI Tools & Products
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**Devin** is an autonomous AI software engineering agent developed by [Cognition AI](/wiki/cognition_ai) that plans, writes, tests, debugs, and deploys code inside a sandboxed cloud environment, taking a natural-language task and returning a finished pull request with little human intervention. Founded in 2023 by [Scott Wu](/wiki/scott_wu), Steven Hao, and Walden Yan, Cognition announced Devin on March 12, 2024, marketing it as the "first AI software engineer," equipped with a shell, code editor, and web browser [1]. The announcement generated enormous attention in the AI and software development communities, along with significant controversy about the accuracy of its demonstrated capabilities.

Devin reached general availability in December 2024 at $500 per month for engineering teams; the announcement opened with the line "Today we're making Devin generally available starting at $500 a month for engineering teams" [2]. Cognition subsequently launched Devin 2.0 in April 2025 at a sharply reduced starting price of $20 per month, alongside a major feature overhaul [3]. The company has since grown rapidly, acquiring the [Windsurf](/wiki/windsurf) IDE in July 2025 and raising more than $1 billion at a $25 billion pre-money valuation ($26 billion post-money) in a round that closed on May 27, 2026, led by Lux Capital, General Catalyst, and 8VC [4][34][44].

By May 2026, Devin had merged hundreds of thousands of pull requests across thousands of companies, with its PR merge rate climbing from 34% to 67% year over year [19]. The system has expanded from a single chat-based interface into a multi-product AI development platform encompassing the autonomous Devin agent, the Windsurf IDE, Devin Wiki, and Devin Search, with proprietary SWE-1.5 and SWE-1.6 coding models now powering production deployments at Goldman Sachs, [Citigroup](/wiki/citigroup), and dozens of other large enterprises [35][36].

## What is Devin?

Devin is a cloud-hosted, fully autonomous AI software engineering agent: a user describes a task in plain language, and Devin produces a step-by-step plan, then executes it across files using a terminal, code editor, and browser, reporting progress and ultimately opening a pull request for human review [1]. Unlike autocomplete-style assistants that suggest the next line of code, Devin is designed to own an entire well-scoped engineering ticket end to end, which is why Cognition describes it as a teammate rather than a tool. Devin works best on delegated, junior-level tasks; in Cognition's words, "you give Devin tasks that you know how to do yourself" [2].

## Background and founding

### Cognition AI

Cognition AI (originally Cognition Labs) was founded in August 2023 by three former competitive programmers: Scott Wu (CEO), Steven Hao, and Walden Yan. All three founders won gold medals at the International Olympiad in Informatics (IOI), and the team's background in competitive programming shaped the company's approach to building AI systems capable of complex reasoning and problem-solving [5].

Scott Wu, born in 1997 in Louisiana to a Chinese immigrant family, attended Harvard University where he studied economics before leaving the program early. He won three gold medals at the IOI, including a first-place finish in 2014, establishing himself as one of the top competitive programmers in the world [5]. Wu also competed in Mathcounts, winning the individual championship in 2011, and represented Harvard in the 2016 International Collegiate Programming Contest (ICPC), where the team won a gold medal and placed third overall. Before founding Cognition, Wu co-founded and served as CTO of Lunchclub, an AI networking platform backed by Lightspeed and Coatue, from 2017 to 2022 [5].

Steven Hao and Walden Yan brought complementary experience from companies including Jane Street and Google. The team had originally explored cryptocurrency-related projects before pivoting to AI following the release of [ChatGPT](/wiki/chatgpt) and the resulting surge in interest in [large language models](/wiki/large_language_model) [25].

The company emerged from stealth in March 2024 alongside the Devin announcement, immediately positioning itself at the center of the AI-assisted software development movement.

### Early funding

Cognition raised a $21 million Series A round from [Peter Thiel](/wiki/peter_thiel)'s [Founders Fund](/wiki/founders_fund) shortly before the Devin announcement. Just one month later, in April 2024, the company secured an additional $175 million at a $2 billion valuation, again led by Founders Fund [6]. This rapid follow-on funding reflected the intense investor enthusiasm surrounding AI coding tools and the viral reception of Devin's initial demo.

## How Devin works

### Architecture and environment

Devin operates within a sandboxed computing environment that replicates the tools a human software engineer would use daily. This environment includes:

| Component | Function |
|-----------|----------|
| Shell/Terminal | Executes commands, installs packages, runs scripts |
| Code editor | Writes and modifies source code across files |
| Web browser | Reads documentation, searches for solutions, accesses APIs |
| Planner | Breaks tasks into steps and tracks progress |
| Chat interface | Communicates with human users for clarification and status updates |
| Virtual desktop | (Devin 2.2+) Full Linux desktop for testing desktop applications |

Unlike simpler [AI code generation](/wiki/ai_code_generation) tools that provide autocomplete suggestions or respond to single prompts, Devin is designed to handle multi-step engineering tasks autonomously. A user provides a natural language description of a task through a chatbot-style interface, and Devin develops a detailed, step-by-step plan before executing it using its developer tools [1].

### Planning and execution

Devin's core differentiator is its ability to engage in long-term reasoning and planning. For a given task, the system can:

1. Analyze the requirements and break the problem into subtasks
2. Write code across multiple files and languages
3. Execute the code and observe the output
4. Identify errors from stack traces and debugging output
5. Iteratively fix issues until the task is complete
6. Report progress to the user in real time

The agent can recall relevant context at each step of a multi-step process, learning from mistakes made earlier in the session. When working on an existing codebase, Devin indexes the repository to understand the project structure, dependencies, and coding patterns before making changes [1].
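
The plan, execute, observe, and fix cycle described above can be illustrated with a minimal sketch. This is not Cognition's implementation, which is not public; `run_task`, `StepResult`, `AgentRun`, and the toy `flaky_execute` function are hypothetical names chosen for illustration, and a real agent would replace `flaky_execute` with model-driven shell and editor actions.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class StepResult:
    ok: bool      # did the step succeed?
    output: str   # stdout, or the error text to feed back on failure


@dataclass
class AgentRun:
    log: List[str] = field(default_factory=list)
    completed: bool = False


def run_task(
    subtasks: List[str],
    execute: Callable[[str, Optional[str]], StepResult],
    max_attempts: int = 3,
) -> AgentRun:
    """Run subtasks in order, retrying each with the previous error as feedback."""
    run = AgentRun()
    for subtask in subtasks:
        feedback: Optional[str] = None
        for attempt in range(1, max_attempts + 1):
            result = execute(subtask, feedback)
            status = "ok" if result.ok else "error"
            run.log.append(f"{subtask} (attempt {attempt}): {status}")
            if result.ok:
                break
            feedback = result.output  # carry the error into the next attempt
        else:
            # Every attempt failed: stop and leave the task for a human.
            return run
    run.completed = True
    return run


def flaky_execute(subtask: str, feedback: Optional[str]) -> StepResult:
    """Toy executor (hypothetical): the test step fails once, then passes."""
    if subtask == "run tests" and feedback is None:
        return StepResult(False, "AssertionError: expected 2, got 3")
    return StepResult(True, "done")


demo = run_task(["write code", "run tests"], flaky_execute)
for entry in demo.log:
    print(entry)
print("completed:", demo.completed)
```

The `max_attempts` cap is a design choice of this sketch: it bounds how long the loop keeps retrying a step before giving up and returning control.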

### Proprietary models: SWE-1, SWE-1.5, and SWE-1.6

Cognition develops its own proprietary AI models optimized specifically for software engineering tasks. The SWE model family powers both Devin and the Windsurf IDE.

**SWE-1** was Cognition's first internally developed coding model, designed to handle the multi-step reasoning and tool use required by autonomous software engineering agents.

**SWE-1.5**, announced in October 2025, represents a major leap in both performance and speed [26]. It is a frontier-scale model with hundreds of billions of parameters. Cognition states that "SWE-1.5 is trained on our state-of-the-art cluster of thousands of GB200 NVL72 chips," which may make it the first public production model trained on the GB200 generation of NVIDIA hardware [26].

| Metric | SWE-1.5 Performance |
|--------|---------------------|
| Inference speed | Up to 950 tokens/second |
| Speed vs. Haiku 4.5 | 6x faster |
| Speed vs. Sonnet 4.5 | 13x faster |
| SWE-Bench Pro score | 40.08% |
| Infrastructure partner | Cerebras |

SWE-1.5 achieves near state-of-the-art coding performance while prioritizing speed, a critical factor for autonomous agents that iterate through many steps per task. On [Scale AI](/wiki/scale_ai)'s SWE-Bench Pro benchmark, SWE-1.5 scored 40.08%, ranking second after [Claude](/wiki/claude) Sonnet 4.5's 43.60% [26]. The speed advantage comes from a partnership with Cerebras, whose wafer-scale inference hardware enables the high token throughput. In the same launch post, Cognition argued that benchmarks are an imperfect guide to real-world utility, writing that "performance on coding benchmarks is often not representative of the real-world experience of using an agent, which is why we stopped reporting SWE-Bench numbers in 2024" [26].
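
To make the speed figures concrete, the following sketch converts them into wall-clock generation time. It assumes that "6x faster" and "13x faster" mean 6 and 13 times the token throughput, and the 40-step, 500-tokens-per-step workload is a hypothetical example, not a measured Devin session. The baseline throughputs are only implied by the ratios in the table, not figures reported by Cognition.

```python
SWE_1_5_TOKENS_PER_SEC = 950  # "up to" figure from the table above

# Speed ratios from the table; reading them as throughput multiples is an assumption.
SPEEDUP_VS = {"Haiku 4.5": 6, "Sonnet 4.5": 13}


def implied_baseline_tps(speedup: float) -> float:
    """Throughput a comparison model would need for the stated ratio to hold."""
    return SWE_1_5_TOKENS_PER_SEC / speedup


def generation_seconds(total_tokens: int, tokens_per_sec: float) -> float:
    """Time spent purely on token generation, ignoring tool and network latency."""
    return total_tokens / tokens_per_sec


# Hypothetical agent workload: 40 steps of 500 generated tokens each.
workload_tokens = 40 * 500

print(f"SWE-1.5: {generation_seconds(workload_tokens, SWE_1_5_TOKENS_PER_SEC):.1f} s")
for model, ratio in SPEEDUP_VS.items():
    tps = implied_baseline_tps(ratio)
    print(f"{model}: ~{tps:.0f} tok/s implied, "
          f"{generation_seconds(workload_tokens, tps):.1f} s")
```

Because an agent generates tokens at every step, per-step differences compound over a long session, which is the point the paragraph above makes about speed.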

In addition to SWE-1.5, Cognition has developed SWE-grep, a specialized model for codebase search and understanding that powers the Devin Search feature.

**SWE-1.6**, which Cognition began previewing in late 2025 and rolled out more broadly through early 2026, was post-trained on the same pre-trained base as SWE-1.5 and runs at the same 950 tokens per second, but scores roughly 11% higher (in relative terms) than SWE-1.5 on Scale AI's SWE-Bench Pro and surpasses the top open-source coding models on the same benchmark [37][35]. Cognition described SWE-1.6 as the product of a refined reinforcement-learning recipe, stating that "since training SWE 1.5, we have refined our RL recipe and scaled our infrastructure to unlock two orders of magnitude more compute," including a substantially larger set of training environments and stricter data-quality filtering, and noted that SWE-1.6 "was trained on thousands of GB200 NVL72 chips" [35]. The early-access checkpoint was distributed to a small subset of Devin users so the company could tune behavioral issues such as overthinking and excessive self-verification before a broader release [35]. By mid-2026, SWE-1.6 had become the default model behind both Devin and the Windsurf IDE's Cascade agent, available in a free tier (200 tokens/second) and a fast tier (950 tokens/second), with SWE-1.5 retained as a fallback for short, latency-sensitive tasks [37].

#### SWE model lineage

| Model | Released | Pre-training | Post-training focus | Key result |
|-------|----------|--------------|---------------------|------------|
| SWE-1 | Mid-2025 | Cognition base | First in-house coding model | Initial production model for Windsurf and Devin |
| SWE-1.5 | October 2025 | Frontier base (hundreds of billions of params) | Speed and tool-use efficiency | 40.08% on SWE-Bench Pro at 950 tok/s [26] |
| SWE-1.6 (preview) | Late 2025 | Same base as SWE-1.5 | Refined RL recipe, expanded environments | ~11% relative gain over SWE-1.5 on SWE-Bench Pro [35] |
| SWE-grep | 2025 | Specialized | Codebase search | Powers Devin Search and Devin Wiki [26] |

### Devin Agent Preview with Claude Sonnet 4.5

In late 2025, Cognition released a Devin [Agent](/wiki/agent) Preview built on [Anthropic](/wiki/anthropic)'s [Claude Sonnet 4.5](/wiki/claude_sonnet_4_5), offering an alternative to the SWE model family [27]. The integration required significant architectural changes because Sonnet 4.5 operates differently from previous models that Devin was built around.

Key findings from the integration:

- Planning performance increased by 18% compared to the previous Claude-based version
- End-to-end evaluation scores improved by 12%, the biggest jump since Claude Sonnet 3.6
- Sonnet 4.5 executes parallel tool calls (running multiple bash commands and reading several files simultaneously) rather than working sequentially
- The model is notably more proactive about writing and executing short test scripts to create feedback loops

The Devin Agent Preview with Sonnet 4.5 is 2x faster and 12% better on Cognition's Junior Developer Evaluations compared to the previous Devin agent [27].
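
The difference between sequential and parallel tool calls noted in the findings above can be sketched with a thread pool. This is an illustration only, not how Devin or Sonnet 4.5 dispatch tools internally; `read_file` and `run_tool_calls_in_parallel` are hypothetical helper names, and the temporary files stand in for a repository.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Dict, List


def read_file(path: Path) -> str:
    """One 'tool call': read a file and return its contents."""
    return path.read_text(encoding="utf-8")


def run_tool_calls_in_parallel(paths: List[Path]) -> Dict[str, str]:
    """Issue all reads at once instead of one after another."""
    with ThreadPoolExecutor(max_workers=max(1, len(paths))) as pool:
        # pool.map returns results in the same order as the inputs.
        contents = list(pool.map(read_file, paths))
    return {path.name: text for path, text in zip(paths, contents)}


with tempfile.TemporaryDirectory() as tmp:
    paths = []
    for name, text in [("a.py", "print('a')\n"),
                       ("b.py", "print('b')\n"),
                       ("c.py", "print('c')\n")]:
        file_path = Path(tmp) / name
        file_path.write_text(text, encoding="utf-8")
        paths.append(file_path)
    results = run_tool_calls_in_parallel(paths)

print(sorted(results))
```

For I/O-bound calls such as file reads or shell commands, total latency with this pattern approaches that of the slowest call rather than the sum of all calls.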

### Capabilities

At launch, Cognition demonstrated Devin performing a range of software engineering tasks:

- Deploying and improving applications and websites end-to-end
- Finding and fixing bugs in large codebases
- Setting up and fine-tuning large language models using research repositories from GitHub
- Learning unfamiliar technologies by reading documentation
- Completing freelance tasks on platforms like Upwork
- Passing technical interviews at leading AI companies

A Bloomberg test showed that Devin could create a website within approximately ten minutes and recreate a Pong game in a similar timeframe [7].

With production experience from thousands of enterprise deployments, Cognition's 2025 performance review identified specific use cases where Devin delivers the strongest results [19]:

| Use Case | Performance Metric |
|----------|-------------------|
| Security vulnerability fixes | 20x faster than human engineers (1.5 min vs. 30 min average) |
| ETL file migration | 10x improvement (3-4 hours vs. 30-40 hours) |
| Java version migration | 14x less time than a human engineer |
| Test coverage improvement | Typical increase from 50-60% to 80-90% |
| Data features (EightSleep) | 3x more features and investigations shipped |
| Regression testing (Litera) | 40% test coverage increase, 93% faster regression cycles |

## SWE-bench performance

Devin's initial announcement highlighted its performance on [SWE-bench](/wiki/swe_bench), a benchmark that evaluates AI systems on their ability to resolve real GitHub issues from popular open-source Python repositories.

### How did Devin score on SWE-bench at launch?

| System | SWE-bench issues resolved | Date |
|--------|---------------------------|------|
| Previous state-of-the-art (unassisted) | 1.96% | Pre-March 2024 |
| Previous SOTA (assisted) | 4.80% | Pre-March 2024 |
| Devin | 13.86% | March 2024 |

According to Cognition's SWE-bench technical report, Devin was evaluated on 570 issues, a randomly selected 25% of the full 2,294-issue SWE-bench dataset, and resolved 79 of them, yielding a 13.86% success rate without any human assistance [32]. The report states: "Devin successfully resolves 13.86% of issues, far exceeding the previous highest unassisted baseline of 1.96%" (the prior best being Claude 2 with BM25 retrieval) [32]. Because Devin navigates files on its own rather than being told which files to edit, Cognition argued the result is most fairly compared to the "unassisted" LLM setting, representing roughly a 7x improvement over the best prior unassisted system [1][32].
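
The figures in this paragraph can be checked with a few lines of arithmetic. All inputs come from the text above; only the variable names are introduced here.

```python
FULL_DATASET_ISSUES = 2294   # full SWE-bench dataset
EVALUATED_ISSUES = 570       # random subset Devin was run on
RESOLVED_ISSUES = 79         # issues Devin resolved
PRIOR_UNASSISTED_PCT = 1.96  # previous best unassisted baseline

subset_share_pct = EVALUATED_ISSUES / FULL_DATASET_ISSUES * 100
resolve_rate_pct = RESOLVED_ISSUES / EVALUATED_ISSUES * 100
improvement_factor = resolve_rate_pct / PRIOR_UNASSISTED_PCT

print(f"subset share: {subset_share_pct:.2f}%")    # 24.85%
print(f"resolve rate: {resolve_rate_pct:.2f}%")    # 13.86%
print(f"improvement:  {improvement_factor:.2f}x")  # 7.07x
```

The subset is therefore slightly under 25% of the dataset, and the "roughly 7x" claim follows directly from dividing the two resolve rates.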

However, the SWE-bench results came with important caveats that critics were quick to point out. The 13.86% figure, while a genuine improvement, still meant that Devin failed on more than 86% of tasks. Additionally, the evaluation ran against a random 25% subset rather than the full benchmark, and questions arose about task selection and evaluation methodology that Cognition's marketing materials did not fully address [8].

### How have SWE-bench scores changed since 2024?

SWE-bench scores across the industry rose rapidly throughout 2024 and 2025, with multiple competing systems eventually far surpassing Devin's initial benchmark. The major frontier model families ([Claude 4](/wiki/claude_4), [Claude Opus 4.7](/wiki/claude_opus_4_7), and [GPT-5](/wiki/gpt-5)) all reported SWE-bench Verified scores above 70% by late 2025, and [GPT-5.1](/wiki/gpt-5.1) and follow-on Anthropic releases pushed those numbers higher again. By spring 2026, leading systems achieved scores around 80% on [SWE-bench Verified](/wiki/swe-bench_verified):

| System | SWE-bench Verified | Date |
|--------|-------------------|------|
| Claude Opus 4.5 (Anthropic) | ~80.9% | Q1 2026 |
| Claude Opus 4.6 (Anthropic) | ~80.8% | Q1 2026 |
| Gemini 3.1 Pro (Google) | ~80.6% | Q1 2026 |
| Claude Code (Anthropic, with scaffold) | ~78.4% | Q1 2026 |
| OpenAI Codex (GPT-5.1 scaffold) | ~71.0% | Q1 2026 |
| Cursor agent | ~67.2% | Q1 2026 |
| Devin 2.0 (Cognition) | ~60.8% / ~45.8% standard run | 2025-2026 |
| SWE-1.5 (Cognition) | 40.08% (SWE-Bench Pro) | October 2025 |
| Amazon Q Developer | Top scores on SWE-bench Leaderboard | 2025 |

Cognition's reported Devin 2.0 number is approximately 45.8% on SWE-bench Verified under what the company calls a "standard" evaluation: single-agent, no human in the loop, and no best-of-N voting. The higher 60.8% figure circulating in some 2026 leaderboards uses a more aggressive scaffold and multi-attempt setup [38][39]. Cognition itself has publicly distanced its product roadmap from SWE-bench, arguing in its SWE-1.5 launch post that "performance on coding benchmarks is often not representative of the real-world experience of using an agent" and that internal junior-developer evaluations, merge rates, and Agent Compute Unit efficiency are better proxies for production utility [26][19].

The rapid improvement in benchmark scores across the industry underscores how quickly the [AI coding agent](/wiki/ai_coding_agent) landscape has evolved since Devin's initial demonstration. The gap on SWE-bench Verified between Claude-based scaffolds and Devin's standard run also illustrates a broader pattern: scaffolds, prompt strategies, and best-of-N voting can move scores by 15 to 35 percentage points on top of the underlying model, which is one of the reasons Cognition has shifted its public marketing toward enterprise outcome metrics.
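
One plausible reading of the 15 to 35 point range is that it corresponds to gaps between rows of the table above. The sketch below computes those gaps from the approximate table values. Mapping the range to these particular rows is an assumption, not something the cited sources state.

```python
# Approximate SWE-bench Verified scores taken from the table above.
DEVIN_STANDARD_RUN = 45.8    # single agent, no best-of-N
DEVIN_AGGRESSIVE_RUN = 60.8  # multi-attempt scaffold
TOP_REPORTED_SCORE = 80.9    # highest entry in the table


def gap_points(higher: float, lower: float) -> float:
    """Difference in percentage points, rounded to one decimal place."""
    return round(higher - lower, 1)


scaffold_gap = gap_points(DEVIN_AGGRESSIVE_RUN, DEVIN_STANDARD_RUN)
leader_gap = gap_points(TOP_REPORTED_SCORE, DEVIN_STANDARD_RUN)

print(f"same product, different scaffold: {scaffold_gap} points")
print(f"top entry vs. Devin standard run: {leader_gap} points")
```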

## Pricing and plans

Devin's pricing has evolved significantly since its initial launch.

| Version | Date | Starting Price | Details |
|---------|------|---------------|----------|
| Devin 1.0 (GA) | December 2024 | $500/month | Team plan with 250 ACUs, aimed at engineering teams |
| Devin 2.0 | April 2025 | $20/month | Core plan for individuals; Team plan at $500/month/seat + $2.00/ACU |

The original $500/month price point positioned Devin as a premium tool for professional engineering teams. With Devin 2.0, Cognition slashed the entry price by 96%, making the tool accessible to individual developers for the first time [3].

The pricing model revolves around Agent Compute Units (ACUs), where approximately 15 minutes of active Devin work equals one ACU. The Core plan at $20/month includes a limited number of ACUs, while the Team plan provides additional ACUs at $2.00 each plus the per-seat fee [9].
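
As a rough illustration of how ACU pricing translates into cost, the sketch below uses the figures quoted in this section. The 15-minute conversion is approximate per the text, actual billing may differ, and the function names are hypothetical.

```python
MINUTES_PER_ACU = 15      # "approximately 15 minutes" of active work, per the text
PRICE_PER_ACU_USD = 2.00  # Team plan rate for additional ACUs


def estimate_acus(active_minutes: float) -> float:
    """Approximate ACUs consumed by a session of the given active length."""
    return active_minutes / MINUTES_PER_ACU


def estimate_cost_usd(active_minutes: float) -> float:
    """Approximate cost of those ACUs at the additional-ACU rate."""
    return estimate_acus(active_minutes) * PRICE_PER_ACU_USD


# Figures from the pricing table above.
implied_usd_per_acu_at_ga = 500 / 250        # $500 plan with 250 ACUs
entry_price_drop_pct = (500 - 20) / 500 * 100

print(estimate_acus(90))               # 6.0
print(estimate_cost_usd(90))           # 12.0
print(implied_usd_per_acu_at_ga)       # 2.0
print(round(entry_price_drop_pct, 1))  # 96.0
```

Under these assumptions, a 90-minute session consumes about 6 ACUs, or about $12, and the original plan's bundled ACUs work out to the same $2.00 per unit.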

## Controversies and criticism

### Demo accuracy

The initial excitement surrounding Devin's March 2024 announcement was followed by swift backlash as independent developers attempted to reproduce the demonstrated tasks. The most prominent critique came in April 2024 from veteran software developer Carl Brown, who runs the YouTube channel "Internet of Bugs." Brown published a roughly 27-minute video titled "Debunking Devin: 'First AI Software Engineer' Upwork Lie Exposed," in which he walked through Cognition's "Devin's Upwork Side Hustle" demo frame by frame and compared the on-screen actions to the actual Upwork posting and the underlying GitHub repository [10].

According to Brown's analysis, the demo and the underlying task differed in several specific ways:

- The Upwork posting Devin was shown working on was a request for help running an existing computer-vision model, not a request to write or modify code. The customer had asked for setup instructions and configuration help, while Devin's session focused on editing files in the repository.
- Some of the files Devin appeared to "fix" in the demo did not contain the bugs Devin claimed to find. In at least one case, Brown said the file Devin pointed to as broken did not exist in the referenced repository at all, and Devin was working in a file it had created itself rather than one from the original codebase.
- A subset of the errors Devin "discovered and resolved" appeared to have been introduced earlier in the same session by Devin's own actions, so the agent was effectively self-correcting fictional problems rather than fixing pre-existing issues.
- The full session, edited down to roughly two minutes for Cognition's marketing video, ran for several hours of agent time. Brown set up the same task on his own machine and finished it manually in about 36 minutes, which he argued undercut the framing of Devin as a faster-than-human freelancer.

Brown emphasized in the video and in subsequent interviews that he was not opposed to AI coding tools as a category, repeating a line that became widely quoted in coverage of the controversy: "I am not anti-AI, but I really am anti-hype" [10]. The video gathered hundreds of thousands of views within days and was cited extensively in trade press coverage at TechCrunch, Futurism, The Register, and Tweaktown, with Tweaktown picking up the related framing that Devin failed roughly 85% of its assigned tasks based on the SWE-bench figure [33].

Critics argued that the demonstration examples were cherry-picked to present Devin in the most favorable light possible, and that the showcased tasks were significantly simpler than portrayed [10]. As one widely shared analysis put it, "the product that we are sold isn't fully congruent with the product we see" [28].

### Cognition's response

Cognition did not retract the original demo, but the company's public posture shifted noticeably in the months that followed. Scott Wu, in interviews around the time of the Series B announcement, framed the launch video as illustrative rather than as a fully reproducible benchmark, emphasizing that Devin was a research preview and that the SWE-bench technical report was the more rigorous artifact [32]. Cognition's subsequent blog posts focused on benchmark numbers, customer case studies, and product updates rather than on viral demos, and later launches (including Devin 2.0 and the Devin Agent Preview built on Claude Sonnet 4.5) leaned more on quantitative claims about merge rates, ACU efficiency, and junior developer evaluations than on cinematic walkthroughs.

By the time of the December 2024 general availability launch, the company had pulled back from the strongest readings of the "first AI software engineer" line. It instead positioned Devin as a teammate aimed at well-scoped, junior-level engineering work, alongside human reviewers rather than in place of them, advising customers to "give Devin tasks that you know how to do yourself" [2]. The Internet of Bugs critique is now widely cited as a turning point in the public discussion of agentic coding tools. It sharpened the standard of evidence demanded of demos and sparked durable skepticism that subsequent agentic releases (including OpenAI Operator and Anthropic's computer use feature, both covered separately below) had to navigate when they shipped their own capabilities.

### Benchmark concerns

Beyond the demo issues, the SWE-bench claims drew scrutiny. While Cognition published a technical report detailing its methodology, independent observers noted that the company's marketing materials emphasized the 13.86% figure without adequately contextualizing the 86% failure rate. The choice to evaluate on a random 25% subset of SWE-bench (570 of 2,294 issues) rather than the full benchmark also raised questions, though partial-subset evaluation was a common practice among AI coding tool developers at the time [8][32].

### Answer.AI evaluation

In January 2025, researchers at Answer.AI published a detailed evaluation of Devin 1.0 across 20 real-world software engineering tasks. The results were sobering [11]:

| Outcome | Count | Percentage |
|---------|-------|------------|
| Failures | 14 | 70% |
| Successes | 3 | 15% |
| Inconclusive | 3 | 15% |

Among the three successful tasks were relatively straightforward operations like pulling a Notion database into Google Sheets and researching how to build a Discord bot in Python. The failures were more concerning: Devin struggled with tasks involving existing codebases, where understanding context and maintaining consistency with established patterns was required. When asked to migrate a Python project to nbdev, for example, Devin could not grasp even basic setup procedures despite having access to comprehensive documentation [11].

The researchers highlighted several recurring problems:

- Tasks that seemed straightforward often took days rather than hours
- Devin would pursue impossible solutions for extended periods rather than recognizing fundamental blockers
- The autonomous nature that was supposed to be an advantage became a liability, as the agent wasted time on dead-end approaches
- Results were unpredictable; even tasks similar to successful ones could fail in complex ways

### "First AI software engineer" claim

The marketing claim that Devin was the "first AI software engineer" drew criticism from the developer community. While Devin represented a step forward in autonomous coding agents, earlier tools such as [GitHub Copilot](/wiki/github_copilot), ChatGPT, and various coding assistants had already been writing code for years. The distinction Cognition drew was that Devin could handle entire engineering workflows autonomously rather than just assisting with individual coding tasks, but many observers felt the "first AI software engineer" label was hyperbolic [8].

## Devin 2.0

Cognition released Devin 2.0 on April 3, 2025, representing a major overhaul of the product [3].

### Key features

**Agent-native IDE**: Devin 2.0 introduced a full IDE experience that resembles Visual Studio Code, moving beyond the chat-only interface of version 1.0. Developers interact with Devin directly within a coding environment rather than solely through a conversational interface.

**Interactive planning**: A core addition that allows developers to begin with broad or incomplete ideas and collaboratively scope out a detailed task plan with Devin. Within seconds of starting a session, Devin analyzes the codebase, identifies relevant files, and proposes an initial plan that developers can refine before execution begins.

**Parallel multi-agent support**: Devin 2.0 supports running multiple Devin agents simultaneously, each with its own isolated workspace. This allows teams to tackle several tasks in parallel.

**Codebase understanding**: Devin now automatically indexes repositories at regular intervals, generating detailed documentation including architecture diagrams and source links. The Devin Search feature allows developers to ask natural-language questions about their codebase and receive cited responses.

**Devin Wiki**: An automatically generated, continuously updated documentation system for codebases. Devin Wiki produces textual explanations, architecture diagrams, and links to relevant source code, giving teams a living reference that stays current as the code evolves [3][16].

**Devin Search**: A natural-language search feature that lets developers query their codebase and receive answers with citations pointing to specific files, functions, and code blocks [3].

**Performance improvements**: According to Cognition's internal benchmarks, Devin 2.0 completes 83% more junior-level development tasks per ACU compared to the 1.x series. Devin's Fast Agent Model (SWE-1.5) allows it to iterate through bugs up to 10x faster than a human engineer [3][17].

### Devin 2.2

Cognition released Devin 2.2 on February 24, 2026, adding several significant capabilities [12]:

| Feature | Description |
|---|---|
| Desktop computer use | Full access to a Linux desktop for launching and testing desktop applications, beyond the browser-only testing in earlier versions |
| Self-reviewing pull requests | Devin Review runs an automated quality pass on every PR, identifying logic errors, missing edge cases, and style violations before human review |
| Video-recorded QA sessions | Devin can request to QA its own PR, run the application, click through the UI on its desktop, and send an edited recording of the testing session for developer review |
| 3x faster startup | Session boot time dropped from approximately 45 seconds to roughly 15 seconds, making Devin viable for quick tasks |
| Slack and Linear integrations | Smoother Slack and Linear integrations to start sessions without switching context |
| Redesigned interface | Fully overhauled UI connecting every step of the development lifecycle |

The desktop computer-use capability is particularly notable. While Devin has always been able to use a browser to test web applications, version 2.2 gives it access to a full Linux desktop, enabling it to launch, interact with, and test desktop applications as well. This expands the range of tasks Devin can autonomously verify, including interactions with tools like Figma and Photoshop [12].

The Devin Review feature represents a shift toward quality assurance automation. Rather than simply generating code and submitting it for human review, Devin now performs its own first pass at code review, catching logic errors, missing edge cases, and style violations before the pull request reaches a human reviewer [12].

## Windsurf acquisition

In July 2025, Cognition signed a definitive agreement to acquire Windsurf (formerly [Codeium](/wiki/codeium)), an AI-powered integrated development environment [4]. The acquisition, estimated at approximately $250 million, followed a dramatic sequence of events in the AI coding tool space.

[OpenAI](/wiki/openai) had been in talks to acquire Windsurf for roughly $3 billion in April 2025, but the deal collapsed. Google then hired Windsurf's co-founder and CEO Varun Mohan in a $2.4 billion licensing arrangement [13]. Cognition moved quickly, reaching a deal over a single weekend with the remaining Windsurf team and assets.

| Acquisition detail | Value |
|-------------------|-------|
| Estimated price | ~$250 million |
| Windsurf ARR at time of deal | $82 million |
| Enterprise customers | 350+ |
| Employees joining Cognition | ~210 |
| Date signed | July 14, 2025 |

The acquisition gave Cognition access to Windsurf's IDE product, brand, intellectual property, and fast-growing enterprise customer base. All Windsurf employees participated financially in the deal with fully accelerated vesting [4].

### Strategic rationale

The Windsurf acquisition transformed Cognition from a single-product company into a multi-product AI development platform. The combined entity now addresses the full spectrum of AI-assisted development:

| Product | Type | Use Case |
|---|---|---|
| Devin | Fully autonomous agent | Complex, multi-step tasks that can run independently |
| Windsurf IDE | Interactive AI code editor | Real-time coding assistance with developer in the loop |
| Devin Wiki | Documentation tool | Automated codebase documentation |
| Devin Search | Codebase search | Natural-language queries about code |

Following the acquisition, Cognition's combined annual recurring revenue more than doubled, with enterprise ARR growing by more than 30% [14].

## Enterprise adoption

Devin's enterprise traction has grown significantly through 2025 and into 2026, with deployments at major financial institutions, technology companies, government agencies, and other large organizations.

### Goldman Sachs deployment

In July 2025, Goldman Sachs became the first major bank to pilot Devin, marking a significant milestone for autonomous AI agents in enterprise environments. The deployment is notable for its scale and ambitions [22][23]:

- Goldman Sachs is testing Devin across its 12,000-person programming team
- Hundreds of Devin instances were initially deployed, with potential expansion to thousands
- The initiative focuses on repetitive programming tasks including legacy code management, refactoring, and debugging
- Goldman Sachs reported productivity improvements of 3x to 4x compared to previous AI tools
- The bank views Devin as the beginning of a "hybrid workforce" model where AI agents work alongside human engineers

Goldman's CIO described a vision for a near-future "hybrid workforce" where humans and AI coexist, and identified a new type of employee called "AI natives" who would be fluent in managing autonomous agents, delegating tasks, supervising results, and remaining accountable for what the AI delivers [23].

### Citigroup deployment

Less than two weeks after the Goldman Sachs announcement, [Citigroup](/wiki/citigroup) confirmed that it had selected Devin as the centerpiece of an agentic AI rollout to its roughly 40,000-person developer organization, making it the second major Wall Street bank to commit to Cognition's autonomous agent at scale [36]. Tim Ryan, Citi's head of technology, framed the deployment in terms similar to Goldman's hybrid-workforce language, but emphasized a tighter human-in-the-loop pattern in the early phases.

Citi's initial Devin workloads were deliberately scoped to deterministic, easily testable tasks:

| Citi use case | Typical workflow |
|---|---|
| Library and dependency upgrades | Devin scans repositories, identifies impacted files, runs automated tests, and presents a diff for human approval |
| Middleware patching | Devin applies vendor-issued patches across hundreds of services and surfaces regressions before deployment |
| Cross-language code rewrites | Devin translates legacy modules between languages (for example, Java to Kotlin, or VB to C#) with reviewer sign-off |
| Code reviews | Devin performs a first automated review pass before pull requests reach a human reviewer |
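
The dependency-upgrade row describes a gated workflow: find the impacted files, prepare the change, run the tests, and hold the result for human approval. The Python sketch below illustrates that gating pattern only. It does not reflect Devin's or Citi's actual implementation; the names `ProposedChange`, `find_impacted`, `propose_upgrade`, and `apply_if_approved`, the in-memory `files` mapping, and the `package==version` pin format are all assumptions made for the example.

```python
import difflib
from dataclasses import dataclass


@dataclass
class ProposedChange:
    """Hypothetical record of one file edit awaiting review."""
    path: str
    before: str
    after: str

    def diff(self) -> str:
        """Render a unified diff that a human reviewer can read."""
        lines = difflib.unified_diff(
            self.before.splitlines(keepends=True),
            self.after.splitlines(keepends=True),
            fromfile=f"a/{self.path}",
            tofile=f"b/{self.path}",
        )
        return "".join(lines)


def find_impacted(files: dict[str, str], package: str) -> list[str]:
    """Return paths whose text pins the package (naive substring match)."""
    return sorted(p for p, text in files.items() if f"{package}==" in text)


def propose_upgrade(files: dict[str, str], package: str,
                    old: str, new: str) -> list[ProposedChange]:
    """Build one proposed change per impacted file, without applying it."""
    changes = []
    for path in find_impacted(files, package):
        before = files[path]
        after = before.replace(f"{package}=={old}", f"{package}=={new}")
        if after != before:
            changes.append(ProposedChange(path, before, after))
    return changes


def apply_if_approved(files: dict[str, str], changes: list[ProposedChange],
                      tests_pass: bool, approved: bool) -> dict[str, str]:
    """Apply changes only when tests passed AND a human approved."""
    updated = dict(files)
    if tests_pass and approved:
        for change in changes:
            updated[change.path] = change.after
    return updated


# Illustrative repository contents (made up for this example).
files = {
    "svc-a/requirements.txt": "requests==2.31.0\nflask==3.0.0\n",
    "svc-b/requirements.txt": "flask==3.0.0\n",
}
changes = propose_upgrade(files, "requests", "2.31.0", "2.32.3")
for change in changes:
    print(change.diff())
```

The key design point is that `propose_upgrade` never mutates `files`; nothing changes until both the test signal and the reviewer's approval are present, which mirrors the "diff for human approval" step in the table.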

In April 2026, Citi unveiled a centralized agentic-AI platform called Arc, described internally as an "operating system" for AI agents that orchestrates Devin alongside other autonomous tools across the bank's engineering organization [40]. Arc rolls out to developers first and is expected to extend to broader business units later in 2026. Citi's chief information officer told American Banker that the bank expects Devin's task scope to expand from "simple but tedious" upgrades into more complex multi-service refactors as confidence in agent outputs grows [36][40].

The back-to-back commitments from Goldman Sachs and Citi turned Wall Street into Cognition's most visible reference market and were widely cited as a key signal driving the 2026 funding round that valued the company at $25 billion [34][41][44].

### Enterprise customer list

As of early 2026, Cognition counts the following organizations among its Devin and Windsurf customers [14][17]:

| Sector | Customers |
|---|---|
| Financial services | Goldman Sachs, Citi, Santander, Nubank |
| Technology | Dell, Cisco, Palantir, Ramp |
| E-commerce/Retail | Mercado Libre |
| Defense | Anduril, Sierra Nevada Corporation (SNC) |
| Government | U.S. Army, U.S. Navy, Treasury Department, NASA-JPL |
| Consulting | Cognizant |

### Cognizant partnership

On January 28, 2026, Cognizant announced a strategic partnership with Cognition to deploy Devin and Windsurf across its engineering organization and global client base [29]. The partnership combines Devin and Windsurf with Cognizant's Flowsource platform, a unified full-stack engineering tool designed to integrate generative and [agentic AI](/wiki/ai_agents) across the software development lifecycle.

Cognizant reported that 30% of its code was already generated with AI at the time of the partnership announcement, with an aim to reach 50% in the near future. The initial focus areas are enterprise modernization and engineering transformation programs [29].

### Government deployment

On February 25, 2026, Cognition launched "Cognition for Government," a program aimed at modernizing critical defense and civilian software systems [24][30]. The program targets a specific pain point: of the $100 billion the U.S. government spends annually on IT, nearly 80% goes to maintaining existing systems rather than building new ones. Only 3 of 10 critical legacy systems flagged by the Government Accountability Office (GAO) in 2019 have been modernized.

Government-specific details:

| Feature | Status |
|---------|--------|
| Windsurf IDE | FedRAMP High authorized; DoD IL4/5/6 accredited; SOC 2 Type II certified |
| Devin | Available in AWS GovCloud; FedRAMP High authorization forthcoming |
| Data retention | Zero data retention policy; CUI and ITAR compliant |
| Deployment options | Cloud and on-premises |

Cognition reports that Devin can complete migrations 5 to 40x faster than human engineers. Common government use cases include SAS to PySpark migrations, COBOL modernization, Angular to React conversions, .NET Framework to .NET Core upgrades, and migrating off proprietary frameworks [30].

### Task suitability

Based on 18 months of production data, Cognition has identified the types of tasks where Devin performs best versus where it struggles:

| Task Type | Devin Performance | Notes |
|---|---|---|
| Well-defined, scoped tasks | Strong | Clear requirements with verifiable outcomes |
| Junior-level tasks (4-8 hours of human work) | Strong | Sweet spot for autonomous completion |
| Bug fixing in familiar codebases | Moderate to strong | Effective when patterns are clear |
| Security vulnerability remediation | Strong | 20x faster than human average |
| Code migration | Strong | 5-40x faster depending on complexity |
| Legacy code refactoring | Moderate | Improves with Devin Wiki context |
| Test coverage expansion | Strong | Routinely increases coverage from 50-60% to 80-90% |
| Complex architectural decisions | Weak | Requires human judgment |
| Open-ended exploration | Weak | Tends to pursue dead ends |
| Large, unfamiliar codebases | Moderate | Improved significantly with 2.0 codebase indexing |

## Funding history

Cognition's funding trajectory reflects the intense competition and high valuations in the AI coding space.

| Round | Date | Amount | Valuation | Lead Investor |
|-------|------|--------|-----------|---------------|
| Series A | Early 2024 | $21M | Undisclosed | Founders Fund |
| Series B | April 2024 | $175M | $2B | Founders Fund |
| Growth round | Early 2025 | Undisclosed | ~$4B | 8VC |
| Series C | August 2025 | ~$500M | $9.8B | Founders Fund |
| Post-Windsurf round | September 2025 | $400M | $10.2B | Founders Fund |
| Series E | May 2026 | $1B+ | $25B pre-money / $26B post-money | Lux Capital, General Catalyst, 8VC [44] |

Peter Thiel's Founders Fund has led multiple rounds, with additional participation from Lux Capital, 8VC, Bain Capital Ventures, Neo, Elad Gil, Definition Capital, D1 Capital, and Hanabi Capital [14][17]. Including the May 2026 round, Cognition has raised well over $1.5 billion across its closed financings.

### Path to a $25 billion valuation

In late April 2026, Bloomberg and SiliconANGLE reported that Cognition was in early talks to raise a new round at a $25 billion valuation, roughly 2.5 times its $10.2 billion mark from September 2025 and more than 12 times the $2 billion Series B price set just two years earlier [34][41]. The round closed on May 27, 2026: TechCrunch reported that "Cognition, the makers of the autonomous AI software engineer named Devin, has raised more than $1 billion at a $25 billion pre-money valuation ($26 billion post money)," with Lux Capital, General Catalyst, and 8VC leading the financing [44].
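
The step-up multiples quoted above can be checked with simple division. The short Python sketch below uses only the valuations stated in this section; the constant and function names are chosen here for illustration.

```python
# Valuations in billions of US dollars, as quoted in the text above.
SERIES_B_2024 = 2.0
POST_WINDSURF_SEPT_2025 = 10.2
PRE_MONEY_MAY_2026 = 25.0


def step_up(new_valuation: float, old_valuation: float) -> float:
    """Return how many times larger the new valuation is than the old one."""
    if old_valuation <= 0:
        raise ValueError("old_valuation must be positive")
    return new_valuation / old_valuation


vs_2025 = step_up(PRE_MONEY_MAY_2026, POST_WINDSURF_SEPT_2025)
vs_series_b = step_up(PRE_MONEY_MAY_2026, SERIES_B_2024)

print(f"vs. September 2025: {vs_2025:.2f}x")  # about 2.45x ("roughly 2.5 times")
print(f"vs. Series B: {vs_series_b:.1f}x")    # 12.5x ("more than 12 times")
```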

Investors and analysts attributed the step-up in valuation to four converging factors:

- The Goldman Sachs and Citi flagship deployments, which validated Devin in highly regulated, security-sensitive enterprise environments
- A reported eightyfold year-over-year increase in enterprise Devin usage, alongside weekly session counts that doubled over a six-week span [19][34]
- Combined Cognition/Windsurf enterprise ARR growth of more than 30% in the seven weeks immediately following the Windsurf acquisition close, with continued momentum through Q1 2026 [14][17]
- Industry-wide repricing of AI coding companies, after Anysphere's Cursor crossed $2 billion in annualized revenue and Anthropic's Claude Code became the top-rated agent in 2026 developer surveys [15][18]

The $25 billion mark placed Cognition in the same valuation bracket as Anysphere and well above other autonomous-agent peers, even as critics noted that the underlying Devin product still resolves only a minority of SWE-bench Verified tasks under its own standard evaluation [38].

### Revenue growth

| Metric | Value | Date |
|---|---|---|
| Devin ARR | ~$1M | September 2024 |
| Devin ARR | ~$73M | June 2025 |
| Windsurf ARR (at acquisition) | $82M | July 2025 |
| Combined estimated ARR | ~$150M+ | Late 2025 |
| Total net burn (company history) | Under $20M | Through mid-2025 |

Cognition's capital efficiency is notable: the company reached $73M in ARR with total net burn under $20M across its entire history (more than $3.65 of ARR per dollar of net burn), one of the most efficient growth trajectories in enterprise software [17].

### Internal adoption metrics

Cognition uses Devin extensively for its own internal development. In its best week of 2025, Cognition's team merged 154 Devin-generated PRs internally. By early 2026, that figure had grown to 659 PRs in a single week, while total weekly Devin sessions across all enterprise customers doubled in just six weeks [19].
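
To put these figures on a common scale, the sketch below computes the growth multiple between the two best-week PR counts and the steady week-over-week rate that a six-week doubling would imply. The constant-rate compounding model is an assumption made for illustration, not something Cognition reports, and the function names are hypothetical.

```python
def growth_multiple(later: float, earlier: float) -> float:
    """Ratio of a later measurement to an earlier one."""
    return later / earlier


def weekly_rate_for_doubling(weeks: float) -> float:
    """Constant week-over-week growth rate that doubles a quantity in `weeks` weeks.

    Assumes smooth compounding: (1 + r) ** weeks == 2.
    """
    return 2 ** (1 / weeks) - 1


pr_growth = growth_multiple(659, 154)  # best-week merged PRs, from the text
weekly = weekly_rate_for_doubling(6)   # sessions doubled in six weeks

print(f"Best-week PR growth: {pr_growth:.2f}x")
print(f"Implied weekly session growth: {weekly:.1%}")
```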

## Competition

Devin operates in a crowded and rapidly evolving market for AI-powered development tools. Its competitive position is shaped by the distinction between fully autonomous agents and interactive coding assistants.

### Autonomous agents vs. interactive assistants

Devin's primary differentiator is its autonomous operation: given a task, it plans, executes, and delivers results with minimal human intervention. Most competing tools take a more interactive approach, assisting developers in real time as they write code rather than working independently.

| Tool | Developer | Type | Pricing (starting) |
|------|-----------|------|----------|
| Devin | Cognition AI | Autonomous agent | $20/month |
| [Claude Code](/wiki/claude_code) | [Anthropic](/wiki/anthropic) | Terminal-based agent | Included with API usage |
| [Cursor](/wiki/cursor) | Anysphere | AI-powered IDE | Free (Hobby) |
| GitHub Copilot | GitHub/[Microsoft](/wiki/microsoft) | IDE extension + agent mode | $10/month |
| Windsurf | Cognition AI | AI-powered IDE | $15/month |
| [Amazon Q Developer](/wiki/amazon_q) | [Amazon Web Services](/wiki/amazon_web_services) | IDE extension + agent | Free tier available |
| SWE-Agent | Princeton NLP | Open-source agent | Free (open source) |
| [Augment Code](/wiki/augment_code) | Augment | AI coding platform | Enterprise pricing |

### Claude Code

[Claude Code](/wiki/claude_code), released by Anthropic in 2025, operates as a terminal-based coding agent that can handle complex multi-file refactors, debug subtle architectural issues, and navigate unfamiliar codebases. By early 2026, Claude Code had achieved a 46% "most loved" rating among developers, making it the top-rated AI coding tool in developer surveys [15]. On SWE-bench Verified, Claude Code's published scaffold scores approximately 78.4%, well above Devin's standard-evaluation 45.8% and its scaffolded 60.8% result [38][42]. Unlike Devin, Claude Code runs locally in the developer's terminal rather than in a cloud-based sandbox, giving developers more direct control over the agent's actions. The two products are increasingly framed as complementary rather than competitive: Claude Code is positioned as a pair-programming partner for human-supervised work, while Devin specializes in delegated, asynchronous tasks where the developer reviews only the final pull request. Cognition itself ships a Devin Agent Preview that uses [Claude Sonnet 4.5](/wiki/claude_sonnet_4_5) as the underlying model, blurring the boundary between the two ecosystems [27].

### Cursor

[Cursor](/wiki/cursor), developed by Anysphere, is an AI-powered code editor built as a fork of Visual Studio Code. It offers Tab autocomplete, chat, multi-file editing, and agent mode with background agents. Cursor surpassed $2 billion in annualized revenue by early 2026 and is used by over half of the Fortune 500 [18]. Cursor's agent mode posts a SWE-bench Verified score around 67.2% with its default scaffold, sitting above Devin's standard-run number but below Claude Code's scaffolded result [38][42]. Cursor takes a more interactive approach than Devin, keeping the developer closely involved in the coding process. Cognition and Anysphere thus occupy opposite ends of the AI coding spectrum: Cursor optimizes for the developer-in-the-loop case where the human authors each commit, while Devin optimizes for the asynchronous case where the human reviews only the final result.

### GitHub Copilot

GitHub Copilot, backed by Microsoft's distribution and GitHub's developer ecosystem, remains the most widely deployed AI coding assistant. In 2025 and 2026, GitHub expanded Copilot with agent mode capabilities and a coding agent that can work on assigned issues autonomously, narrowing some of the feature gaps that initially distinguished tools like Devin [15]. The Copilot coding agent became generally available in September 2025, enabling asynchronous autonomous development within the GitHub ecosystem.

### Amazon Q Developer

[Amazon Q Developer](/wiki/amazon_q), Amazon Web Services' AI coding assistant, has emerged as a strong competitor, particularly in enterprise environments already using AWS. Its agent achieved top scores on the SWE-bench Leaderboard, and the product offers multi-file code implementation, test and documentation generation, and automated code reviews [31].

### SWE-Agent

SWE-Agent, developed by Princeton University's NLP group, is an open-source autonomous coding agent that achieved 12.29% on SWE-bench shortly after Devin's 13.86% result in 2024. Notably, SWE-Agent ran significantly faster, completing tasks in an average of 93 seconds compared to Devin's 5 minutes [32]. The open-source nature of SWE-Agent contributed to rapid community-driven improvements in the autonomous coding agent space.

### General-purpose computer-use agents

Devin sits at the agentic end of the AI coding spectrum, but it is not the only agent that takes actions in a sandboxed environment. [OpenAI Operator](/wiki/openai_operator), released as a research preview in January 2025, runs in a hosted browser and is positioned more as a generalist web automation agent than as a coding tool, although users have demonstrated it filing GitHub issues, browsing documentation, and triggering CI runs. [Anthropic's computer use](/wiki/anthropic_computer_use) capability, introduced in October 2024 with Claude 3.5 Sonnet, lets the model click, type, and read pixels on a virtual desktop and is more often used as a building block by other developers than as an end-user product. Cognition's own demos with Sonnet 4.5 inside Devin's Linux desktop in version 2.2 are conceptually closer to Anthropic's computer-use approach than to a pure coding assistant [12][27].

### Browser-first builders: Replit Agent, Bolt, v0

A separate category of tools focuses on rapid prototyping rather than on long-running engineering tasks: Replit Agent (launched in September 2024 inside Replit's online IDE), Bolt by StackBlitz, and Vercel's v0. These products typically generate full applications from a prompt, run them in a containerized preview, and let users iterate by chatting with the agent. They occupy a niche adjacent to Devin but emphasize design-driven, web-app prototyping rather than the legacy code migrations and bug-fixing workflows that dominate Devin's enterprise deployments.

| Tool | Vendor | Focus | Typical task length |
|------|--------|-------|---------------------|
| Devin | Cognition AI | Autonomous engineering on production codebases | Hours to days |
| Claude Code | Anthropic | Terminal agent for human-supervised coding | Minutes to hours |
| Cursor (background agents) | Anysphere | IDE with optional agent mode | Seconds to minutes |
| GitHub Copilot (agent mode) | GitHub/Microsoft | IDE assistant plus issue-attached coding agent | Minutes to hours |
| OpenAI Operator | OpenAI | Generalist browser-based web agent | Minutes |
| Anthropic computer use | Anthropic | Pixel-level desktop automation primitive | Variable |
| Replit Agent | Replit | Browser-based app builder | Minutes |
| Bolt | StackBlitz | Web app prototyping in WebContainers | Minutes |
| v0 | Vercel | UI generation from natural language | Seconds to minutes |

### Developer reception and sentiment

Developer sentiment toward Devin has evolved significantly since the controversial launch:

| Period | Sentiment | Key Driver |
|---|---|---|
| March 2024 (launch) | Highly skeptical | Demo accuracy concerns, "first AI software engineer" backlash |
| Late 2024 (GA) | Cautious | $500/month price barrier, Answer.AI evaluation showing 70% failure rate |
| April 2025 (2.0 launch) | Improving | Price reduction to $20/month, IDE interface, improved performance |
| Mid-2025 (Windsurf acquisition) | Mixed positive | Combined product strategy, enterprise traction |
| Early 2026 (2.2 release) | Moderately positive | Desktop computer use, self-reviewing PRs, Goldman Sachs adoption |
| Spring 2026 ($25B raise) | Polarized | Wall Street rollouts and SWE-1.6 gains balanced against continued benchmark gap with Claude Code and ongoing Windsurf pricing complaints [34][43] |

As of February 2026, Windsurf ranked #1 in the LogRocket AI Dev Tool Power Rankings, ahead of both Cursor and GitHub Copilot, suggesting that Cognition's combined product portfolio is resonating with developers [21].

## Devin's performance review (2025)

In late 2025, Cognition published "Devin's 2025 Performance Review," reflecting on 18 months of operating coding agents in production environments [19]. The review provided a candid assessment of both achievements and ongoing challenges.

### Key achievements

- Devin merged hundreds of thousands of PRs across thousands of companies
- PR merge rate improved from 34% to 67% year over year
- Problem-solving speed increased 4x
- Resource efficiency improved 2x
- Codebase understanding improved substantially, which Cognition identified as a primary driver of the doubled merge rate

### Enterprise results

The performance review included specific customer metrics:

| Customer/Use Case | Result |
|---|---|
| Large organization (security fixes) | 5-10% savings of total developer time; Devin fixes a vulnerability in about 1.5 min vs. a 30 min human average |
| Bank (ETL migration) | Tasks completed in 3-4 hours vs. 30-40 hours for humans |
| Java version migration | 14x less time than human engineers |
| Nubank (data migrations) | 12x efficiency improvement in engineering hours; 20x cost savings |
| EightSleep | 3x more data features and investigations shipped |
| Litera | 40% test coverage increase; 93% faster regression cycles |
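
The time-based rows in the table translate directly into speedup factors. The sketch below derives them from the quoted figures; treating the ETL ranges as independent bounds is an assumption made here, and the function names are illustrative.

```python
def speedup(human_time: float, agent_time: float) -> float:
    """How many times faster the agent is, given times in the same unit."""
    return human_time / agent_time


def speedup_range(human_lo: float, human_hi: float,
                  agent_lo: float, agent_hi: float) -> tuple[float, float]:
    """Most conservative and most optimistic speedups for two ranges."""
    return human_lo / agent_hi, human_hi / agent_lo


# Security fixes: 30 minutes (human average) vs. 1.5 minutes (Devin).
security = speedup(30, 1.5)

# ETL migration: 30-40 human hours vs. 3-4 Devin hours.
etl_low, etl_high = speedup_range(30, 40, 3, 4)

print(f"Security fixes: {security:.0f}x faster")
print(f"ETL migration: {etl_low:.1f}x to {etl_high:.1f}x faster")
```

The security figure works out to 20x, which matches the "20x faster than human average" entry in the task suitability table.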

### Acknowledged limitations

The review also acknowledged that autonomous agents face inherent challenges in real-world software development, including:

- Difficulty with large, complex codebases where understanding context is critical
- Variability in output quality across different types of tasks
- The need for human oversight to catch errors that autonomous systems miss
- Long execution times for tasks that a skilled developer could complete quickly

Cognition stated that in 2026, it would continue to focus on making Devin better at understanding real-world codebases, investing in UX to make Devin easier to direct, and expanding the range of tasks the agent can handle autonomously [19].

## Industry impact

### Workforce implications

Devin's launch and subsequent enterprise adoption have intensified the ongoing debate about AI's impact on software engineering jobs. Goldman Sachs' adoption of Devin specifically raised questions about whether autonomous coding agents would reduce demand for human developers [23].

The prevailing view among industry leaders by early 2026 is that AI coding agents will shift the nature of software engineering work rather than eliminate it entirely. Goldman Sachs emphasized that Devin would handle "drudgery" tasks like updating internal code to newer programming languages, freeing human engineers for higher-level work. However, business leaders have also warned that early-career roles may be most vulnerable, with some predicting AI could significantly reduce the number of entry-level programming positions within five years [23].

At Nubank, engineers who delegated migrations to Devin reported a 12x improvement in engineering hours and more than 20x cost savings, suggesting that the impact is concentrated in repetitive, well-defined tasks rather than creative or architectural work [19].

### Broader market trends

Devin's launch in March 2024 is widely credited with accelerating investment and development in the AI coding agent space. The viral demo, despite its controversies, demonstrated the concept of a fully autonomous software engineering agent to a broad audience and prompted competitors to accelerate their own agent capabilities.

By early 2026, real-world data suggests that AI coding assistants deliver 20-30% productivity gains concentrated in specific workflows, with benefits varying by developer experience and team size [31]. This is notably more modest than the 10x productivity improvements initially promised during the hype cycle of 2023-2024, but still represents meaningful value for engineering organizations.

## Current state (2026)

As of mid-2026, Cognition AI operates both the autonomous Devin agent and the Windsurf IDE, positioning itself as a company that addresses the full spectrum of AI-assisted development: from interactive coding assistance (Windsurf) to fully autonomous task completion (Devin).

The company raised more than $1 billion at a $25 billion pre-money valuation ($26 billion post-money) in a round that closed on May 27, 2026, led by Lux Capital, General Catalyst, and 8VC, bringing its total closed financing to well over $1.5 billion [34][41][44]. With the Windsurf acquisition providing a large enterprise customer base and established ARR, Cognition has evolved from a single-product startup into a multi-product AI development platform. The company's combined portfolio serves enterprise customers across financial services, technology, e-commerce, defense, and government, with Goldman Sachs and Citi serving as flagship Wall Street deployments [22][36].

Recent product developments include Devin 2.2's desktop computer-use capabilities, self-reviewing pull requests, 3x faster startup times, and Linear integration. The Windsurf IDE continues to be developed with proprietary models including SWE-1.5 and SWE-1.6, which are served at up to 950 tokens per second through a partnership with Cerebras and which Cognition reports score above the leading open-source coding models on SWE-Bench Pro [26][35][37]. Cognition for Government, launched in February 2026, brings Devin and Windsurf to federal agencies and defense contractors, with FedRAMP High authorization for Windsurf and a forthcoming FedRAMP High version of Devin.

The Cognizant partnership, announced in January 2026, represents a new go-to-market channel through one of the world's largest IT services companies, with Cognizant integrating Devin and Windsurf into its engineering transformation offerings for global enterprise clients. By spring 2026, Cognition reported that enterprise Devin usage had grown roughly eightyfold year over year, with combined Cognition and Windsurf ARR more than doubling in the months following the acquisition close [14][34].

The broader market for AI coding tools continues to expand rapidly. Claude Code, Cursor, GitHub Copilot, and a growing number of competitors are all investing heavily in agent capabilities. The central question for Devin and autonomous coding agents in general remains whether full autonomy or human-in-the-loop collaboration will prove to be the dominant paradigm for AI-assisted software development.

Devin's journey from a viral demo to a production tool used by thousands of companies illustrates both the promise and the challenges of autonomous [AI agents](/wiki/ai_agents). While the initial marketing claims were widely seen as overstated, the underlying technology has improved substantially. The concept of AI agents that can independently complete software engineering tasks has become a mainstream area of investment and development across the industry, with Devin playing a central role in shaping both the technology and the discourse around it.

## See also

- [Hark](/wiki/hark_ai)
- [Sekai](/wiki/sekai)
- [NeoCognition](/wiki/neocognition)
- [Rox](/wiki/rox)
- [Wispr (Wispr Flow)](/wiki/wispr)
- [AI code generation](/wiki/ai_code_generation)
- [AI agents](/wiki/ai_agents)
- [Cursor](/wiki/cursor)
- [GitHub Copilot](/wiki/github_copilot)
- [Claude Code](/wiki/claude_code)
- [SWE-bench](/wiki/swe_bench)
- [Large language model](/wiki/large_language_model)
- [Cognition AI](/wiki/cognition_ai)

## References

1. [Introducing Devin, the first AI software engineer - Cognition AI](https://cognition.ai/blog/introducing-devin)
2. [Devin is now generally available - Cognition AI](https://cognition.ai/blog/devin-generally-available)
3. [Devin 2.0 is here: Cognition slashes price of AI software engineer to $20 per month from $500 - VentureBeat](https://venturebeat.com/programming-development/devin-2-0-is-here-cognition-slashes-price-of-ai-software-engineer-to-20-per-month-from-500)
4. [Cognition's acquisition of Windsurf - Cognition AI Blog](https://cognition.ai/blog/windsurf)
5. [Scott Wu - Wikipedia](https://en.wikipedia.org/wiki/Scott_Wu)
6. [Cognition AI Raises $175M at $2B Valuation, One Month After Series A - Maginative](https://www.maginative.com/article/cognition-ai-raises-175m-at-2b-valuation-one-month-after-series-a/)
7. [Cognition Labs Previews Devin AI Software Engineer - DevOps.com](https://devops.com/cognition-labs-previews-devin-ai-software-engineer/)
8. [Is Devin Fake? The Discussion Continues In The Dev Community - Codemotion](https://www.codemotion.com/magazine/ai-ml/is-devin-fake/)
9. [Devin Pricing - Devin.ai](https://devin.ai/pricing/)
10. [Debunking Devin: 'First AI Software Engineer' Upwork Lie Exposed - Hacker News discussion](https://news.ycombinator.com/item?id=40008109)
11. [Thoughts On A Month With Devin - Answer.AI](https://www.answer.ai/posts/2025-01-08-devin.html)
12. [Introducing Devin 2.2 - Cognition AI Blog](https://cognition.ai/blog/introducing-devin-2-2)
13. [Cognition to buy AI startup Windsurf days after Google poached CEO in $2.4 billion licensing deal - CNBC](https://www.cnbc.com/2025/07/14/cognition-to-buy-ai-startup-windsurf-days-after-google-poached-ceo.html)
14. [Cognition valued at $10.2 billion two months after Windsurf purchase - CNBC](https://www.cnbc.com/2025/09/08/cognition-valued-at-10point2-billion-two-months-after-windsurf-.html)
15. [Best AI Coding Agents for 2026: Real-World Developer Reviews - Faros AI](https://www.faros.ai/blog/best-ai-coding-agents-2026)
16. [Devin 2.0 Explained: Features, Use Cases, and How It Compares to Windsurf and Cursor - Analytics Vidhya](https://www.analyticsvidhya.com/blog/2025/04/devin-2-0/)
17. [Cognition: Funding, growth, and the next frontier of AI coding agents - Cognition AI](https://cognition.ai/blog/funding-growth-and-the-next-frontier-of-ai-coding-agents)
18. [Cursor's Anysphere nabs $9.9B valuation, soars past $500M ARR - TechCrunch](https://techcrunch.com/2025/06/05/cursors-anysphere-nabs-9-9b-valuation-soars-past-500m-arr/)
19. [Devin's 2025 Performance Review: Learnings From 18 Months of Agents At Work - Cognition AI](https://cognition.ai/blog/devin-annual-performance-review-2025)
20. [Devin 2.2: Desktop and Code Review AI Guide - Digital Applied](https://www.digitalapplied.com/blog/devin-2-desktop-code-review-ai-engineer-guide)
21. [Windsurf Review 2026: Codeium AI IDE - Taskade](https://www.taskade.com/blog/windsurf-review)
22. [Goldman Sachs is piloting its first autonomous coder in major AI milestone for Wall Street - CNBC](https://www.cnbc.com/2025/07/11/goldman-sachs-autonomous-coder-pilot-marks-major-ai-milestone.html)
23. [Meet Devin the AI Software Engineer, Employee #1 in Goldman Sachs' Hybrid Workforce - IBM Think](https://www.ibm.com/think/news/goldman-sachs-first-ai-employee-devin)
24. [AI Coder Devin Aims to Modernize Government Systems - Bloomberg Law](https://news.bloomberglaw.com/artificial-intelligence/ai-software-engineer-devin-rolled-out-for-government-use)
25. [Report: Cognition Business Breakdown and Founding Story - Contrary Research](https://research.contrary.com/company/cognition)
26. [Introducing SWE-1.5: Our Fast Agent Model - Cognition AI](https://cognition.ai/blog/swe-1-5)
27. [Announcing Devin Agent Preview with Sonnet 4.5 - Cognition AI](https://cognition.ai/blog/devin-agent-preview-sonnet-4-5)
28. [Did the makers of Devin AI lie about their capabilities? - Medium](https://machine-learning-made-simple.medium.com/did-the-makers-of-devin-ai-lie-about-their-capabilities-cdfa818d5fc2)
29. [Cognizant and Cognition Partner to Scale Autonomous Software Engineering - Cognizant](https://news.cognizant.com/2026-01-28-Cognizant-and-Cognition-Partner-to-Scale-Autonomous-Software-Engineering-and-Deliver-Business-Value-Across-Enterprise-Operations)
30. [Introducing Cognition for Government - Cognition AI](https://cognition.ai/blog/cognition-for-government)
31. [AI-Assisted Coding in 2026: How GitHub Copilot, Cursor, and Amazon Q Are Reshaping Developer Workflows - Java Code Geeks](https://www.javacodegeeks.com/2025/12/ai-assisted-coding-in-2026-how-github-copilot-cursor-and-amazon-q-are-reshaping-developer-workflows.html)
32. [SWE-bench technical report - Cognition AI](https://cognition.ai/blog/swe-bench-technical-report)
33. [World's 'first AI software engineer' fails 85% of its assigned tasks - Tweaktown](https://www.tweaktown.com/news/102761/worlds-first-ai-software-engineer-fails-85-of-its-assigned-tasks/index.html)
34. [AI Coding Firm Cognition in Funding Talks at $25 Billion Value - Bloomberg](https://www.bloomberg.com/news/articles/2026-04-23/ai-coding-firm-cognition-in-funding-talks-at-25-billion-value)
35. [An Early Preview of SWE-1.6 and Research Update - Cognition AI](https://cognition.ai/blog/swe-1-6-preview)
36. [Citi is rolling out agentic AI to its 40,000 developers - American Banker](https://www.americanbanker.com/news/citi-is-rolling-out-agentic-ai-to-its-40-000-developers)
37. [Introducing SWE-1.6: Improving Model UX - Cognition AI](https://cognition.ai/blog/swe-1-6)
38. [SWE-Bench Coding Agent Leaderboard 2026: Claude vs GPT - Awesome Agents](https://awesomeagents.ai/leaderboards/swe-bench-coding-agent-leaderboard/)
39. [Devin AI Review 2026: The Ultimate Hands-On Benchmark Test of the Autonomous AI Software Engineer - AIToolRanked](https://aitoolranked.com/blog/devin-ai-review)
40. [Citi moves into more secure agentic AI - Axios](https://www.axios.com/2026/04/30/exclusive-citi-moves-into-agentic-ai)
41. [Cognition, creator of the AI software engineer Devin, in talks to raise hundreds of millions at $25B valuation - SiliconANGLE](https://siliconangle.com/2026/04/23/cognition-creator-ai-software-engineer-devin-talks-raise-hundreds-millions-25b-valuation/)
42. [Claude Code vs Cursor vs Devin vs Copilot in 2026 - Data Science Collective on Medium](https://medium.com/data-science-collective/claude-code-vs-cursor-vs-devin-vs-copilot-in-2026-the-comparison-everyone-is-still-getting-wrong-5afd6ceff3e7)
43. [Windsurf Review 2026: Cascade AI After Cognition (Tested) - Taskade](https://www.taskade.com/blog/windsurf-review)
44. [AI coding startup Cognition raises $1B at $25B pre-money valuation - TechCrunch](https://techcrunch.com/2026/05/27/ai-coding-startup-cognition-raises-1b-at-25b-pre-money-valuation/)