Cognition AI
Last reviewed
May 7, 2026
Sources
21 citations
Review status
Source-backed
Revision
v1 · 3,795 words
Cognition AI is an American artificial intelligence company headquartered in San Francisco, California. Founded in late 2023 by competitive programmers Scott Wu, Steven Hao, and Walden Yan, the company focuses on building autonomous AI software engineering agents. Its flagship product, Devin, launched in March 2024 as the first publicly announced AI system designed to perform software engineering tasks end-to-end with minimal human supervision. Cognition acquired Windsurf, an agentic integrated development environment, in July 2025. As of April 2026, the company was in talks to raise a new funding round at a $25 billion valuation.
Cognition was founded in late 2023 by Scott Wu, Steven Hao, and Walden Yan, all of whom share an unusual background: each won gold medals at the International Olympiad in Informatics (IOI), the premier global competition in algorithmic problem-solving for secondary students.
Scott Wu, who serves as CEO, was born in Louisiana in 1997 to a Chinese immigrant family. He attended Baton Rouge Magnet High School before enrolling at Harvard University. Wu won three IOI gold medals, in 2012, 2013, and 2014, placing first overall in the 2014 competition. He was also the Mathcounts national champion in 2011, representing Louisiana, won the Harvard-MIT Mathematics Tournament (HMMT) in 2014, earned a gold medal with Harvard's team at the International Collegiate Programming Contest (ICPC) in 2016, and placed third at the Google Code Jam in 2021. On Codeforces, the online competitive programming platform, Wu achieved Legendary Grandmaster status with a peak rating above 3,000. He left Harvard without completing his degree. Before Cognition, Wu co-founded Lunchclub, an AI-powered professional networking platform, serving as its CTO from 2017 to 2022. Forbes named him to its 30 Under 30 list in 2020.
Steven Hao, who serves as CTO, graduated from MIT with a degree in computer science and mathematics. He was an IOI gold medalist in 2014, finishing sixth globally. After MIT, he joined Scale AI as one of its earliest senior engineers, working there from 2018 onward before co-founding Cognition.
Walden Yan, who serves as CPO, was an IOI gold medalist in 2020, finishing nineteenth globally. He enrolled at Harvard but left in 2023 without graduating. Before Cognition, he worked briefly on the Cursor code editor at Anysphere and co-founded DeepReason, a web3 security startup.
The three founders knew each other through competitive programming circuits, having crossed paths at IOI events and at Harvard and MIT. They initially explored cryptocurrency-related projects before pivoting to generative AI in late 2022 as models like ChatGPT accelerated interest across Silicon Valley. The company was formally established in late 2023.
The early team extended well beyond the three co-founders. As of March 2024, Cognition's ten-person team collectively held ten IOI gold medals. A later hire was Gennady Korotkevich, widely regarded as the greatest competitive programmer of all time. Korotkevich, known by the handle "tourist" on competitive programming platforms, joined Cognition as a software engineer in December 2024. In July 2025 he announced that Cognition had become "the first AI lab to win a verified gold medal at the IOI" through its AI system ryanbAI, which placed seventh overall in the competition under the same conditions as human contestants.
The concentration of elite competitive programmers was intentional. Wu has argued that the reasoning skills developed through algorithmic competition, specifically the ability to break complex problems into sequences of logical steps and adapt strategies under time pressure, translate directly into the kind of planning and debugging that autonomous software engineering requires.
Cognition operated in stealth through early 2024. A formative internal moment came in late 2023 when an early version of Devin autonomously solved a server configuration problem before the Christmas holiday, which the team cited as evidence the approach was viable.
The company emerged from stealth on March 12, 2024, with a blog post and demo video announcing Devin. The video, showing Devin completing software tasks from start to finish, accumulated over 30 million views on X by early 2026.
Cognition has raised capital primarily from Founders Fund, Peter Thiel's venture firm, across multiple rounds.
| Round | Date | Amount | Valuation | Lead investor |
|---|---|---|---|---|
| Series A | March 2024 | $21 million | $350 million | Founders Fund |
| Series A2 | April 2024 | $175 million | ~$2 billion | Founders Fund |
| Undisclosed round | March 2025 | Undisclosed | $4 billion | 8VC (Joe Lonsdale) |
| Series C | September 2025 | $400 million | $10.2 billion | Founders Fund |
| Funding talks | April 2026 | Hundreds of millions (reported) | $25 billion | Undisclosed |
The Series A in March 2024 raised $21 million at a $350 million valuation. The round was supported by Patrick Collison and John Collison of Stripe, Elad Gil, and others alongside Founders Fund. One month later, Cognition closed a second tranche of $175 million at approximately $2 billion, making it a unicorn within its first year of public visibility. That April 2024 round was also led by Founders Fund.
In March 2025, a new round led by 8VC, the firm co-founded by Joe Lonsdale, valued Cognition at $4 billion. After the Windsurf acquisition closed in July 2025, the combined company's valuation reached approximately $9.7 billion on secondary markets before the formal Series C closed.
In September 2025, Cognition closed $400 million at a $10.2 billion post-money valuation. Founders Fund led the round again. Returning investors included Lux Capital, 8VC, Neo, Elad Gil, Definition Capital, and Swish VC. New investors included Bain Capital Ventures, Hanabi Capital, and D1 Capital. Total funding raised through that point was approximately $896 million. The company disclosed it had kept total net burn under $20 million since founding, an unusually low figure given the pace of revenue growth and capital raised.
In April 2026, Bloomberg reported that Cognition was in early talks to raise a new round that would value the company at $25 billion, more than doubling the September 2025 valuation. The talks followed continued growth in the vibe coding and autonomous engineering markets, and the company had not finalized terms as of the report date.
Cognition's core product is Devin, an autonomous AI agent designed to complete software engineering tasks from a natural language description. Unlike code completion tools that assist developers in writing individual lines or functions, Devin is designed to take on multi-step engineering projects: writing code, running tests, debugging failures, browsing documentation, and opening pull requests without continuous human input.
When Devin launched in March 2024, it achieved a score of 13.86% on the SWE-bench benchmark, a dataset of real GitHub issues drawn from popular open-source repositories. That figure far exceeded the previous state of the art of 1.96% and represented the first time any autonomous system had resolved more than a small fraction of real-world software issues without being told which files to look at. Cognition positioned Devin not as a replacement for software engineers but as an autonomous teammate that could handle delegated tasks while engineers focus on higher-level architecture and product decisions.
The original product required a subscription starting at $500 per month and operated through a web interface where developers submitted tasks and reviewed completed pull requests.
For extended coverage of Devin's capabilities, benchmarks, reception, and technical architecture, see Devin (AI software engineer).
In April 2025, Cognition released Devin 2.0 with several major changes. The most visible was a price reduction from $500 to $20 per month for a starter plan, a 96% cut that significantly broadened the accessible user base. According to Cognition's internal benchmarks, Devin 2.0 completed 83% more junior-level development tasks per Agent Compute Unit (ACU) compared to prior versions.
Devin 2.0 introduced an agent-native integrated development environment where users and Devin could work together within the same interface. Developers could edit and refine code directly inside the Devin IDE using familiar shortcuts. The update also added the ability to run multiple Devin instances in parallel, each with its own cloud-based IDE, allowing teams to process concurrent tasks.
Three new features shipped with Devin 2.0. Interactive Planning allowed Devin to analyze a codebase and generate a preliminary plan within seconds, which the user could review and adjust before autonomous execution began. Devin Search gave users a way to ask questions about their codebase and receive detailed, cited answers through agentic exploration, with a Deep Mode option for more complex queries. Devin Wiki automatically indexed repositories into a structured wiki containing architecture diagrams, source links, and generated documentation, refreshed every few hours as the codebase changed.
Devin 2.1 added a confidence scoring system. At the start of each session, after planning, and during code-related queries, Devin shows a green, yellow, or red confidence indicator expressing how certain it is that it can complete the task successfully. Sessions with green confidence scores produced pull requests that merged at roughly twice the rate of sessions with red scores. When confidence is low, Devin asks clarifying questions before proceeding.
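The gating behavior described above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Confidence` enum, the numeric thresholds, and the `next_action` function are assumptions chosen for illustration, not Cognition's implementation, and the article does not say how a yellow score is handled (here it is assumed to proceed).

```python
from enum import Enum


class Confidence(Enum):
    GREEN = "green"
    YELLOW = "yellow"
    RED = "red"


def classify(score: float) -> Confidence:
    """Map a score in [0, 1] to a traffic-light level (thresholds are assumed)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score >= 0.75:
        return Confidence.GREEN
    if score >= 0.4:
        return Confidence.YELLOW
    return Confidence.RED


def next_action(score: float) -> str:
    """Ask the user for clarification when confidence is low; otherwise proceed."""
    if classify(score) is Confidence.RED:
        return "ask_clarifying_question"
    return "proceed"
```

The point of the design is that the question is asked before any work starts, so a low-confidence task costs the user a short exchange rather than a failed session.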
The 2.1 update also integrated Devin's codebase understanding technology directly into the task workflow. Users can query the codebase mid-session using the "!ask" command or allow Devin to automatically scan for context when it detects a gap in its understanding. The same underlying technology also appeared in DeepWiki, a separate product Cognition released for generating documentation wikis from public repositories.
Through the Linear and Jira integrations, teams could request confidence assessments on multiple backlog issues simultaneously without launching full Devin sessions, letting engineering managers prioritize which tasks to delegate based on predicted success likelihood.
Devin 2.2, released in February 2026, introduced desktop testing through computer use. Devin gained full access to its own Linux desktop environment, allowing it to launch and interact with desktop applications as part of testing a pull request. After completing code changes, Devin could run the application on its own desktop, record the screen, and return that recording for human review before the pull request was merged. Desktop testing was enabled by default for new sessions.
The 2.2 update also shipped Devin Review Autofix, a self-review loop in which Devin plans, writes code, reviews its own output, identifies issues, and fixes them before the pull request is opened. The interface was fully rebuilt to unify the development lifecycle from planning through code review in a single view. Startup time improved by a factor of three.
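A self-review loop of the kind described for Devin Review Autofix can be sketched in a few lines. This is a generic illustration under stated assumptions, not Cognition's code: the `self_review_loop` function, its callback parameters, the `max_rounds` cap, and the toy stand-ins are all hypothetical.

```python
from typing import Callable, List


def self_review_loop(
    write: Callable[[str], str],
    review: Callable[[str], List[str]],
    fix: Callable[[str, List[str]], str],
    task: str,
    max_rounds: int = 3,
) -> str:
    """Draft a change, then review and fix it until no issues remain.

    Returns the final draft; a caller would open the pull request only
    after this function returns.
    """
    draft = write(task)
    for _ in range(max_rounds):
        issues = review(draft)
        if not issues:
            break
        draft = fix(draft, issues)
    return draft


# Toy stand-ins so the loop can be run end to end.
def toy_write(task: str) -> str:
    return f"def solve():\n    print('debug')\n    return '{task}'\n"


def toy_review(code: str) -> List[str]:
    return ["stray debug print"] if "print('debug')" in code else []


def toy_fix(code: str, issues: List[str]) -> str:
    return code.replace("    print('debug')\n", "")


final = self_review_loop(toy_write, toy_review, toy_fix, "add login")
```

The round cap is an assumption added so that a reviewer that keeps finding issues cannot stall the loop indefinitely.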
Cognition introduced MultiDevin for enterprise customers, enabling parallelized task execution across multiple agents. In the MultiDevin configuration, one manager agent coordinates up to ten worker agents, each running independently on separate tasks. This allows engineering teams to delegate entire sprints rather than individual tickets.
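The manager-and-workers arrangement is a standard fan-out pattern. The sketch below shows the general idea using a Python thread pool; it is not MultiDevin's implementation, and `run_manager`, `toy_worker`, and the ticket names are hypothetical. Only the ceiling of ten workers comes from the description above.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

MAX_WORKERS = 10  # ceiling of ten worker agents, as stated in the article


def run_manager(tasks: List[str], worker: Callable[[str], str]) -> Dict[str, str]:
    """Fan tasks out to at most MAX_WORKERS concurrent workers and collect results."""
    with ThreadPoolExecutor(max_workers=MAX_WORKERS) as pool:
        # pool.map preserves input order, so results line up with tasks.
        results = list(pool.map(worker, tasks))
    return dict(zip(tasks, results))


def toy_worker(task: str) -> str:
    """Stand-in for an agent session; a real worker would run for much longer."""
    return f"done: {task}"


report = run_manager([f"ticket-{i}" for i in range(25)], toy_worker)
```

With 25 tickets and a cap of ten, the pool starts the remaining tickets as earlier ones finish, which is the sense in which a backlog larger than the worker count can still be delegated in one request.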
See Windsurf (software).
In July 2025, Cognition acquired Windsurf, an agentic IDE developed by Codeium. The acquisition came days after Google completed a $2.4 billion licensing deal in which Windsurf's CEO Varun Mohan, co-founder Douglas Chen, and key research staff left to join Google. That departure left Windsurf's product, brand, remaining employees, and intellectual property available.
Cognition president Russell Kaplan described the deal as coming together over a single weekend, with a first call placed after 5 p.m. on a Friday and a definitive agreement signed Monday morning. While terms were not disclosed publicly, later reporting estimated the price at approximately $250 million.
At the time of acquisition, Windsurf had $82 million in annual recurring revenue, more than 350 enterprise customers, and hundreds of thousands of daily active users, with enterprise ARR roughly doubling quarter over quarter. All Windsurf employees participated financially in the deal, with vesting cliffs waived for prior work and full acceleration of existing vesting schedules.
Following the acquisition, Scott Wu framed the combination as pairing Devin's autonomous agent capabilities with Windsurf's developer-facing IDE and established enterprise sales infrastructure. The two products addressed different parts of the software development workflow: developers actively coding used Windsurf, while task delegation and autonomous execution ran through Devin. Wu described the combination as "a massive unlock" toward the company's goal of building the future of software engineering.
Within seven weeks of the acquisition closing, combined enterprise ARR grew over 30%. The Windsurf IDE was updated to include full access to the latest Claude models from Anthropic, and Cognition began integrating its proprietary SWE-1.5 model family into the Windsurf environment.
Alongside the Devin product, Cognition developed its own foundation model family. The SWE-1.5 series is optimized for long-horizon coding tasks and operates at approximately 13 times the speed of Claude Sonnet 4.5 on equivalent workloads according to Cognition's internal benchmarks. Supporting the SWE-1.5 models are SWE-grep and SWE-grep-mini, specialized tools for fast parallel repository search, running at roughly 20 times and 4.5 times the speed of Claude Haiku 4.5 respectively. Cognition deployed these models on Cerebras hardware for optimized inference.
In May 2024, Cognition partnered with Microsoft to integrate Devin with Microsoft Azure. The partnership included both infrastructure (Azure as Cognition's cloud provider) and commercial collaboration through Microsoft's enterprise sales and engineering teams. At Microsoft Build 2024, Microsoft CTO Kevin Scott described Devin as "an absolutely amazing tool."
Through the Microsoft relationship, Devin was deployed at enterprise customers including Visma, a Norwegian fintech company with approximately $2 billion in annual revenue that was undergoing a large-scale cloud migration. Visma reported a 50% reduction in migration project costs and developer productivity gains of up to 2x in some scenarios.
Cognition launched Devin commercially in mid-2024. Annual recurring revenue grew from approximately $1 million in September 2024 to $73 million by June 2025, a 73-fold increase in nine months. Following the Windsurf acquisition in July 2025, which brought Windsurf's approximately $82 million ARR, combined company ARR was estimated at roughly $150 to $155 million by mid-2025.
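The arithmetic behind these figures can be checked directly. The short calculation below uses only the numbers quoted above; the implied monthly growth factor is a derived value that assumes smooth compounding over the nine months, which the sources do not claim.

```python
start_arr = 1.0   # $ millions, September 2024 (approximate, per the article)
end_arr = 73.0    # $ millions, June 2025
months = 9

fold_increase = end_arr / start_arr
# Assumption: growth compounded evenly month over month.
monthly_factor = fold_increase ** (1 / months)

windsurf_arr = 82.0  # $ millions at the time of acquisition
combined_arr = end_arr + windsurf_arr

print(f"fold increase: {fold_increase:.0f}x")
print(f"implied average monthly growth factor: {monthly_factor:.2f}")
print(f"simple sum of the two ARR figures: ${combined_arr:.0f}M")
```

The simple sum lands at the top of the $150 to $155 million range quoted for the combined company.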
Enterprise customers as of late 2025 included Goldman Sachs, Citigroup, Dell Technologies, Cisco Systems, Ramp, Palantir, Nubank, Mercado Libre, OpenSea, and Curai Health. Devin had merged hundreds of thousands of pull requests across customer deployments by 2025.
One documented case study involved Linktree, the social media link-in-bio platform. In February 2025, Devin authored approximately 300 pull requests in one month, of which roughly 100 were merged. A Nubank case study, developed in partnership with Microsoft, reported an 8-fold improvement in engineering efficiency and 20-fold cost savings on a monolithic codebase refactoring project.
Pricing as of 2026 starts at $20 per month for individual access. A Team Plan at $500 per month includes unlimited seats with usage-based compute billing measured in Agent Compute Units (ACUs). Enterprise pricing is custom and includes VPC deployment, custom fine-tuned Devin instances, dedicated account support, and custom legal and security terms.
The March 2024 announcement drew significant attention across the software industry. The demo video showing Devin completing a full-stack programming task went viral, and industry figures including the Collison brothers and Elad Gil publicly backed the company. Within weeks, the company had secured $175 million in additional funding and a $2 billion valuation.
Criticism of the launch appeared quickly. The YouTube channel Internet of Bugs published a detailed breakdown of the original demo video, arguing that the Upwork task Devin appeared to complete had been misrepresented and that the autonomous completion shown relied on conditions not disclosed in the promotional material. AI researcher Devansh published similar concerns on Medium. These critiques raised questions about whether Cognition had overstated Devin's real-world capabilities during the launch.
An evaluation by Answer.AI, an applied AI research lab, attempted 20 tasks using Devin in late 2024. The evaluation found 14 failures, 3 inconclusive results, and 3 successes, a success rate of 15%. The evaluators noted that Devin's autonomous nature became a liability in some cases, where it continued pursuing unworkable approaches for extended periods rather than recognizing when a task was structurally blocked.
The SWE-bench score of 13.86% was itself questioned in some quarters. While the number represented a genuine improvement over prior systems, critics noted that failing 86% of tasks was a meaningful limitation for a product positioned as an autonomous software engineer, and that the benchmark's curated subset of issues may not reflect the distribution of tasks encountered in production environments.
By 2025, Cognition's public messaging shifted to emphasize Devin as a tool for augmenting engineering teams rather than replacing individual engineers. Internal performance metrics from November 2025 showed a pull request merge rate of approximately 67%, compared to 34% in earlier versions, alongside reported improvements in speed and compute efficiency.
In 2025, Cognition's ryanbAI system competed at the International Olympiad in Informatics under the same conditions as human contestants. According to Gennady Korotkevich, who announced the result, ryanbAI placed seventh overall and earned a gold medal, making Cognition the first AI lab to win a verified IOI gold medal.
Cognition operates in a crowded market for AI coding tools, competing with products from Anthropic, OpenAI, Microsoft, and several well-funded startups including Cursor.
| Attribute | Devin (Cognition) | Claude Code (Anthropic) | Codex (OpenAI) | GitHub Copilot (Microsoft) |
|---|---|---|---|---|
| Architecture | Cloud-based autonomous agent | Terminal-based agentic assistant | Cloud-based asynchronous agent | IDE plugin |
| Where code runs | Cognition cloud infrastructure | Local machine (developer's environment) | OpenAI-managed cloud containers | Local environment |
| Primary interface | Web app + IDE (Windsurf) | Terminal / shell | ChatGPT web interface | IDE extension |
| Underlying model | SWE-1.5 (proprietary) | Claude Sonnet 4 / Opus 4 | codex-1 (fine-tuned o3) | GPT-4 family |
| Parallelism | Yes (MultiDevin, up to 10 agents) | Agent Teams (lead + sub-agents) | Independent parallel agents | No |
| Starting price | $20/month | Included with Claude Pro | Included with ChatGPT Plus | $10/month (individual) |
| SWE-bench (representative) | 13.86% (March 2024, original Devin) | 70.3% (Claude Code, verified) | High (o3-based, specific figure varies by report) | N/A |
The most significant structural difference between Devin and terminal-based tools like Claude Code is where execution happens. Claude Code runs in the developer's own environment with direct access to local files, while Devin and Codex (OpenAI) run in cloud-controlled sandboxes, submitting results as pull requests. Developers who need tight control over their local environment tend to prefer terminal-based agents. Teams that want fully asynchronous delegation without monitoring an active shell session tend toward cloud-based agents.
By acquiring Windsurf, Cognition bridged this gap. The Windsurf IDE provides a developer-facing local environment with a native coding experience, while Devin handles task delegation in parallel. This positions Cognition as the only major player offering both an agentic IDE and an autonomous cloud-based software engineering agent within the same company.
Cursor, the code editor from Anysphere that Walden Yan briefly worked on before founding Cognition, had reached $500 million in ARR and a $29.3 billion valuation by November 2025. Cursor's growth illustrated the scale of demand for AI-native development tools, and Cognition's acquisition of Windsurf was partly a response to that competitive dynamic.
Autonomous agents that run unsupervised on cloud infrastructure create distinct failure modes compared to interactive tools. When Devin misunderstands a task or encounters an unexpected environment state, it can continue working in the wrong direction for extended periods before surfacing an error. Answer.AI's 2024 evaluation documented cases where Devin spent significant time pursuing approaches that were structurally blocked rather than stopping to ask for clarification.
The confidence scoring system introduced in Devin 2.1 addresses part of this problem by having Devin express uncertainty upfront and ask questions before proceeding. But the accuracy of the confidence estimates depends on how well Devin has indexed the relevant codebase, and large or poorly documented repositories remain harder for the system to reason about.
Security and compliance are ongoing concerns for enterprise adoption. Cognition offers VPC deployment to keep customer code inside their own network perimeter, but any cloud-based execution model requires customers to grant an external system significant access to their codebase and infrastructure. Organizations in regulated industries have been slower to adopt autonomous agents for this reason.
The compute costs of running multi-step autonomous agents at scale are substantially higher than those of interactive code completion tools. Usage-based billing through ACUs means that a task that takes longer than expected can consume more compute than anticipated, which has been noted as a friction point in enterprise procurement.
SWE-bench scores across the industry rose rapidly throughout 2024 and 2025, with multiple competing systems eventually surpassing Devin's initial 13.86% benchmark by large margins. Claude Code reached 70.3% on SWE-bench Verified, and newer models from Anthropic, OpenAI, and Google continued to push the benchmark higher. Cognition's SWE-1.5 family closed some of that gap, but SWE-bench performance does not directly predict success on the diverse range of tasks that appear in production engineering work.