Factory is an American AI company that builds autonomous software engineering agents for enterprise engineering teams. Founded in April 2023 by Matan Grinberg and Eno Reyes, Factory develops a platform centered on task-specific AI agents called "Droids" that automate work across the full software development lifecycle, from code generation and testing to documentation, incident response, and multi-repository migrations. In April 2026, the company raised a $150 million Series C round led by Khosla Ventures at a post-money valuation of $1.5 billion. Factory's customers include Nvidia, Adobe, Morgan Stanley, Ernst & Young, Palo Alto Networks, Adyen, MongoDB, Bayer, and Zapier.
Matan Grinberg grew up in Israel and attended Princeton University, where he graduated in 2020 with a degree in physics. After Princeton, he began a PhD in theoretical physics at the University of California, Berkeley, working on quantum field theory, general relativity, and string theory. During his time at Berkeley, he became interested in what was then called program synthesis and is now called code generation, taking AI courses alongside his physics work. He eventually concluded that academic physics was not where he wanted to spend his career and began looking for a way to apply his analytical background to AI.
Eno Reyes also attended Princeton, where he wrote a senior thesis on deep learning and conducted research on computational models of human cognition. After Princeton, Reyes worked as a software engineer at Microsoft, where he built anomaly detection systems, and then as a machine learning engineer at Hugging Face, where he helped enterprises research and develop language models.
Grinberg and Reyes knew of each other through Princeton's alumni network but did not meet properly until a LangChain hackathon in San Francisco in early 2023. Despite having approximately 150 mutual friends from their time at Princeton, the two had never been introduced. At the hackathon, they discovered a shared obsession with code generation and autonomous systems. According to Reyes, it was "intellectual love at first sight." They incorporated the company two days after meeting. Reyes left Hugging Face and Grinberg abandoned his Berkeley PhD program, and within eight days both had committed fully to the new venture.
The company was originally incorporated as San Francisco Droid Company. Legal concerns about the Star Wars trademark prompted a name change. The founders settled on Factory, which derives from the equation F(A) = Y, where A represents "actor," a shorthand from machine learning literature for an agent.
Factory launched in April 2023. The founding premise was that the software development lifecycle contained enormous amounts of repetitive, time-consuming work that occupied skilled engineers but did not require genuine creative judgment: code review, documentation updates, test writing, on-call incident triaging, and project tracking. Grinberg and Reyes believed autonomous agents could absorb this work and free engineers to focus on architecture, product decisions, and the parts of engineering that require human insight.
Within six months of formation, Factory had acquired its first enterprise customers, including two publicly traded companies and two decacorns (companies valued at over $10 billion). The company signed these clients before its seed round was publicly announced, which the founders cited as validation of the product thesis.
The company deliberately chose a browser-based and CLI-first platform rather than an IDE integration. Grinberg argued that building around IDE paradigms was analogous to improving a horse-drawn carriage rather than designing a car: the paradigm itself needed to change. In his view, the future of software development would involve less code writing and more planning, validation, and oversight, which required a different kind of interface.
On November 2, 2023, Factory publicly announced a $5 million seed round co-led by Sequoia Capital and Lux Capital, with participation from SV Angel, BoxGroup, and several angel investors including Databricks CEO Ali Ghodsi, Hugging Face co-founder Clement Delangue, and product executive Gokul Rajaram. Shaun Maguire, a partner at Sequoia Capital with a background in physics, led the investment from Sequoia's side. Grinberg had cold-emailed Maguire, knowing of his physics background, and the two had a long conversation before Maguire committed. The seed funding was used to hire the initial team and bring the first Droids to production.
In June 2024, Factory raised a $15 million Series A led by Sequoia Capital, with participation from Lux Capital and Mantis VC. The round valued the company at approximately $120 million. By the time of the Series A, Factory had developed six Droids to production scale: Review, Test, Code, Knowledge, Project, and Document. The Code Droid had achieved a top ranking on SWE-bench, an academic benchmark for AI software engineering, though the company later stopped reporting SWE-bench scores, arguing that the benchmark's Python-only, synthetic task structure did not reflect real enterprise work. The Series A announcement noted that Factory had doubled its customer base month over month in the preceding year.
In September 2025, Factory raised $50 million in a Series B round. The round was led by New Enterprise Associates (NEA) and Sequoia Capital, with participation from Nvidia, J.P. Morgan, Abstract Ventures, and Mantis VC. Notable angels who invested in the round included Frank Slootman (former CEO of Snowflake), Nikesh Arora (CEO of Palo Alto Networks), and Aaron Levie (CEO of Box). The Series B valued Factory at $300 million.
At the time of the Series B, Factory had adopted the name "Droid" for its unified agent, replacing the earlier named-Droid taxonomy. The company reported that Droid had ranked first on Terminal-Bench, a benchmark developed to measure agent performance on realistic command-line tasks, scoring 58.75%. Customers cited in the announcement included EY, MongoDB, Bayer, Zapier, and Clari. The company reported customer-reported outcomes including 31x faster feature delivery, 96% shorter migration times, and 96% reduction in on-call resolution times.
On April 16, 2026, Factory announced a $150 million Series C round led by Khosla Ventures, with participation from Sequoia Capital, Insight Partners, Blackstone, Evantic Capital, Abstract Ventures, 20VC, NEA, and Mantis VC. The round valued Factory at $1.5 billion on a post-money basis, making it a unicorn. Keith Rabois, a managing director at Khosla Ventures who had previously been a partner at Founders Fund, joined Factory's board as part of the investment.
Rabois had publicly articulated a thesis that enterprise AI infrastructure companies solving the integration, compliance, and orchestration problems around autonomous agents represent a more durable category than consumer AI applications, because enterprise buyers evaluate tools based on reliability and integration depth rather than novelty. Factory's focus on large organizations with complex, long-lived codebases aligned with this view.
By the time of the Series C, Factory had doubled revenue month over month for six consecutive months. The company described the preceding two years as "a desert period" of deep technical work on agent orchestration, followed by a dramatic inflection beginning in mid-2025 when enterprise contracts began closing at scale. Factory's total disclosed funding across all rounds reached approximately $220 million.
Factory's primary product is Droid, an autonomous AI software engineering agent designed to operate across the full software development lifecycle. The name was chosen to distinguish Factory's agents from the generic "agent" label, which the founders felt had become overloaded and imprecise, and survives from the company's original incorporation as San Francisco Droid Company.
Droids handle a range of engineering tasks without requiring step-by-step human direction:
| Task category | Description |
|---|---|
| Feature development | Writing code for new features in existing codebases |
| Code review | Reviewing pull requests with contextual understanding of the codebase |
| Documentation | Generating and maintaining technical documentation |
| Test writing | Authoring unit tests, integration tests, and coverage analysis |
| Migrations | Migrating legacy systems to modern stacks, such as Python 2 to Kotlin or COBOL to Java |
| Incident response | On-call triaging, root cause analysis, and remediation |
| Codebase Q&A | Answering technical questions about codebases using Knowledge Droid |
| Refactoring | Modernizing or reorganizing existing code |
Droids operate across multiple interfaces. Developers can use them through the Factory desktop application, the CLI, IDE extensions, Slack, Linear, GitHub, and a browser interface. The platform integrates with common enterprise tools including GitHub, Jira, Sentry, Datadog, PagerDuty, Notion, and Google Drive.
Unlike single-model competitors, Factory's platform is model-agnostic. The platform routes tasks across multiple foundation models depending on the nature of the work. In 2026, Missions used Anthropic's Claude Opus 4.6 for orchestration, Claude Sonnet 4.6 and Claude Opus for implementation, OpenAI's GPT-5.3-Codex for validation, and Kimi K2.5 for research tasks. This routing approach allows Factory to optimize cost and quality independently rather than committing to a single model provider.
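The routing described above amounts to a role-to-model lookup. The sketch below is illustrative, not Factory's actual API: the role names mirror the ones in the text, and the model identifier strings and function name are assumptions.

```python
# Role-based model routing, as described above. The routing table and
# function are an illustrative sketch, not Factory's actual API.

ROUTING_TABLE = {
    "orchestration": "claude-opus-4.6",
    "implementation": "claude-sonnet-4.6",
    "validation": "gpt-5.3-codex",
    "research": "kimi-k2.5",
}

def route_model(task_role: str) -> str:
    """Select a foundation model based on the task's role in the workflow."""
    if task_role not in ROUTING_TABLE:
        raise ValueError(f"unknown task role: {task_role!r}")
    return ROUTING_TABLE[task_role]
```

Keeping the table separate from the lookup is what lets cost and quality be tuned independently: swapping the model behind a role does not touch the workflow logic.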
In early 2026, Factory introduced Missions, a capability that extends Droid from single-session work to long-horizon, multi-day project execution. With Missions, a user describes a business outcome in natural language (for example, "migrate the billing service off legacy Python 2 and onto our new Kotlin microservice") and approves a plan, then Droid handles decomposition, execution, and validation over hours or days.
Missions use a hierarchical orchestration architecture. An orchestrator agent breaks the goal into milestones, each representing a meaningful checkpoint of progress. Each milestone ends with a validation phase in which workers review accumulated work, run tests, check for regressions, and verify integration. When validation surfaces issues, the orchestrator creates follow-up work before advancing. Within each milestone, individual features are assigned to fresh worker sessions with clean context, preventing context degradation from accumulating over a long run.
Different models handle different roles within a Mission. The orchestrator operates on Claude Opus 4.6 for its planning and decomposition capabilities, while implementation workers use lighter models for cost efficiency. The platform applies targeted rather than broad parallelization, running parallel workers where coordination overhead is low and serializing where workers need to share context.
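The milestone loop described above can be sketched as follows. All class and function names are hypothetical, with placeholder bodies standing in for real agent execution and validation.

```python
from dataclasses import dataclass, field

@dataclass
class WorkerSession:
    """A fresh worker session with clean context for a single feature."""
    feature: str
    context: list = field(default_factory=list)  # always starts empty

    def run(self) -> str:
        # Placeholder for real agent execution.
        return f"implemented:{self.feature}"

def validate(result: str) -> bool:
    # Placeholder for the validation phase: run tests, check for
    # regressions, verify integration.
    return result.startswith("implemented:")

def run_mission(milestones: dict) -> list:
    """Hierarchical orchestration: milestones, fresh workers, validation."""
    completed = []
    for milestone, features in milestones.items():
        # Each feature gets a brand-new worker session (clean context),
        # preventing context degradation over a long run.
        results = [WorkerSession(f).run() for f in features]
        # Milestone-ending validation; failures become follow-up work
        # before the orchestrator advances to the next milestone.
        retries = [WorkerSession(f"fix:{r}").run() for r in results if not validate(r)]
        completed.extend(results + retries)
    return completed
```

The key structural point is that worker state never crosses a feature boundary; only the orchestrator's milestone plan persists across the run.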
In practice, the median Mission runs approximately two hours; 65% of Missions run longer than one hour, 37% longer than four hours, and about 14% exceed 24 hours. The longest recorded Mission ran for 16 days. Missions consume roughly 12 times more tokens than standard Droid sessions at the median. Documented Mission use cases include a COBOL-to-Java migration (completed in 33.8 hours), a greenfield Tauri and React desktop application (30 hours), and production incident debugging.
Known limitations of Missions include occasional over-scoping by the orchestrator, worker struggles with edge cases requiring human judgment, and the accumulation of errors in very long runs despite milestone validation checkpoints.
Factory Desktop is the company's native macOS and Windows application for managing Droids. The desktop app provides a unified interface for running multiple Droid sessions simultaneously, monitoring agent progress in real time, injecting context mid-task, and reviewing completed work. It includes multi-agent orchestration visualization, allowing users to see sub-agents spawned during a Mission and their status. The Factory Desktop was positioned as the primary interface for Missions, since multi-day tasks require more persistent monitoring than CLI or IDE extension workflows typically support.
Factory's underlying Droid architecture relies on sophisticated retrieval rather than loading entire codebases into model context windows. For large enterprise repositories spanning multiple years and millions of lines of code, context windows alone cannot capture sufficient codebase knowledge. Factory built retrieval infrastructure that indexes version control history, documentation, incident logs, and issue trackers, allowing Droids to pull relevant context dynamically.
The platform builds a contextual model of the codebase that the company described as equivalent to the onboarding a new human engineer would receive: understanding conventions, architectural patterns, team practices, and the history of past decisions. This contextual grounding is what Factory argues separates enterprise-grade agents from general-purpose coding assistants that operate without institutional knowledge.
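A minimal sketch of this dynamic, multi-source retrieval is below, with a trivial keyword-overlap scorer standing in for Factory's undisclosed retrieval stack; the source names and snippets are invented examples.

```python
# Dynamic multi-source context retrieval, sketched with a trivial
# keyword-overlap scorer. Source names and snippets are invented.

def score(query: str, document: str) -> int:
    """Count query terms that appear in the document."""
    return sum(1 for term in set(query.lower().split()) if term in document.lower())

def retrieve_context(query: str, indexes: dict, k: int = 3) -> list:
    """Pull the k most relevant snippets across all indexed sources."""
    scored = [
        (score(query, doc), source, doc)
        for source, docs in indexes.items()
        for doc in docs
    ]
    scored.sort(reverse=True)  # highest-scoring snippets first
    return [(source, doc) for s, source, doc in scored[:k] if s > 0]

# Example indexes over the kinds of knowledge sources named above.
INDEXES = {
    "git_history": ["commit 4f2a: refactor billing retries"],
    "incident_logs": ["INC-214: billing service timeout during retries"],
    "docs": ["Billing service overview and retry policy"],
}
```

The point of the structure, rather than the toy scorer, is that a single query fans out across heterogeneous indexes (commits, incidents, documentation) and only the highest-relevance snippets enter the model's context window.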
Enterprise deployments support three network topologies: cloud-managed (Factory hosts all infrastructure), hybrid (customer infrastructure for sensitive data, Factory cloud for orchestration), and fully air-gapped (all traffic stays within the customer's network, with models and telemetry collectors hosted entirely on-premise). The air-gapped option was developed for financial services and government customers with strict data residency requirements.
Factory targets large organizations rather than individual developers. The company's thesis is that the highest-value AI software engineering work is not writing greenfield code for solo developers but handling the long-lived, complex, multi-repository codebases that accumulate over decades in large organizations. These codebases contain technical debt, legacy systems, underdocumented components, and interconnected microservices that require deep contextual understanding.
Factory's enterprise governance model uses a hierarchical policy structure. Policies cascade from organization level (global defaults and hard security policies) to project level (repository-specific settings committed to a .factory/ directory) to folder level (subsystem overrides within a repository) to user level (personal preferences where higher levels are silent). Organization- and project-level policies propagate downward: users can opt into stricter controls but cannot weaken organization-level constraints. This hierarchy governs model selection, tool permissions, MCP servers, Droid configurations, autonomy levels, and telemetry destinations.
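The cascade can be sketched as a layered merge. This is a simplification: organization- and project-level keys simply lock out lower-level overrides, and the stricter-only opt-in path is omitted; all policy keys and values are illustrative.

```python
# Layered policy resolution under the cascade described above.
# Simplified sketch: higher-level keys lock out lower-level overrides.
# All policy keys and values are illustrative.

LEVELS = ["organization", "project", "folder", "user"]

def resolve_policy(layers: dict) -> dict:
    """Merge policy layers from organization level down to user level."""
    effective, locked = {}, set()
    for level in LEVELS:
        for key, value in layers.get(level, {}).items():
            if key in locked:
                continue  # lower levels cannot override higher-level policy
            effective[key] = value
            if level in ("organization", "project"):
                locked.add(key)  # hard policy: fixed for all lower levels
    return effective

layers = {
    "organization": {"telemetry_destination": "internal-otel"},
    "project": {"model": "claude-sonnet-4.6"},
    "user": {"model": "gpt-5.3-codex", "theme": "dark"},
}
# The user's model override is ignored (locked at project level), while
# the personal "theme" preference applies because higher levels are silent.
```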
The platform supports SOC 2, ISO 27001, and ISO 42001 compliance reviews. OTEL-native telemetry and audit logging allow compliance teams to trace agent actions and satisfy regulatory review requirements.
Factory uses a fully usage-based pricing model denominated in standard tokens rather than opaque credits. There is a fixed team access fee, per-user fees, and then per-token usage charges that represent the majority of costs for active accounts. The company argued that transparent token-based pricing attracts customers who understand the economics of AI infrastructure, rather than customers attracted by promotional pricing who churn when costs become visible.
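Under this structure a monthly bill decomposes into three terms. The rates below are made-up placeholders, chosen only so that per-token usage dominates for an active account, as the text describes.

```python
# Illustrative monthly bill under the three-part pricing structure
# described above. All rates are made-up placeholders, not Factory's
# actual prices.

def monthly_bill(users: int, tokens_used: int,
                 team_fee: float = 500.0,
                 per_user_fee: float = 20.0,
                 price_per_million_tokens: float = 3.0) -> float:
    """Fixed team access fee + per-user fees + per-token usage charges."""
    usage_charge = tokens_used / 1_000_000 * price_per_million_tokens
    return team_fee + users * per_user_fee + usage_charge

# 50 seats consuming 2B tokens: 500 + 1,000 + 6,000 = 7,500 total,
# with usage accounting for 80% of the bill.
```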
Factory's enterprise customer base as of April 2026 included:
| Customer | Sector | Publicly confirmed |
|---|---|---|
| Nvidia | Semiconductors / AI | Yes |
| Adobe | Software | Yes |
| Morgan Stanley | Financial services | Yes |
| Ernst & Young (EY) | Professional services | Yes |
| Palo Alto Networks | Cybersecurity | Yes |
| Adyen | Payments / Fintech | Yes |
| MongoDB | Database software | Yes |
| Bayer | Life sciences | Yes |
| Zapier | Workflow automation | Yes |
| Clari | Revenue operations software | Yes |
| Bilt Rewards | Fintech | Yes |
The company reported that hundreds of thousands of developers across enterprise accounts used Droids daily as of April 2026. MongoDB CEO Dev Ittycheria publicly endorsed the platform, making it one of the more prominent CEO-level testimonials in the enterprise AI coding market.
Factory operates in a market that includes Cognition AI, Augment Code, Cursor (code editor), GitHub Copilot, and Claude Code. Each takes a different approach to AI-assisted software development.
| Company | Primary product | Target user | Autonomy model | Key differentiator |
|---|---|---|---|---|
| Factory | Droids (autonomous agents) | Enterprise engineering teams | Fully autonomous, long-horizon | Enterprise governance, air-gapped deployment, Missions for multi-day tasks |
| Cognition AI | Devin (AI software engineer) | Enterprise and developer teams | Fully autonomous, cloud sandbox | End-to-end sandboxed environment; acquired Windsurf IDE in 2025 |
| Augment Code | Auggie (AI coding agent) | Large enterprise teams | Human-in-the-loop with autonomous modes | Deep codebase indexing; enterprise security focus |
| Cursor (code editor) | Cursor IDE | Individual developers and teams | Interactive with agent modes | VS Code fork with inline AI; large developer adoption |
| GitHub Copilot | Copilot (IDE assistant) | Individual developers | Suggestion-based | GitHub integration; Microsoft distribution |
Factory and Cognition AI are the closest competitors in terms of product philosophy. Both build fully autonomous agents rather than IDE assistants. Cognition's Devin attracted significant attention in 2024 when it was presented as the first AI software engineer capable of completing end-to-end tasks. Devin operates inside a cloud-hosted sandboxed environment with its own browser and terminal. Factory's Droid operates locally (via Factory Bridge) or in the cloud and integrates directly with the customer's existing development environment rather than requiring a separate sandbox. Cognition acquired Windsurf, a popular AI-powered IDE, in mid-2025, expanding its surface area into the interactive coding market where Cursor had established a strong position.
Augment Code shares Factory's enterprise focus and deep codebase understanding approach. Augment built a proprietary indexing layer to contextualize large codebases, which it positioned against Cursor's reliance on model context windows. Factory's differentiator against Augment is the Missions capability, which handles multi-day autonomous work, and the air-gapped deployment model required by regulated industries.
Cursor became the dominant choice among individual developers and small teams through its VS Code-compatible IDE interface and strong model integrations. By early 2026, Cursor was in use by at least some teams at more than half of Fortune 500 companies. However, Cursor's origins as an interactive code editor meant its agent capabilities were layered on top of a human-in-the-loop paradigm, whereas Factory was built from the start for autonomous execution. NEA's investment thesis for Factory's Series B explicitly cited research showing that IDE-native tools demonstrated only 20-40% productivity gains and, in a METR study, actually slowed developers by 19% on bug fixes and refactors in large repositories, while Factory's infrastructure-level approach showed more substantial improvements on long-horizon enterprise tasks.
Factory, Cognition, and Augment Code all compete for the enterprise AI software development budget that had previously been divided between developer tooling, QA automation, and IT consulting. The market was large enough by 2026 that multiple well-funded companies could operate with different positioning without directly cannibalizing each other.
Migration work was the use case that generated the most dramatic customer-reported ROI. One Factory customer reported reducing a four-month migration project to 3.5 days with zero downtime. The Missions feature was designed in part around migration use cases, since they involve decomposable, verifiable work across large codebases with clear success criteria (the migrated system passes its test suite and performs identically to the original).
A COBOL-to-Java migration documented by Factory completed in 33.8 hours. A Python 2 to Kotlin microservice migration was the canonical example used in the Missions product announcement.
The Reliability Droid, which Factory described as unexpectedly popular with enterprise customers, handles on-call incident triaging. When integrated with PagerDuty, Datadog, and Sentry, the Droid receives incident alerts, retrieves relevant codebase context, identifies probable root causes, proposes or implements fixes, and escalates to human engineers when it cannot resolve the issue. Customers reported 96% reduction in on-call resolution times.
Large engineering organizations generate more pull requests than human reviewers can handle promptly. Factory's Review Droid performs code review with codebase context: it understands architectural conventions, identifies regressions relative to the codebase's history, and flags code that conflicts with documented patterns. Unlike general-purpose AI code reviewers that operate on a single diff, the Review Droid can cross-reference the proposed change against multiple related repositories.
Technical documentation is chronically neglected in engineering organizations because it generates no immediate deliverable and falls below product work in priority. Factory's Document Droid generates and maintains documentation automatically as code changes, inferring documentation updates from diffs and keeping knowledge bases current without requiring developer action.
The Test Droid authors tests for code that has been written without adequate coverage. In organizations undergoing migrations or modernizations, this allows teams to add test coverage to legacy code before modifying it, reducing the risk of regressions introduced by subsequent changes.
Factory attracted positive coverage from investors and technology press following each funding round. The Sequoia Capital partnership blog post emphasized the company's speed from founding to first enterprise customers and the founders' technical depth. NEA's investment memo published at Series B cited 200% quarter-over-quarter growth in 2025 and the Terminal-Bench ranking as evidence of product quality.
CEO Matan Grinberg was quoted in McKinsey's technology interview series, where he described the company's position: "Agents will not replace developers, but developers who are fluent with agents will rapidly outleverage and outpace developers who are not." The framing positioned Factory's product as augmentation rather than replacement, a common positioning across the enterprise AI coding market.
The Latent Space AI podcast featured a detailed technical interview with the founders, which became one of the more-cited discussions of agent-native development architecture in the developer community in 2025.
Following the April 2026 Series C announcement, multiple technology publications noted that Factory's $1.5 billion valuation placed it among a small group of enterprise AI companies that had crossed the unicorn threshold without building or fine-tuning foundation models. The valuation was seen as evidence that the application layer of the AI stack could capture significant value even as underlying model costs continued to fall.
Factory acknowledges several areas where its platform falls short of fully autonomous software engineering.
Long-horizon agentic behavior remains technically immature across the industry. While Missions can run for hours or days, the error accumulation problem grows as session length increases. The orchestrator occasionally over-scopes plans, and individual workers can fail on edge cases that require judgment calls that humans make implicitly. Factory described Missions as "early" in its product documentation and noted that fundamental questions in multi-day agent architecture were still being worked through.
Observability in enterprise environments presents a structural conflict. Enterprise security policies often prohibit sending code or diffs outside the corporate network, which limits the telemetry that Factory can collect. The company acknowledged a gap in "semantic observability": understanding whether agent output matches the subjective quality preferences of the engineering team, beyond binary metrics like test passage. Current tooling supports audit logging and compliance reporting but cannot capture the kind of qualitative feedback that engineers use when reviewing human-written code.
Model training creates tool bias. Models that undergo heavy reinforcement-learning post-training (such as Claude 3.7 Sonnet and variants of GPT-5.3-Codex) embed preferences for specific CLI tools, such as favoring grep over more sophisticated retrieval approaches. Factory's agents must sometimes work around these embedded preferences to use superior tools that the model would not naturally reach for, which requires additional prompt engineering and can reduce reliability.
Benchmark transparency has been a point of friction with parts of the developer community. Factory stopped reporting SWE-bench results after initially performing well on the benchmark, arguing that SWE-bench's Python-only, synthetic task structure does not reflect enterprise workloads. Critics noted that competitors including Cursor, Cognition AI, and Anthropic all publish SWE-bench results, and argued that Factory's withdrawal suggested its advantage lies primarily in orchestration and integration rather than raw coding capability. Factory maintained that SWE-bench measures a narrow form of autonomous problem-solving with limited overlap with the migration, governance, and multi-repository work that generates ROI for enterprise customers.