Deep Cogito
Last reviewed
Jun 8, 2026
Sources
12 citations
Review status
Source-backed
Revision
v1 · 1,730 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
Jun 8, 2026
Sources
12 citations
Review status
Source-backed
Revision
v1 · 1,730 words
Add missing citations, update stale details, or suggest a clearer explanation.
Deep Cogito is a San Francisco artificial intelligence research lab that develops open-weight large language models under the Cogito name. Its models are "hybrid reasoning" systems: a single model can answer a query directly, like a standard chatbot, or first engage an extended self-reflection mode, like a dedicated reasoning model, with the behavior selectable at inference time [1][2]. The company's central research thesis is a training loop it calls Iterated Distillation and Amplification (IDA), in which a model improves itself by searching and reasoning to reach better answers (amplification) and then folding those reasoning paths back into its own weights (distillation), so that capabilities that once required slow search become fast "intuition." Deep Cogito frames IDA as a scalable path toward general superintelligence [3][4]. The lab emerged from stealth on April 8, 2025, and is backed by a USD 13 million seed round led by Benchmark [1][5].
Deep Cogito (legally Deep Cogito Inc.) positions itself as a frontier research lab whose near-term output is a family of openly released models that compete with the strongest open systems from Meta's Llama, Alibaba's Qwen, and DeepSeek, while its long-term goal is "general superintelligence." The company argues that superintelligence is "fundamentally a tractable machine learning problem" that requires first building a training recipe capable of unbounded self-improvement and then scaling it with compute [3][4]. Rather than keep its work proprietary, Deep Cogito ships its models with downloadable weights on Hugging Face and through inference partners, and publishes its methodology on its research blog [1][3].
The Cogito models are distinctive for two design choices. First, they are hybrid: the same checkpoint can run in a fast "standard" mode or a deliberate "reasoning" mode that exposes visible thinking, avoiding the need to maintain separate reasoning and non-reasoning models [1][2]. Second, the reasoning behavior is shaped by IDA so that, the company reports, the models reach answers using markedly shorter chains of thought than comparable reasoning models, which lowers inference cost [3][6].
Deep Cogito was founded in June 2024 and operated in stealth for roughly ten months before its public launch on April 8, 2025 [1][2]. The two named co-founders are:
The starting description of the founders as generically "ex-Google" is broadly correct but imprecise: both founders came from Google, with Arora as an LLM engineer and Malhotra from DeepMind. The company is a resident of South Park Commons, the San Francisco founder community and early-stage backer, and is headquartered in San Francisco [8]. Arora has said the first Cogito models were built by a small team over a development window of roughly 75 days, underscoring the lab's claim that its approach is unusually compute- and capital-efficient [1].
Deep Cogito's technical identity rests on two connected ideas.
Hybrid reasoning. Each Cogito model can operate as a direct-answer language model or, when prompted to, produce an explicit chain of reasoning before responding. This lets a deployment spend few tokens on easy questions and more tokens on hard ones without swapping models, and the models are tuned for tool or function calling, coding, and agentic use [1][2].
Iterated Distillation and Amplification (IDA). IDA is the company's self-improvement loop and the reason it gives for pursuing superintelligence as a scalable training problem rather than a matter of ever-larger inference-time search. The loop has two repeating steps [3][4]:
Over successive iterations, Deep Cogito argues, the model learns which lines of thinking actually matter and develops a stronger "intuition" for the right trajectory to follow, so that behavior which once required long search becomes available quickly. Arora summarized the payoff this way: "Since the Cogito models develop a better intuition of the trajectory to take while searching at inference time, they have 60% shorter reasoning chains than Deepseek R1" [3][6]. The lab presents IDA as conceptually related to the way systems such as AlphaGo combined search with a learned policy, but applied to general language models and aimed at removing human-written supervision from the improvement loop [3][4].
Deep Cogito has shipped three model generations, all released openly for commercial use [1][3][9].
| Release | Date | Models and sizes | Base families | Notes |
|---|---|---|---|---|
| Cogito v1 preview | Apr 8, 2025 | 3B, 8B, 14B, 32B, 70B (dense) | Llama and Qwen | First hybrid models; smaller sizes on Qwen, larger on Llama [1][2] |
| Cogito v2 preview | Jul 31, 2025 | 70B dense, 109B MoE, 405B dense, 671B MoE | Llama (smaller) and DeepSeek (671B MoE) | Adds very large models; 671B is the flagship [3][9] |
| Cogito v2.1 | Nov 18, 2025 | 671B MoE (37B active) | DeepSeek-derived | Updated flagship, pitched as the best US open-weight LLM [7][10] |
Cogito v1 (April 2025). The first release was a preview family at 3B, 8B, 14B, 32B, and 70B parameters, built by post-training open base models from Llama and Qwen. Deep Cogito claimed each Cogito model outperformed the best openly available model of the same size, including Llama, DeepSeek, and Qwen counterparts, on most standard benchmarks, and that the 70B model with reasoning enabled beat DeepSeek's R1 on math and language evaluations while the same model without reasoning exceeded Meta's Llama 4 Scout on LiveBench [1][2]. As with all benchmark claims here, these are the company's own reported figures. The models were distributed on Hugging Face and Ollama and made available through Together AI and Fireworks AI [1].
Cogito v2 (July 2025). On July 31, 2025, Deep Cogito released four larger hybrid models: a 70B dense model, a 109B mixture-of-experts model, a 405B dense model, and a 671B MoE flagship. The largest model is built on a DeepSeek base, reflected in its Hugging Face identifier deepcogito/cogito-v2-preview-deepseek-671B-MoE, while the smaller models build on Llama [9][11]. Deep Cogito reported that the 671B MoE matched or exceeded DeepSeek v3 and the DeepSeek R1 0528 model on several evaluations and approached closed frontier systems such as OpenAI's o3 and Anthropic's Claude 4 Opus, while using about 60% shorter reasoning chains than DeepSeek R1 [3][6]. The company also stated that the entire set of eight models across v1 and v2 was trained for a combined cost of under USD 3.5 million, a figure it uses to argue that IDA is far cheaper than brute-force scaling [3][6][9].
Cogito v2.1 (November 2025). On November 18, 2025, Deep Cogito released an updated 671B MoE model, Cogito v2.1, with 37 billion parameters active per forward pass. Arora described it as "the best open-weight LLM by a US company," competitive with frontier closed and open models on industry benchmarks while ahead of other US open models [7][10]. Reported scores included 98.57% on MATH-500, 89.47% on AIME 2025, 84.69% on MMLU-Pro, and 77.72% on GPQA Diamond, again using fewer reasoning tokens than similarly capable models [10]. The model is available on Hugging Face and through inference providers including OpenRouter, Fireworks AI, Together AI, Ollama, Baseten, and RunPod [10].
Deep Cogito raised a USD 13 million seed round led by Benchmark, the Silicon Valley venture firm, with the financing reported in early August 2025 [5][12]. South Park Commons, where the company is resident, is also associated with its early backing [8]. The starting description's core funding claim is therefore accurate, though the timing is worth clarifying: although Deep Cogito launched publicly in April 2025, the seed round was disclosed several months later, around August 3, 2025 [5][12]. The company has said it will use the capital to expand its research and development efforts [12].
Deep Cogito is notable as part of a wave of well-funded research labs trying to advance open-weight models rather than concede the frontier to closed providers. Its specific contributions are threefold. First, it pushed hybrid reasoning, where one model toggles between fast answers and deliberate thinking, into very large open models, up to a 671B-parameter mixture-of-experts system released with downloadable weights [3][9]. Second, it advanced a concrete self-improvement recipe, IDA, that reframes reasoning not as ever-longer inference-time search but as something to be distilled into model weights as "intuition," yielding the lab's headline claim of roughly 60% shorter reasoning chains at comparable quality [3][6]. Third, it made an aggressive efficiency argument, training its full model lineup for under USD 3.5 million, that challenges the assumption that frontier-adjacent open models require enormous budgets [3][6].
The lab competes in a crowded open-model landscape that includes DeepSeek, Qwen, Meta's Llama, and other independent groups such as Nous Research and Prime Intellect, as well as the closed frontier labs whose models it benchmarks against [1][3]. Independent verification of Deep Cogito's benchmark and cost claims remains limited, so its self-reported results should be read with that caveat. Even so, the release of openly licensed hybrid-reasoning models at the 671B scale, paired with an explicit "machine intuition" self-improvement thesis aimed at general superintelligence, has made Deep Cogito a closely watched entrant among open-model developers [3][7].