Imbue

Updated Jul 16, 2026

Imbue is a San Francisco-based artificial intelligence research lab focused on training foundation models and building agent systems oriented toward reasoning and software engineering. The company was co-founded by Kanjun Qiu (chief executive officer) and Josh Albrecht (chief technology officer) under the name Generally Intelligent and rebranded to Imbue in September 2023 in connection with a $200 million Series B round that valued the lab at more than $1 billion.^[1]^[2]^[3] Imbue trains custom large language models, with publicly disclosed work on a 70B-parameter model trained on a 4,088-GPU NVIDIA H100 cluster, and has released a body of open-source infrastructure, evaluation datasets, and a hyperparameter optimizer based on Bayesian optimization, named CARBS.^[4]^[5]^[6] Beyond model training, Imbue has shipped developer tools including Sculptor, an interface for running parallel coding agents in isolated containers, and continues to publish research on the theoretical underpinnings of deep learning.^[7]^[8]

Company overview

Item	Detail
Legal name	Imbue, Inc.
Prior name	Generally Intelligent (2021 to September 2023)
Co-founders	Kanjun Qiu (CEO), Josh Albrecht (CTO)
Headquarters	San Francisco, California
Founded	2021
Renamed	September 2023
Series B (initial)	$200 million, announced September 7, 2023
Series B (extension)	Additional $12 million, October 23, 2023
Reported valuation	Over $1 billion (Series B)
Lead investor	Astera Institute
Notable Series B investors	Nvidia, Kyle Vogt, Simon Last, Drew Houston, Notion's Akshay Kothari, Amazon Alexa Fund, Eric Schmidt, Tom Brown
Compute partners	Voltage Park, Dell, Nvidia (4,088 H100 GPU cluster)
Mission focus	Foundation models for reasoning, code, and agents
Open-source releases	cluster-health, carbs, sculptor, evaluation datasets

History

Origins as Generally Intelligent

The company was founded in 2021 by Kanjun Qiu and Josh Albrecht, who had previously collaborated on Sourceress, a machine-learning recruiting startup that participated in Y Combinator and raised approximately $13 million in venture capital before winding down operations.^[9] Earlier still, Qiu and Albrecht co-founded Ember Hardware, which worked on laser-based projection displays intended for virtual reality.^[10] Qiu's professional background also included a stint as the first chief of staff at Dropbox during a period of rapid scaling from roughly 200 to 1,200 employees, an experience she has described as formative for thinking about institutional design and operational coordination. She studied computer science at MIT, where she also worked as a graduate researcher at the MIT Media Lab and supported her tuition by writing high-frequency trading algorithms. Albrecht had previously co-founded BitBlinder, a privacy-focused peer-to-peer tool, and CloudFab, a 3D printing services company; the two founders have also co-hosted the Generally Intelligent research podcast since late 2020 and jointly run Outset Capital, a small early-stage investment vehicle for technical founders.^[10]^[11]

Generally Intelligent emerged from stealth on October 20, 2022 with an initial financing of approximately $20 million plus more than $100 million in optioned drawdown commitments structured as a multi-year capital agreement. The company described its mission as developing "generally capable agents with human-like intelligence" that could be safely deployed in the real world. Investors disclosed at the time included Tom Brown, the former GPT-3 engineering lead at OpenAI; Jonas Schneider, the former OpenAI robotics lead; Dropbox co-founders Drew Houston and Arash Ferdowsi; the Astera Institute, a nonprofit research funder associated with Jed McCaleb; and Tim Hanson, a founding member of the Neuralink team who joined the Generally Intelligent board.^[10] Initial public coverage emphasized the lab's deliberate avoidance of immediate productization in favor of long-horizon research on what the founders described as the cognitive components missing from contemporary models.^[10]

Before the rebrand, Generally Intelligent was best known publicly for two assets. The first was the Generally Intelligent podcast, which Qiu and Albrecht launched in December 2020 as a long-form interview show "by deep learning researchers, for deep learning researchers," featuring guests including Percy Liang of Stanford, Chelsea Finn of Stanford and Google, Tri Dao of Princeton, Jamie Simon of UC Berkeley, and Rylan Schaeffer of Stanford; the show has been widely cited within the machine learning community as one of the more technically detailed interview programs covering training dynamics and theory.^[11] The second was Avalon, a procedurally generated 3D reinforcement learning benchmark presented at NeurIPS 2022 in the Datasets and Benchmarks track, where embodied agents face survival tasks such as climbing, jumping, hunting, and foraging in worlds that share a common reward function and action space across 20 task families.^[12]^[13] The Avalon paper, with Albrecht and Qiu as senior authors, framed the benchmark as a way to study sample efficiency and generalization for reinforcement learning in environments resembling the kinds of physical and ecological challenges that shaped the evolution of biological intelligence.^[12]

Rebrand and Series B

On September 7, 2023, Generally Intelligent announced a $200 million Series B funding round and a corresponding rebrand to Imbue.^[1]^[2]^[3] The round was led by the Astera Institute, with participation from Nvidia, Cruise co-founder and former chief executive Kyle Vogt, and Notion co-founder Simon Last; additional Series B participants disclosed in later coverage included Notion co-founder and chief operating officer Akshay Kothari and previously involved angel investors.^[1]^[3] The round took total funding to approximately $220 million and assigned a valuation in excess of $1 billion, placing Imbue among the AI "unicorns" of 2023 alongside Adept AI, Inflection AI, and other applied agent companies.^[2]^[3]

In a blog post accompanying the announcement, Qiu wrote that the new name reflected a desire to focus on "imbuing computers with the ability to reason," and reiterated that the company would not seek to compete with frontier model providers on raw scale but would instead train smaller specialized models around reasoning capability for agents.^[1] TechCrunch coverage described the rebrand as a deliberate departure from the more research-oriented Generally Intelligent identity, with Imbue framing its commercial vision around AI systems that could write, debug, and operate software autonomously on behalf of end users.^[1]

On October 23, 2023, Imbue disclosed an extension to the Series B in which the Amazon Alexa Fund and former Google chief executive Eric Schmidt contributed an additional $12 million, bringing the total Series B above $210 million.^[14] The extension did not change the headline valuation but signaled additional industry interest from cloud and former big-tech principals.

Compute build-out and 70B project (2023 to 2024)

In parallel with the funding announcement, Imbue disclosed a multi-year partnership with Dell Technologies to deploy a high-performance computing system valued at approximately $150 million, built around NVIDIA H100 GPU accelerators.^[4] Reporting at the time and Imbue's later technical writeups confirmed that the cluster contained approximately 10,000 H100 GPUs in aggregate capacity, with the lab's primary training partition consisting of 4,088 H100 GPUs spread across 511 hosts (eight GPUs per node) connected by a three-tier InfiniBand fabric.^[4]^[5] Hosting and physical infrastructure were provided by Voltage Park, a bare-metal H100 provider associated with Astera Institute, with additional cooperation from Dell and Nvidia on hardware and firmware.^[5]^[15]

Across late 2023 and 2024, Imbue used the cluster to pre-train a 70B-parameter large language model from scratch on roughly two trillion tokens, with the goal of producing a base model that could then be fine-tuned for multi-step reasoning and code understanding.^[6] On June 25, 2024 the lab released a four-part technical series describing the project, alongside several open-source repositories, and Imbue CTO Josh Albrecht discussed the work in a Latent Space podcast episode co-hosted by Swyx and Databricks chief AI scientist Jonathan Frankle.^[15]^[16]

Product evolution (2024 to 2026)

Following the 70B project, Imbue's public output shifted toward developer tooling. In April 2025 the company released Sculptor, a research preview of a desktop application that orchestrates multiple parallel coding agents, each running in its own container with an isolated copy of the working repository.^[7]^[17] Sculptor was re-released as a more polished application on September 26, 2025 after a ground-up rebuild that integrated Claude Code for the underlying agent reasoning and introduced a "Pairing Mode" that syncs container state to the developer's local environment.^[7]^[17] By 2026 Imbue's product lineup also included mngr, a parallel agent runner; Bouncer, a Twitter (X) feed filtering tool; and several smaller open-source utilities (Blueprint, Vet, Latchkey) published on the company's GitHub organizations.^[8] In early 2026 the lab also published research blog posts titled "Beating ARC-AGI-2 with Code Evolution" and "There Will Be a Scientific Theory of Deep Learning," signaling a continued investment in fundamental research alongside applied work on agents.^[8]

Throughout 2025 and 2026, Imbue's company-facing communications also emphasized policy positions, including the role of individual developers in maintaining bargaining power against centralized AI providers; the company's stated philosophy that "your tools should work for you, not the company that built them" was reflected in choices such as bring-your-own Claude API keys for Sculptor and a preference for MIT-licensed tooling.^[7]^[8]

Mission and research thesis

Imbue describes its mission as building "AI that works for humans, not the company that built them," with a stated emphasis on amplifying individual human agency rather than wholesale automation.^[8] The technical thesis behind the company is that reasoning, not raw scale or knowledge retrieval, is the dominant bottleneck preventing today's AI agents from completing long-horizon tasks reliably.

In a 2023 conversation with the Latent Space podcast, Qiu argued that contemporary large language models lack explicit reasoning data during pre-training and that capable agents will require optimization of reasoning skills as a first-class objective rather than a fine-tuning afterthought.^[18] The Imbue team has framed this as a "full-stack" research agenda spanning four threads:^[8]

Foundation models specialized for reasoning rather than general dialogue.
Experimental agents that exercise those models on realistic tasks (originally embodied agents in Avalon; later, code-writing agents).
Tools and infrastructure to make agents reliable in production environments.
Theory, including projects on scaling laws, hyperparameter prediction, and the empirical science of training stability.

This thesis has shaped a deliberate decision not to pursue a frontier generalist model along the lines of GPT-4 or Claude, but instead to train smaller, specialized models that can be combined with explicit reasoning scaffolds, evaluation harnesses, and product surfaces tailored to specific workflows such as code editing.^[1]^[18] In several public talks, Qiu has contrasted this approach with the "more data plus more parameters" strategy of frontier laboratories, arguing that scaling alone is unlikely to close the gap between current chatbots and reliable long-horizon agents without changes to how reasoning is trained and evaluated.^[18]

Infrastructure and the 70B project

The most substantial public artifact of Imbue's work is a series of four technical posts published in mid-2024 describing the engineering required to train a 70B-parameter model from bare metal on a fresh H100 cluster.^[5]^[6]^[19] The series, written by "about a dozen engineers and researchers" at the lab, covered hardware bring-up, evaluations, hyperparameter optimization, and an overview of the resulting model.^[6]

Cluster topology

The primary training cluster consisted of 4,088 NVIDIA H100 GPU accelerators distributed across 511 host machines, with eight H100 cards per host.^[5] Each GPU was connected to a Mellanox ConnectX-7 InfiniBand network interface card capable of 400 gigabits per second of simultaneous send and receive throughput, with the interconnect organized as a three-tier "fully non-blocking" fat-tree fabric using roughly 320 InfiniBand switches and approximately 12,000 cables.^[5] A separate 100 gigabit per second Ethernet network was used for dataset and checkpoint traffic, and a third management Ethernet network exposed BIOS, baseboard management controller (BMC), and power supply controls for cluster-wide automation.^[5]
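The headline figures in this topology are simple products of the per-host numbers. The sketch below reproduces the GPU count from the host count and derives the sum of per-GPU NIC line rates; the function and constant names are illustrative, and the aggregate figure is a nominal upper bound in one direction rather than a throughput reported by Imbue.

```python
# Figures quoted in the text above.
HOSTS = 511
GPUS_PER_HOST = 8
NIC_GBPS = 400  # per-GPU InfiniBand NIC line rate, gigabits per second


def total_gpus(hosts: int = HOSTS, gpus_per_host: int = GPUS_PER_HOST) -> int:
    """Number of GPUs in the training partition."""
    return hosts * gpus_per_host


def aggregate_nic_tbps(gpus: int, nic_gbps: int = NIC_GBPS) -> float:
    """Sum of per-GPU NIC line rates in terabits per second (one direction)."""
    return gpus * nic_gbps / 1000


gpus = total_gpus()
print(gpus)                      # 4088
print(aggregate_nic_tbps(gpus))  # 1635.2
```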

The cluster was hosted at Voltage Park facilities, with Dell supplying server and rack hardware and Nvidia providing optimization assistance on NCCL (NVIDIA Collective Communications Library) and InfiniBand UFM management.^[5]^[15]

Bring-up and cluster-health tooling

The bare-metal bring-up phase was the focus of one of the most widely circulated entries in the series, which Imbue paired with the open-source release of the imbue-ai/cluster-health repository under the MIT License.^[5]^[20] The repository collects scripts and utilities developed during commissioning, including:^[20]

A GPU stress-test harness that validates tensor allocation and standard operations on every device.
Host-validation scripts that probe NVLink (intra-host) and InfiniBand (inter-host) connectivity by running NCCL collectives across pairs and groups of GPUs.
Automated health checks that flag unhealthy hosts and apply remediations to recurring failure modes.
A UFM event-log parser that identifies misbehaving InfiniBand ports for deactivation.
An InfiniBand "burn-in" workload (ib_burn) that exercises every link in the fabric simultaneously.
A fork of NCCL with additional logging to expose the underlying causes of collective operation hangs and stalls.
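The general pattern behind automated health checks of this kind is to run a fixed set of probes on every host and separate the hosts that pass everything from those that need remediation. The following is a minimal sketch of that triage step only. The class, function, and check names are hypothetical and are not taken from the cluster-health repository, which should be consulted for the actual scripts.

```python
from dataclasses import dataclass, field


@dataclass
class HostReport:
    """Results of the probes run on one host (check name -> passed?)."""
    host: str
    checks: dict[str, bool] = field(default_factory=dict)


def triage(reports: list[HostReport]) -> tuple[list[str], dict[str, list[str]]]:
    """Split hosts into healthy ones and unhealthy ones with their failed checks."""
    healthy: list[str] = []
    unhealthy: dict[str, list[str]] = {}
    for report in reports:
        failed = sorted(name for name, ok in report.checks.items() if not ok)
        if failed:
            unhealthy[report.host] = failed
        else:
            healthy.append(report.host)
    return healthy, unhealthy


# Hypothetical example: one host fails its inter-host connectivity probe.
reports = [
    HostReport("host-001", {"gpu_stress": True, "nvlink": True, "infiniband": True}),
    HostReport("host-002", {"gpu_stress": True, "nvlink": True, "infiniband": False}),
]
healthy, unhealthy = triage(reports)
print(healthy)    # ['host-001']
print(unhealthy)  # {'host-002': ['infiniband']}
```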

The blog post accompanying the release reported that the team observed roughly 3% of hosts failing per week during initial training, and that garbage-collection desynchronization across distributed workers was a persistent source of throughput regression.^[15] Imbue argued that direct partnerships with hardware vendors (rather than running on standard cloud images) had been load-bearing for reducing mean time to repair.^[15]^[16]
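To see why a failure rate of this size matters at cluster scale, the quoted 3% figure can be turned into an expected number of failing hosts and a chance of a failure-free week. This sketch assumes that 3% is an independent per-host weekly failure probability, which is a simplification introduced here and not a claim made in the source.

```python
HOSTS = 511          # hosts in the training partition, from the text above
WEEKLY_RATE = 0.03   # "roughly 3% of hosts failing per week"


def expected_failures(hosts: int, weekly_rate: float) -> float:
    """Expected number of hosts that fail in one week."""
    return hosts * weekly_rate


def prob_no_failure(hosts: int, weekly_rate: float) -> float:
    """Probability that no host fails in a week, assuming independent failures."""
    return (1.0 - weekly_rate) ** hosts


print(round(expected_failures(HOSTS, WEEKLY_RATE), 2))  # 15.33
print(prob_no_failure(HOSTS, WEEKLY_RATE) < 1e-6)       # True
```

Under these assumptions roughly fifteen hosts fail in a typical week and a week without any failure is vanishingly unlikely, which is consistent with the emphasis on automated detection and remediation.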

CARBS hyperparameter optimizer

A second open-source release was the cost-aware hyperparameter tuning algorithm CARBS (Cost-Aware Pareto-Region Bayesian Search), available at imbue-ai/carbs under the MIT License and described in a research paper.^[21]^[22] CARBS models both task performance and a cost metric (typically GPU-hours) as Gaussian Processes and performs Bayesian Optimization in the neighborhood of the empirically observed performance-cost Pareto frontier, rather than over a fixed global search space.^[22] As a side effect of this local search, the algorithm produces a learned scaling relationship for every hyperparameter as a function of compute, which can be extrapolated to predict tuned values at larger model sizes.^[22]
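The performance-cost Pareto frontier referred to above is the set of runs that no other run beats on both cost and performance. The sketch below computes that frontier from a list of observed runs. It illustrates only the frontier itself; it does not use the carbs library, and it omits the Gaussian-process models and the search around the frontier that the paper describes. The run data are made up for illustration.

```python
def pareto_front(runs: list[tuple[float, float]]) -> list[tuple[float, float]]:
    """Return the (cost, performance) points not dominated by any other run.

    Lower cost is better and higher performance is better. A run is kept only
    if it performs strictly better than every run that costs no more than it.
    """
    front: list[tuple[float, float]] = []
    best_performance = float("-inf")
    # Visit runs from cheapest to most expensive; at equal cost, best first.
    for cost, performance in sorted(runs, key=lambda r: (r[0], -r[1])):
        if performance > best_performance:
            front.append((cost, performance))
            best_performance = performance
    return front


# Hypothetical observations: (GPU-hours, task score).
runs = [(1, 0.5), (2, 0.4), (2, 0.7), (3, 0.7), (4, 0.9)]
print(pareto_front(runs))  # [(1, 0.5), (2, 0.7), (4, 0.9)]
```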

Reported results include:

Reproducing the Chinchilla scaling laws (α ≈ 0.5, β ≈ 1.0) across 340 training runs while simultaneously tuning 19 hyperparameters, using less aggregate compute than the original DeepMind study, which required training 50 models up to 16 billion parameters.^[22]
On OpenAI's ProcGen reinforcement learning suite, achieving the same task return as a previous state-of-the-art tuned PPO baseline using approximately one quarter of the compute, and exceeding the prior state of the art by more than 16% on the full benchmark.^[22]
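The exponents quoted in the first result are not defined in the text above. One reading that is consistent with the published Chinchilla analysis, offered here as an assumption, takes α as the exponent relating compute-optimal parameter count to compute and β as the exponent relating compute-optimal training tokens to parameter count. Using the common approximation that training compute is about six FLOPs per parameter per token:

```latex
\[
  N_{\mathrm{opt}} \propto C^{a}, \qquad
  D_{\mathrm{opt}} \propto C^{b}, \qquad
  C \approx 6\,N D
  \;\Longrightarrow\; a + b = 1 .
\]
\[
  a \approx b \approx 0.5
  \;\Longrightarrow\;
  D_{\mathrm{opt}} \propto N_{\mathrm{opt}}^{\,b/a} \approx N_{\mathrm{opt}}^{1.0} .
\]
```

Here N is the parameter count, D the number of training tokens, and C the training compute. Under this reading, α ≈ 0.5 corresponds to a and β ≈ 1.0 to the ratio b/a, meaning that tokens and parameters are scaled up in equal proportion.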

CARBS was the optimizer Imbue used to extrapolate hyperparameter choices from smaller proxy models up to the 70B scale, allowing the team to proceed to full-scale training without an exhaustive search at the target size.^[19]^[22]
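Extrapolating a tuned hyperparameter from small proxy runs to a larger budget can be illustrated with a power-law fit, which is a straight line in log-log space. The sketch below fits such a line by ordinary least squares and evaluates it at a larger cost. It is a simplified stand-in for the scaling relationships CARBS learns, not the method or API of the carbs library, and the data points are synthetic.

```python
import math


def fit_power_law(costs: list[float], values: list[float]) -> tuple[float, float]:
    """Fit value = a * cost**b by least squares on (log cost, log value)."""
    xs = [math.log(c) for c in costs]
    ys = [math.log(v) for v in values]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = math.exp(mean_y - b * mean_x)
    return a, b


def predict(a: float, b: float, cost: float) -> float:
    """Evaluate the fitted power law at a new cost."""
    return a * cost ** b


# Synthetic example: a learning rate that shrinks as compute grows.
costs = [1.0, 10.0, 100.0, 1000.0]          # arbitrary compute units
tuned = [0.1 * c ** -0.25 for c in costs]   # made-up tuned values
a, b = fit_power_law(costs, tuned)
print(round(a, 6), round(b, 6))             # 0.1 -0.25
print(predict(a, b, 1e6))                   # extrapolated value at a larger budget
```

Extrapolation of this kind is only as reliable as the assumption that the trend observed at small scale continues to hold at the target scale.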

The 70B model and evaluations

Imbue pre-trained its custom 70B-parameter base model on approximately two trillion tokens, then fine-tuned it on a suite of reasoning and code-comprehension tasks.^[6] The reported architecture uses grouped-query attention, SwiGLU activations, RMS normalization, and a custom tokenizer, with the base model design borrowing structural choices from the Llama 3 family.^[6] On a battery of multiple-choice reasoning benchmarks (drawn from cleaned versions of 11 public NLP evaluations plus an internal code-comprehension set), Imbue reported that its fine-tuned 70B model outperformed zero-shot GPT-4o on the multiple-choice sub-tasks and approached the performance of a fine-tuned Llama 3 70B model trained on more than seven times as much pre-training data.^[6]
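Two of the architectural components named above have compact standard definitions. The sketch below implements RMS normalization and a SwiGLU feed-forward layer in plain Python for a single vector. It follows the textbook formulations rather than Imbue's code, whose dimensions and implementation details are not given here, and it leaves out attention entirely.

```python
import math


def rms_norm(x: list[float], gain: list[float], eps: float = 1e-6) -> list[float]:
    """Scale x by the reciprocal of its root mean square, then by a learned gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for v, g in zip(x, gain)]


def silu(z: float) -> float:
    """SiLU (swish) activation: z * sigmoid(z)."""
    return z / (1.0 + math.exp(-z))


def matvec(w: list[list[float]], x: list[float]) -> list[float]:
    """Multiply matrix w (a list of rows) by vector x."""
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in w]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward layer: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(z) for z in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


# Tiny illustrative weights (identity projections, summing output).
identity = [[1.0, 0.0], [0.0, 1.0]]
normed = rms_norm([3.0, 4.0], gain=[1.0, 1.0])
print(normed)
print(swiglu_ffn([1.0, 0.0], identity, identity, [[1.0, 1.0]]))
```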

Crucially, Imbue described the training run as proceeding with "minimal training instability and no loss spikes," a result the team attributed to CARBS-derived hyperparameter selection rather than reactive curve fixes during training.^[6] In the public discussion of the project, Albrecht highlighted that this combination of cluster-health automation and CARBS-derived hyperparameters meant the team could run an unusually high-stakes single training job (a multi-week run at full cluster utilization) without the kind of multi-restart history that has been common in other open accounts of training pipelines at the same scale.^[15]^[16]
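A rough sense of the scale of this run follows from the parameter and token counts reported above. The sketch below applies the widely used rule of thumb of about six floating-point operations per parameter per training token; that rule is an approximation introduced here, not a figure from Imbue's posts, and the result should be read as an order-of-magnitude estimate.

```python
PARAMS = 70e9   # 70B parameters, from the text above
TOKENS = 2e12   # approximately two trillion training tokens


def training_flops(params: float, tokens: float) -> float:
    """Approximate training compute using the ~6 FLOPs/parameter/token rule."""
    return 6.0 * params * tokens


def tokens_per_parameter(params: float, tokens: float) -> float:
    """Ratio of training tokens to model parameters."""
    return tokens / params


print(f"{training_flops(PARAMS, TOKENS):.2e}")            # 8.40e+23
print(round(tokens_per_parameter(PARAMS, TOKENS), 1))     # 28.6
```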

Alongside the model results, the lab released:^[6]

Cleaned, deduplicated versions of 11 widely used reasoning benchmarks, repaired for ambiguity and label errors.
A new code-focused multiple-choice reasoning dataset, CodeComprehension.
A dataset of roughly 450,000 human quality judgments collected during evaluation cleanup.
A separate fine-tuned 70B "question quality" model used internally as a filter.

The weights of the base 70B model itself were not openly released; Imbue framed the released datasets, scripts, and CARBS as the publicly useful artifacts of the project.^[6]^[15]

Products

Sculptor

Sculptor is Imbue's primary developer-facing product as of 2026, described as "the missing UI for parallel coding agents."^[7]^[17] The application is a desktop client (initially available on macOS Apple Silicon and Linux) that allows a developer to spawn multiple AI coding agents in parallel, each running inside its own container with a copy-on-write clone of the working repository and an isolated dependency environment.^[7]^[17] Each agent can install packages, run tests, and execute arbitrary code without affecting the developer's machine, mitigating a category of safety issues associated with permission-skipping modes of long-running coding agents.
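The isolation idea described above can be illustrated without containers: each worker receives its own copy of the repository and runs concurrently, so nothing it writes reaches the original. This is a simplified sketch that uses plain directory copies and threads in place of containers and copy-on-write clones. The function names are hypothetical and do not reflect Sculptor's implementation.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable


def run_in_isolated_copy(repo: Path, agent: str, task: Callable[[Path], None]) -> Path:
    """Copy the repository into a fresh temporary directory and run task on the copy."""
    workdir = Path(tempfile.mkdtemp(prefix=f"{agent}-"))
    copy = workdir / repo.name
    shutil.copytree(repo, copy)
    task(copy)
    return copy


def run_agents(repo: Path, tasks: dict[str, Callable[[Path], None]]) -> dict[str, Path]:
    """Run every task in parallel, each against its own copy of the repository."""
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = {
            name: pool.submit(run_in_isolated_copy, repo, name, task)
            for name, task in tasks.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Illustrative use: two stand-in "agents" that each write one file.
repo = Path(tempfile.mkdtemp()) / "repo"
repo.mkdir()
(repo / "README.md").write_text("hello")
copies = run_agents(repo, {
    "agent-a": lambda p: (p / "a.txt").write_text("a"),
    "agent-b": lambda p: (p / "b.txt").write_text("b"),
})
print(sorted(f.name for f in repo.iterdir()))  # ['README.md']
```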

Notable Sculptor capabilities include:

Pairing Mode, which synchronizes a chosen agent's container into the local repository for hands-on inspection and testing in the developer's preferred IDE.^[7]
Session persistence, retaining each agent's plan, tool calls, chat transcript, and resulting diffs so that sessions can be reopened later without re-prompting.^[7]
Conflict detection, which flags potential merge conflicts between parallel agents and can delegate resolution back to an agent.^[7]
Integration with Claude (via Anthropic API keys or a Claude Pro / Max subscription) as the default reasoning backend.^[7]^[17]
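Conflict detection between parallel agents can be approximated at the file level: if two agents have modified the same path, a merge conflict is possible. The sketch below flags such overlaps from each agent's set of changed files. It is a coarse heuristic written for illustration, with hypothetical names, and says nothing about how Sculptor actually detects conflicts.

```python
from collections import defaultdict


def find_conflicts(changes: dict[str, set[str]]) -> dict[str, list[str]]:
    """Map each path changed by more than one agent to the agents that changed it."""
    touched: dict[str, list[str]] = defaultdict(list)
    for agent, paths in changes.items():
        for path in paths:
            touched[path].append(agent)
    return {
        path: sorted(agents)
        for path, agents in touched.items()
        if len(agents) > 1
    }


# Hypothetical change sets from three parallel agents.
changes = {
    "agent-a": {"src/app.py", "README.md"},
    "agent-b": {"src/app.py", "tests/test_app.py"},
    "agent-c": {"docs/guide.md"},
}
print(find_conflicts(changes))  # {'src/app.py': ['agent-a', 'agent-b']}
```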

Sculptor was released as a research preview in April 2025 and re-released in a rebuilt form on September 26, 2025; through 2026 it has remained free during beta.^[7]^[17] The source code is published on GitHub under imbue-ai/sculptor.^[17]

Earlier and adjacent products

Imbue also publishes several smaller tools through its GitHub organizations, including mngr (a parallel coding-agent runner positioned as a lighter-weight headless version of Sculptor), Bouncer (an X / Twitter feed filtering and "healing" utility), and the experimental Blueprint, Vet, and Latchkey utilities.^[8] Earlier company experiments, including internal demo agents shown in talks, focused on browser-based task execution and natural-language code editing prior to the consolidation around Sculptor as the primary product surface. The lab's product strategy in 2025 and 2026 has been to keep these auxiliary tools open source and small, with Sculptor as the only commercial-scale offering, an approach Qiu has framed as appropriate for a research lab that wants to maintain a long-term reasoning research agenda while still shipping software developers can adopt.^[8]

Significance

Within the wider 2022 to 2026 wave of agent-focused AI startups, Imbue occupies a distinctive position. Unlike most application-layer agent companies, the lab has invested heavily in the underlying training stack: a 4,088 H100 cluster, in-house dataset cleaning, a from-scratch 70B pre-training run, and a custom hyperparameter optimizer. At the same time, unlike frontier general-purpose laboratories such as OpenAI, Anthropic, and Google DeepMind, it has chosen not to release a flagship generalist chat model, instead publishing infrastructure, datasets, and product tooling.^[1]^[6]^[15]

The cluster-health and CARBS releases, in particular, have been cited within the LLM systems community as among the more detailed openly available descriptions of how to bring a 4,000-GPU class H100 cluster from racks to a working training job, including the failure modes and remediation tooling that vendor documentation often omits.^[15]^[16]^[20] The Avalon benchmark, although less widely used than OpenAI Gym derivatives, remains one of the few open 3D procedurally generated environments designed explicitly for reinforcement learning generalization studies.^[12]

Comparison with peer companies

Imbue is frequently grouped with a cluster of mid-2020s AI startups oriented around agents and code, though each peer has taken a different positioning strategy.

Company	Founded	Focus	Series B-or-later valuation	Public model artifacts
Imbue	2021	Reasoning foundation models and agent tools	Over $1 billion (Sept 2023)^[2]	70B base (unreleased); CARBS, cluster-health, evaluation datasets^[6]^[15]
Adept AI	2022	Action transformers for web and desktop UIs; later partial team move to Amazon	(No comparable Series B disclosed)	Fuyu and ACT-1 model families
Inflection AI	2022	Consumer personal AI (Pi assistant); pivoted to enterprise after 2024 Microsoft deal	$4 billion (mid 2023)	Inflection-1, -2, -2.5, -3
Magic.dev	2022	Long-context code generation models	Reported above $1 billion	Internal proprietary models
Cognition AI	2023	Devin software engineer agent and Windsurf-derived IDE	Multi-billion-dollar valuation	Devin agent platform

What differentiates Imbue within this group is the deliberate combination of foundation-model training with developer-tool product work, and a relatively heavy emphasis on releasing infrastructure rather than benchmark-leading model weights.^[1]^[6]

Limitations and criticisms

Several aspects of Imbue's public posture have drawn scrutiny.

First, although the company has disclosed a 70B reasoning-tuned model and shared its evaluation methodology, the model weights themselves have not been released, and independent third-party reproductions of the reported benchmark gains over GPT-4o are limited.^[6] As Imbue does not publish on common public leaderboards, comparison with peer foundation models depends on benchmarks chosen by the lab.

Second, Imbue's headline claim that its 70B fine-tuned model "outperforms GPT-4o zero-shot" applies specifically to a multiple-choice reasoning subset rather than to free-form generation, instruction following, or agentic task completion; this scope has been noted in technical write-ups summarizing the 70B series.^[6]^[15]

Third, the company's compute strategy depends on a single bare-metal cluster operated by Voltage Park, an arrangement that combines tight hardware co-design with concentrated vendor risk relative to multi-cloud peers.^[5]^[15]

Finally, like other private AI laboratories working on reasoning agents, Imbue has not yet published the kind of detailed system cards or safety evaluations that frontier labs such as Anthropic and OpenAI now produce alongside model releases, in part because most of its model artifacts are internal.^[6]^[15]

See also

Foundation models for code and reasoning. Imbue's reasoning-first orientation parallels other companies focused on specialized models for code, such as Magic.dev's long-context code models, and contrasts with general-purpose frontier laboratories like OpenAI and Anthropic.
Coding agents. Sculptor's parallel-container architecture sits within a broader 2024 to 2026 design space that includes Devin from Cognition AI, Claude Code from Anthropic, OpenAI Codex and its CLI / Cloud variants, and open-source AI coding agents such as Cline.
Reinforcement-learning environments. Avalon contributed to the procedurally generated benchmark tradition alongside OpenAI Gym derivatives and SIMA from DeepMind.
Open-source training infrastructure. The cluster-health stack covers similar ground to vendor tools such as NCCL tests and CUDA benchmarks, focused specifically on H100 / InfiniBand environments.
Hyperparameter optimization and scaling. CARBS extends ideas familiar from Bayesian optimization and hyperparameter tuning research to a Pareto-frontier setting that interacts naturally with scaling laws, including the Chinchilla scaling laws.

References

[1] Kyle Wiggers, "Imbue raises $200M to build AI models that can 'robustly reason'", TechCrunch, 2023-09-07. https://techcrunch.com/2023/09/07/imbue-raises-200m-to-build-ai-models-that-can-robustly-reason/. Accessed 2026-05-20.
[2] Chris Metinko, "AI Lab Imbue Gets $200M From Nvidia, Others; Hits $1B Valuation", Crunchbase News, 2023-09-07. https://news.crunchbase.com/ai-robotics/new-ai-unicorn-imbue-astera-nvidia/. Accessed 2026-05-20.
[3] Bloomberg News, "Imbue Raises $200M Series B, Gets Valuation of Over $1 Billion", Bloomberg, 2023-09-07. https://www.bloomberg.com/news/videos/2023-09-07/imbue-raises-200m-series-b-video. Accessed 2026-05-20.
[4] Dell Technologies Investor Relations, "Imbue to Develop Next-Generation AI Models with $150 Million Dell High Performance Computing System", Dell Technologies, 2023-09-07. https://investors.delltechnologies.com/news-releases/news-release-details/imbue-develop-next-generation-ai-models-150-million-dell-high. Accessed 2026-05-20.
[5] Imbue, "From bare metal to a 70B model: infrastructure set-up and scripts", Imbue Research, 2024-06-25. https://imbue.com/research/70b-infrastructure/. Accessed 2026-05-20.
[6] Imbue, "Training a 70B model from scratch: open-source tools, evaluation datasets, and learnings", Imbue Research, 2024-06-25. https://imbue.com/research/70b-intro/. Accessed 2026-05-20.
[7] Imbue, "Sculptor: the missing UI for parallel coding agents", Imbue Blog, 2025-09-26. https://imbue.com/blog/sculptor-announce. Accessed 2026-05-20.
[8] Imbue, "We build AI that works for humans (company home page)", Imbue, 2026. https://imbue.com/. Accessed 2026-05-20.
[9] Y Combinator, "Sourceress / Generally Intelligent", Y Combinator Companies Directory, 2024. https://www.ycombinator.com/companies/generally-intelligent. Accessed 2026-05-20.
[10] Kyle Wiggers, "Generally Intelligent secures cash from OpenAI vets to build capable AI systems", TechCrunch, 2022-10-20. https://techcrunch.com/2022/10/20/generally-intelligent-secures-cash-from-openai-vets-to-build-capable-ai-systems/. Accessed 2026-05-20.
[11] Imbue, "Generally Intelligent Podcast (episode index)", Imbue, 2024. https://imbue.com/podcast/. Accessed 2026-05-20.
[12] Joshua Albrecht, Abraham J. Fetterman, Bryden Fogelman, Ellie Kitanidis, Bartosz Wróblewski, Nicole Seo, Michael Rosenthal, Maksis Knutins, Zachary Polizzi, James B. Simon, Kanjun Qiu, "Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds", arXiv:2210.13417, 2022-10-24. https://arxiv.org/abs/2210.13417. Accessed 2026-05-20.
[13] NeurIPS, "Avalon: A Benchmark for RL Generalization Using Procedurally Generated Worlds (Datasets and Benchmarks Track)", NeurIPS Proceedings, 2022-12. https://proceedings.neurips.cc/paper_files/paper/2022/hash/539f1f7dd156cfe1222b0be83f247d35-Abstract-Datasets_and_Benchmarks.html. Accessed 2026-05-20.
[14] Maginative, "AI Startup Imbue Raises Additional $12 Million, Extending Series B to Over $210 Million", Maginative, 2023-10-23. https://www.maginative.com/article/ai-startup-imbue-raises-additional-12-million-extending-series-b-to-over-210-million-2/. Accessed 2026-05-20.
[15] Alessio Fanelli and Shawn (swyx) Wang, "State of the Art: Training >70B LLMs on 10,000 H100 clusters (with Josh Albrecht and Jonathan Frankle)", Latent Space, 2024-06-25. https://www.latent.space/p/llm-training-2024. Accessed 2026-05-20.
[16] Imbue, "Training >70B LLMs on 10,000 H100 clusters: Josh on the Latent Space podcast", Imbue Talks, 2024-06-25. https://imbue.com/talks/latent-space-70b/. Accessed 2026-05-20.
[17] imbue-ai, "sculptor (GitHub repository)", GitHub, 2025-2026. https://github.com/imbue-ai/sculptor. Accessed 2026-05-20.
[18] Alessio Fanelli and Shawn (swyx) Wang, "Why AI Agents Don't Work (yet), with Kanjun Qiu of Imbue", Latent Space, 2023-09. https://www.latent.space/p/imbue. Accessed 2026-05-20.
[19] Imbue, "Open-sourcing CARBS: how we used our hyperparameter optimizer to scale up to a 70B-parameter language model", Imbue Research, 2024-06-25. https://imbue.com/research/70b-carbs/. Accessed 2026-05-20.
[20] imbue-ai, "cluster-health (GitHub repository)", GitHub, 2024. https://github.com/imbue-ai/cluster-health. Accessed 2026-05-20.
[21] imbue-ai, "carbs (GitHub repository)", GitHub, 2023-2024. https://github.com/imbue-ai/carbs. Accessed 2026-05-20.
[22] Abraham J. Fetterman, Ellie Kitanidis, Josh Albrecht, Zachary Polizzi, Bryden Fogelman, Maksis Knutins, Bartosz Wróblewski, James B. Simon, Kanjun Qiu, "Scaling Laws For Every Hyperparameter Via Cost-Aware HPO (CARBS)", Imbue Research / arXiv:2306.08055, 2023-06. https://imbue.com/research/carbs/. Accessed 2026-05-20.
