Agent Builder (OpenAI AgentKit)
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 ยท 3,504 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 ยท 3,504 words
Add missing citations, update stale details, or suggest a clearer explanation.
Agent Builder is a visual, browser-based workflow editor developed by openai for designing, versioning, and deploying multi-agent applications on top of the company's API platform. It was introduced on October 6, 2025 at DevDay 2025 in San Francisco as the centerpiece of agentkit, an umbrella product family that also includes ChatKit, the Connector Registry, agent-specific evaluations, and the openai agents sdk.[1][2] The editor presents developers with a drag-and-drop canvas of typed nodes for model calls, tool invocations, conditional branches, loops, guardrails, and human approvals, then publishes the resulting graph as a versioned workflow that can be embedded into a product through ChatKit or invoked from custom code with the Agents SDK.[3][4] Sam Altman framed AgentKit as "a complete set of building blocks" intended to move agents "from prototype to production" with less friction than hand-rolled orchestration code.[5] Agent Builder launched in beta, while ChatKit and the upgraded Evals product reached general availability the same day.[1]
Agent Builder is the latest in a series of OpenAI products aimed at developers who want to build agentic applications that combine model calls with tools, memory, and control flow. In November 2023, OpenAI introduced the openai assistants api at its first DevDay, which exposed threads, runs, and built-in tools such as code interpreter and retrieval for stateful chat assistants.[6] In March 2025, OpenAI superseded that approach with the openai responses api and the open-source openai agents sdk, announced in a post titled "New tools for building agents."[7] The Responses API combined the simplicity of the Chat Completions endpoint with the tool-use capabilities of the Assistants API, and the Agents SDK provided Python and TypeScript primitives for agents, handoffs, sessions, guardrails, and tracing.[7] AgentKit is positioned as a layer above both, with the Responses API as the runtime that executes individual model and tool calls and the Agents SDK as the code-level abstraction that Agent Builder workflows can export to.[3][8]
OpenAI's third DevDay was held at Fort Mason in San Francisco on October 6, 2025 with roughly 1,500 to 2,000 developers in attendance and a $650 in-person registration fee.[9][10] Sam Altman opened the keynote, and Greg Brockman delivered other segments; AgentKit was unveiled in a dedicated portion of the keynote alongside the launch of GPT-5 Pro in the API, the Sora 2 API, the general availability of Codex, lower-cost real-time voice models, and the Apps SDK that lets third-party software render inside openai's ChatGPT.[2][11] Christina Huang, an OpenAI staff member working on the platform experience, built a conference-companion agent live on stage in roughly eight minutes, designing the workflow visually, attaching tools and widgets, previewing it, deploying it through ChatKit, and inviting the audience to use it from the DevDay website.[5][4] The demo was framed as evidence that work which used to take "weeks of complex orchestration" could now happen "in a couple of hours" or less for simple cases.[5]
In the weeks after DevDay, OpenAI added templates, documentation, and a node reference to the developer site.[3][12] HubSpot's stock rose roughly 7% on the day of the announcement after the company was named as one of the first organizations to integrate AgentKit into its Breeze AI assistant, and Albertsons publicly described an internal agent that analyzes sales trends across the grocer's roughly 2,000 stores and 37 million weekly shoppers.[13][14] OpenAI stated on its launch blog that it planned to expose a standalone Workflows API and to make Agent Builder workflows deployable to ChatGPT itself, neither of which were generally available at launch.[1]
AgentKit is not a single product; it is a name OpenAI applies to a bundle of platform features that share an agent-centric design.[1][2] The bundle has five publicly announced components.
The visual canvas for designing workflows. Builds publish versioned objects on the OpenAI platform that can be invoked from ChatKit or the Agents SDK.[3][4] At launch it was in beta.[1]
An embeddable, white-labeled chat front end. ChatKit is implemented as a React widget that can also be loaded into an iframe, and is designed to remain "evergreen" so that new model and platform capabilities surface without front-end rewrites.[4][15] Customers configure a workflow ID and a session token, and ChatKit handles streaming, attachments, citations, custom widgets, and theming.[15] ChatKit reached general availability at DevDay 2025.[1]
An admin-side catalog that lets a Global Admin Console operator centralize and govern the connections between OpenAI products and external data sources, including pre-built connectors for Dropbox, Google Drive, SharePoint, and Microsoft Teams as well as third-party model context protocol servers.[1][15] At launch it entered limited beta for some API, ChatGPT Enterprise, and ChatGPT Edu customers with the Global Admin Console enabled.[1][16]
OpenAI's existing Evals product was extended at DevDay 2025 with trace grading (per-step scoring of decisions, tool calls, and reasoning), dataset support for agent components, automated prompt optimization based on grader annotations, and external-model grading that can call non-OpenAI models through OpenRouter.[1][4] Evals for agents was generally available at launch.[1]
The open-source Python and TypeScript SDK for code-first agent construction with primitives for handoffs, sessions, hosted tools, function tools, model context protocol servers, sandboxes, and tracing.[8] Agent Builder workflows can be exported to Agents SDK code for further customization or self-hosting.[3][4]
A workflow in Agent Builder is a directed graph of nodes connected by typed edges, where each edge carries data of a declared schema from one step to the next.[3] The typing system lets the editor verify at design time that downstream nodes receive the properties they expect, similar to function signatures in a programming language.[3]
The node reference documents nine built-in node types organized into four categories.[12]
| Category | Node | Purpose |
|---|---|---|
| Core | Start | Defines workflow inputs and exposes user input as input_as_text. |
| Core | Agent | Configures instructions, model, tools, and reasoning settings for an LLM step. |
| Core | Note | Inline commentary for documenting workflow logic. |
| Tool | File search | Retrieves data from vector stores using semantic queries. |
| Tool | Guardrails | Sanitizes input and output for PII, jailbreak attempts, and unwanted content. |
| Tool | MCP | Connects to remote model context protocol servers and external tools. |
| Logic | If/else | Conditional routing on Common Expression Language predicates. |
| Logic | While | Looping construct controlled by a CEL condition. |
| Logic | Human approval | Pauses for a human approve or reject decision. |
| Data | Transform | Reshapes outputs to fit downstream schemas. |
| Data | Set state | Writes a global variable readable elsewhere in the run. |
Conditional logic uses Google's Common Expression Language (CEL) rather than free-form natural-language conditions; for example, a router agent might branch on input.output_parsed.classification == "billing_inquiry" to direct a query to a billing-specialist agent.[17][4] OpenAI staff have stated at DevDay that this is a deliberate choice to make routing deterministic and inspectable, rather than relying on a second LLM call to decide control flow.[4]
Agent and tool nodes can call any tool that the Responses API exposes, including web search, the code interpreter sandbox, file search over vector stores, image generation, the OpenAI-hosted computer-use endpoint (related to the openai operator capability), and arbitrary developer-defined function tools.[7][15] The MCP node connects to any HTTP or Streamable HTTP model context protocol server, so external systems such as CRMs, ticketing tools, databases, and document stores can be plugged in without writing OpenAI-specific connectors.[18][15]
The guardrails node and an associated open-source guardrails library wrap inputs and outputs with detectors for PII, prompt-injection attempts, jailbreak patterns, off-topic queries, and configurable content categories.[1][19] Guardrails are configurable at any point in the graph and can short-circuit a run, redact content, or trigger a human-approval branch.[1]
A Human approval node halts execution and presents a structured prompt to a reviewer. At launch the approval surface supported a binary approve or reject decision, and OpenAI has said it plans richer decision types in future updates.[4] Approvals are also exposed through the Agents SDK for code-first workflows.[19]
Publishing a workflow creates an immutable versioned object with its own ID. Developers can roll forward or back between versions without redeploying client code, since ChatKit and the SDK accept a workflow ID.[3] The editor includes a Preview pane that runs the graph against sample inputs, attaching files where relevant, and visualizes which nodes fired, what they received, and what they produced.[3][4]
A Code button in the top navigation exports a workflow to Agents SDK code, in Python or TypeScript, that reproduces the graph using SDK primitives such as Agent, handoff, Guardrail, and tool registrations.[3][8] This export path is the recommended way to deploy workflows that need to run on a developer's own infrastructure or to be customized beyond what the visual editor exposes.[3]
At runtime, ChatKit or a custom application sends a request to the OpenAI platform that references a workflow ID. The platform interprets the graph, calling the Responses API for each Agent node and dispatching tool calls (built-in or MCP) according to the node configuration.[3][4] Each model invocation pays the underlying model's token rates and each tool invocation pays its own per-call or per-storage fees; AgentKit itself does not impose a separate platform charge.[20][15]
State flows through typed edges, while a Set state node can write globals that subsequent nodes can read. Guardrails nodes inspect text passing through their attached point in the graph. Human approval nodes block until an external response arrives, at which point the run resumes from the approved or rejected branch.[12]
The Responses API, which underpins each Agent node, also offers the MCP tool natively in recent models, meaning even agents not built in the visual canvas can call MCP servers and the Connector Registry's curated endpoints.[18]
AgentKit has no separate license fee or subscription. OpenAI's launch blog and pricing documentation state that all components are included with standard API model pricing, with developers paying token and tool fees as they normally would when calling the Responses API.[1][20]
Specific cost components called out by third-party guides and OpenAI's pricing pages include:[20]
| Item | Price |
|---|---|
| GPT-5 (input) | $1.25 per 1M tokens |
| GPT-5 (output) | $10.00 per 1M tokens |
| Code Interpreter | $0.03 per session |
| File search storage | $0.10 per GB per day |
| Web search tool | $10.00 per 1,000 calls |
| ChatKit uploads | $0.10 per GB-day (billing started November 1, 2025) |
The platform offers a 1 GB monthly free tier for ChatKit storage, with paid usage beginning November 1, 2025.[20]
Availability at launch on October 6, 2025 was tiered.[1]
OpenAI highlighted several early adopters in the AgentKit announcement and at DevDay 2025.[1][21]
The expense-management company built a buyer (procurement) agent on Agent Builder. According to OpenAI and to comments cited in coverage of DevDay, Ramp went from "blank canvas to buyer agent in just a few hours," and a customer quote in trade press described iteration cycles cut by 70%, with the agent reaching production in two sprints rather than two quarters.[21][22]
The Swedish fintech is cited as running a support agent that handles roughly two-thirds of customer-service tickets, building on its earlier widely reported deployment of openai-backed chat support.[21]
HubSpot integrated AgentKit into Breeze, its in-product AI assistant. The company's stock rose roughly 7% the day of the DevDay announcement, in part on the AgentKit news.[13]
The U.S. grocery operator uses an internally built agent to analyze sales trends across roughly 2,000 stores and 37 million weekly shoppers, suggesting display or promotion changes when sales of specific categories deviate from forecasts.[14]
Canva built a developer-support agent using ChatKit, embedded into its developer community in less than an hour and reported saving "over two weeks" of front-end engineering work.[21]
OpenAI's launch coverage also mentions sales-prospecting growth at Clay and customer-service deployments at LY Corporation (the Japanese internet conglomerate behind LINE) as early users.[21]
OpenAI states that help.openai.com customer support is itself powered by an AgentKit workflow, providing a public reference of the product in production.[4]
The announcement and accompanying templates emphasize customer support, internal knowledge-base lookup, document discovery, data enrichment, sales prospecting, and procurement as canonical use cases.[1][3] These map onto retrieval-augmented agentic workflow patterns where an agent consults a vector store via the File search node, calls one or more MCP-exposed business systems, and either responds directly or routes to a specialist agent.
Agent Builder enters a crowded market for agent orchestration tools. The closest analogues differ on whether they are visual or code-first, whether they are tied to a single model provider, and how they handle deployment.[23][24]
Google launched its own Vertex AI Agent Builder at Google Cloud Next in April 2024, predating OpenAI's by about 18 months.[25] Google's product centers on no-code conversational agents tied to Gemini and to Google Cloud's search and retrieval infrastructure. In April 2026, Google rebranded Vertex AI Agent Builder as part of the Gemini Enterprise Agent Platform.[26] Vertex AI Agent Builder targets buyers already on Google Cloud, while Agent Builder targets developers on the OpenAI API; both expose a visual interface and a code SDK, but only Agent Builder is fully model-locked to its vendor.
anthropic follows a code-first philosophy. Rather than a visual canvas, Anthropic ships a tool-use API on Claude models and an Agent SDK released alongside Claude 4.6, plus a code-execution sandbox and the anthropic computer use capability that resembles OpenAI's computer-use endpoint.[24] Anthropic does not offer a visual workflow editor comparable to Agent Builder.
langgraph is an open-source Python and TypeScript library from LangChain that models agent workflows as a directed graph with conditional edges, checkpointing, and streaming. It is fully model-agnostic and is widely used for complex multi-agent workflows that need fine-grained control, persistent state, and observability through LangSmith.[23] LangGraph is more flexible than Agent Builder but requires code, and lacks an OpenAI-hosted execution layer.
crewai is a Python framework organized around roles, crews, and processes. It targets teams that want to ship multi-agent systems with a small amount of declarative code and a role-based mental model. Like LangGraph it is model-agnostic and self-hosted, while Agent Builder is OpenAI-hosted and tied to OpenAI models.[23]
Tools such as n8n, Zapier, Make, and Gumloop predate the current agent wave and offer visual canvases for automating SaaS integrations. They have been retrofitted with LLM-powered steps but were not designed around the request-tool-respond loop that defines an agent. Agent Builder differs from these tools by treating the agent (with tools, guardrails, and reasoning) as the canonical node, rather than treating LLMs as just another integration.[22]
| Platform | Vendor | Surface | Model lock-in | Hosting |
|---|---|---|---|---|
| Agent Builder (AgentKit) | openai | Visual canvas plus code export | OpenAI models | OpenAI platform |
| Vertex AI Agent Builder | Visual plus code | Gemini-first | Google Cloud | |
| Anthropic Agent SDK | anthropic | Code | Claude models | Self-host or Anthropic API |
| langgraph | LangChain | Code | None | Self-host (LangSmith for ops) |
| crewai | CrewAI | Code | None | Self-host |
| n8n / Zapier / Make | Various | Visual canvas | None | Hosted or self-host |
Press coverage at launch was broadly positive about the developer experience while noting incomplete coverage of advanced cases. TechCrunch called AgentKit a strategic move by OpenAI to "ship AI agents" in production environments and quoted the platform-spend pricing model as a differentiator.[2] VentureBeat framed Agent Builder as a direct response to Zapier, n8n, and Make as well as to Google's Vertex AI Agent Builder.[21] InfoQ situated AgentKit alongside the broader DevDay 2025 platform story, including GPT-5 Pro and the Apps SDK.[11]
The latent.space interview with OpenAI's Sherwin Wu and Christina Huang the week of DevDay identified specific gaps still on the roadmap: a "bring your own key" mechanism for routing model traffic to non-OpenAI inference, voice modality inside the canvas, and more logical node types for deterministic workflows.[4]
Critics, including the Vellum analysis of the launch, observed that while non-technical users can compose a workflow visually, integrating the result into a production application still requires developer involvement through ChatKit configuration or the Agents SDK, which limits the no-code framing.[27] Independent comparisons in 2026 framework round-ups placed OpenAI's stack alongside langgraph and crewai in terms of production readiness, with Agent Builder offering the strongest visual surface but the heaviest model lock-in.[23]
Public materials and third-party reviews note several constraints on Agent Builder as of its beta release.[4][27]