The OpenAI Agents SDK is an open-source framework developed by OpenAI for building multi-agent systems and agentic workflows. Released on March 11, 2025, it is the production-ready successor to OpenAI Swarm, an experimental educational framework that OpenAI introduced in October 2024. The SDK provides a lightweight, Python-first (and later TypeScript) toolkit for orchestrating AI agents that can use tools, hand off tasks to other agents, validate inputs and outputs through guardrails, and be monitored through built-in tracing. It is released under the MIT license and hosted on GitHub, where it has accumulated over 20,000 stars as of early 2026.
In October 2024, OpenAI quietly released Swarm, an experimental, educational framework for multi-agent orchestration. Swarm was built on top of OpenAI's Chat Completions API and introduced two core abstractions: routines (sets of instructions and tools for an agent) and handoffs (the ability for one agent to transfer a conversation to another agent). OpenAI explicitly described Swarm as a research and educational tool rather than a production-ready product. Shyamal Anadkat, a researcher at OpenAI, said at the time: "Think of it more like a cookbook. It's experimental code for building simple agents. It's not meant for production and won't be maintained by us."
Despite its experimental status, Swarm attracted significant developer attention. It demonstrated that multi-agent systems could be built with minimal abstraction, an idea that influenced the broader AI agent framework ecosystem.
On March 11, 2025, OpenAI released the Agents SDK as part of a broader announcement that also introduced the Responses API, built-in tools for web search, file search, and computer use, and observability tracing. The Agents SDK replaced Swarm with a production-grade framework, retaining Swarm's core philosophy of lightweight, ergonomic multi-agent orchestration while adding the reliability, tooling, and features needed for real-world deployment.
The initial release was Python-only (package name: openai-agents), with a TypeScript/JavaScript version following in June 2025. Both SDKs support the same core primitives: agents, handoffs, guardrails, and tracing.
At OpenAI's DevDay on October 6, 2025, the company announced AgentKit, a higher-level toolkit built on top of the Agents SDK. While the Agents SDK provides foundational, open-source primitives for building agents, AgentKit adds enterprise-oriented features including a visual Agent Builder, ChatKit (embeddable UI components), a Connector Registry, and evaluation loops. The Agents SDK and AgentKit are complementary: the SDK handles low-level orchestration, while AgentKit provides the tooling to ship and iterate on agents faster in production environments.
As of March 2026, the OpenAI Agents SDK is at version 0.13.0 (Python) with over 75 releases, 20,000+ GitHub stars, and 3,300+ forks. The framework has become one of the most widely adopted open-source AI agent frameworks, alongside LangChain (with LangGraph) and CrewAI. The SDK continues to receive frequent updates, with recent versions adding support for voice agents, sessions, MCP server resource management, and retry settings.
The OpenAI Agents SDK is built around two guiding principles: it should offer enough features to be worth using but few enough primitives that it is quick to learn, and it should work well out of the box while letting developers customize exactly what happens.
The SDK operates through an agent loop: a central execution cycle managed by the Runner class. When an agent is run, the loop repeatedly calls the language model, executes any requested tool calls, processes handoffs to other agents, and checks guardrails until the agent produces a final output. The loop terminates when the LLM produces text output of the expected type without requesting further tool calls.
The Agents SDK is organized around four core primitives: agents, handoffs, guardrails, and tracing.
Agents are the fundamental building blocks. Each agent is an LLM configured with instructions, tools, and optional runtime behaviors such as handoffs, guardrails, and structured outputs.
| Property | Required | Description |
|---|---|---|
| name | Yes | A human-readable identifier for the agent |
| instructions | Yes | System prompt (static string or dynamic callback function) |
| model | No | Which LLM to use (defaults to a configured OpenAI model) |
| tools | No | List of tools the agent can invoke |
| handoffs | No | List of agents this agent can delegate to |
| output_type | No | Pydantic model, dataclass, or TypedDict for structured output |
| model_settings | No | Temperature, top_p, and other model parameters |
| input_guardrails | No | Guardrails to validate user input |
| output_guardrails | No | Guardrails to validate agent output |
A minimal agent example in Python:

```python
from agents import Agent, Runner

agent = Agent(
    name="Assistant",
    instructions="You are a helpful assistant."
)

result = Runner.run_sync(agent, "Write a haiku about recursion.")
print(result.final_output)
```
Agents support two multi-agent patterns: handoffs, in which control of the conversation transfers to another agent, and agents as tools, in which one agent exposes another (via as_tool()) as a callable tool without transferring control.
Agents also have lifecycle hooks (on_agent_start, on_agent_end, on_llm_start, on_tool_end, on_handoff) that allow developers to observe and react to events during execution. The clone() method allows creating agent variants with modified properties.
Handoffs allow one agent to delegate a task to another agent. They are the primary mechanism for building multi-agent systems where different agents specialize in different domains. Internally, handoffs are represented as tools exposed to the LLM. For example, if there is a handoff to an agent named "Refund Agent," the model sees a tool called transfer_to_refund_agent.
Handoffs accept several configuration options:
| Option | Description |
|---|---|
| agent | The target agent to hand off to |
| tool_name_override | Custom name for the handoff tool |
| tool_description_override | Custom description for the handoff tool |
| on_handoff | Callback function executed during the handoff |
| input_type | Pydantic schema for structured metadata the model must provide |
| input_filter | Modifies conversation history passed to the next agent |
| is_enabled | Boolean or function controlling whether the handoff is available |
A triage agent that can hand off to specialized agents:

```python
from agents import Agent, handoff

billing_agent = Agent(name="Billing agent", instructions="Handle billing questions.")
refund_agent = Agent(name="Refund agent", instructions="Process refund requests.")

triage_agent = Agent(
    name="Triage agent",
    instructions="Route customer queries to the right specialist.",
    handoffs=[billing_agent, handoff(refund_agent)]
)
```
The input_type parameter allows the model to pass structured metadata during a handoff. This is useful for passing small pieces of information like the reason for escalation, the customer's language, or a priority level. Input filters (available through agents.extensions.handoff_filters) can modify or trim conversation history before passing it to the next agent, which is important for managing context length in long conversations.
Handoffs remain within a single run. Input guardrails only apply to the first agent in the chain, and output guardrails only apply to the final agent.
Guardrails provide input and output validation to constrain agent behavior and prevent unsafe or malformed responses. The SDK supports three types of guardrails:
Input guardrails validate user input before or during agent execution. By default, they run in parallel with the agent (for lower latency), but can be configured to run in blocking mode (completing before the agent starts). When an input guardrail detects a violation, it triggers a tripwire that raises an InputGuardrailTripwireTriggered exception and halts execution.
Output guardrails validate the final output of an agent. They always run after the agent completes and apply only to the last agent in a handoff chain. They follow the same tripwire mechanism as input guardrails.
Tool guardrails wrap function tool invocations. Input tool guardrails run before the tool executes and can skip the call or reject the output. Output tool guardrails run after execution and can replace results or trigger tripwires.
Example of an input guardrail that delegates the check to a small classifier agent:

```python
from pydantic import BaseModel
from agents import Agent, GuardrailFunctionOutput, Runner, input_guardrail

class MathHomeworkOutput(BaseModel):
    is_math_homework: bool

# A lightweight agent used only to classify the incoming input
guardrail_agent = Agent(
    name="Guardrail check",
    instructions="Check if the user is asking you to do their math homework.",
    output_type=MathHomeworkOutput,
)

@input_guardrail
async def math_homework_guardrail(ctx, agent, input):
    result = await Runner.run(guardrail_agent, input, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.is_math_homework,
    )

agent = Agent(
    name="Support Agent",
    instructions="Help customers with their questions.",
    input_guardrails=[math_homework_guardrail],
)
```
The SDK includes built-in tracing that automatically records events during an agent run. Traces capture LLM generations, tool calls, handoffs, guardrail evaluations, speech-to-text transcriptions, and custom events.
Tracing is enabled by default. Each Runner.run() invocation is wrapped in a trace named "Agent workflow" by default. Traces contain spans, which represent individual operations with timestamps and parent relationships. Span types include agent, generation, function, guardrail, handoff, and custom.
Traces are displayed in the OpenAI Traces dashboard, where developers can debug, visualize, and monitor workflows during development and in production. The tracing system also supports sensitive data controls: generation and function spans can optionally exclude LLM inputs/outputs and function call details.
The tracing system integrates with over 20 third-party observability platforms, including Weights & Biases, Arize Phoenix, MLflow, Langfuse, and LangSmith. When using non-OpenAI models, traces can still be exported to the OpenAI dashboard by supplying an OpenAI API key via set_tracing_export_api_key().
Tracing can be disabled globally via the environment variable OPENAI_AGENTS_DISABLE_TRACING=1, programmatically with set_tracing_disabled(True), or per-run using RunConfig.tracing_disabled.
Agents in the SDK can use several types of tools:
| Tool type | Description |
|---|---|
FunctionTool | Any Python function wrapped with the @function_tool decorator; schemas are auto-generated from type hints |
WebSearchTool | Built-in hosted tool for web search (OpenAI models only) |
FileSearchTool | Built-in hosted tool for searching uploaded documents (OpenAI models only) |
ComputerTool | Built-in tool for interacting with computer interfaces |
CodeInterpreterTool | Hosted tool for executing code in a sandboxed environment |
ImageGenerationTool | Hosted tool for generating images |
HostedMCPTool | Tools hosted on external MCP servers |
| LocalShellTool | Executes shell commands on the local machine |
Function tools are the most common custom tool type. The @function_tool decorator automatically generates a JSON schema from the function's type hints and docstring:

```python
from agents import Agent, function_tool

@function_tool
def get_weather(city: str) -> str:
    """Returns weather info for the specified city."""
    return f"The weather in {city} is sunny."

agent = Agent(
    name="Weather Agent",
    instructions="Provide weather information.",
    tools=[get_weather]
)
```
The SDK also supports the Model Context Protocol (MCP) for connecting agents to external tool providers. Three types of MCP server implementations are supported: hosted MCP server tools, Streamable HTTP MCP servers (local or remote), and Stdio MCP servers.
The Runner class provides three methods for executing agents:
| Method | Description |
|---|---|
Runner.run() | Asynchronous execution; returns a RunResult when complete |
Runner.run_sync() | Synchronous wrapper around Runner.run() for simpler scripts |
Runner.run_streamed() | Streams events in real time; returns a RunResultStreaming |
The agent loop works as follows:

1. The LLM is called with the agent's instructions, tools, and the conversation so far.
2. If the model produces a final output (text of the agent's output_type with no tool calls), the loop terminates.
3. If the model requests a handoff, the loop switches to the target agent and continues.
4. If the model requests tool calls, the tools are executed, their results are appended to the conversation, and the loop repeats.

Streaming mode (Runner.run_streamed()) emits StreamEvent objects in real time. Two event types are available: RawResponsesStreamEvent (raw events from the LLM in OpenAI Responses API format, such as response.output_text.delta) and RunItemStreamEvent (higher-level events like tool_called, tool_output, and message_output_created).
The Agents SDK provides a Session object for managing conversation context across multiple runs. Sessions are important because AI agents often operate in long-running, multi-turn interactions where carrying too much context forward leads to distraction and increased costs, while preserving too little causes the agent to lose coherence.
Sessions support context management techniques including trimming (removing older messages) and compression (summarizing prior conversation). Optional extras like [redis] allow persistent session storage for production deployments.
The SDK supports human-in-the-loop approval mechanisms that allow developers to pause tool execution, serialize and store agent state, approve or reject specific tool calls, and resume the agent run. Tools can be configured to require approval by setting a needsApproval option (TypeScript SDK naming) to true or to an async function that returns a boolean.
When a tool invocation requires approval, the SDK pauses the run and returns pending approvals in an interruptions array. Developers can approve or reject each pending item, optionally setting sticky decisions (alwaysApprove or alwaysReject) that persist for the rest of the run.
The SDK supports building voice agents through the RealtimeAgent class, which integrates with OpenAI's Realtime API and the gpt-realtime-1.5 model. Voice agents support low-latency speech-to-speech interactions with automatic interruption detection, context management, guardrails, and handoffs between agents.
The TypeScript SDK automatically selects the appropriate transport: WebRTC for browser environments and WebSockets for server-side applications. Voice agents support the same core primitives (tools, handoffs, guardrails) as text-based agents.
While the SDK is developed by OpenAI and works best with OpenAI models (through both the Responses API and Chat Completions API), it is designed to be provider-agnostic. The SDK supports over 100 alternative language models through LiteLLM integration, including models from Anthropic (Claude), Google (Gemini), DeepSeek, Meta (Llama), Mistral, and others.
There are limitations when using non-OpenAI providers. Hosted tools (web search, file search, code interpreter) are only available with OpenAI models. Tracing dashboard integration may require additional configuration. Not all SDK features work identically across providers.
The SDK requires Python 3.10 or newer (for the Python version) and is installed via pip or uv:
pip install openai-agents
Optional extras include [voice] for voice agent support and [redis] for Redis-backed session persistence. An OPENAI_API_KEY environment variable must be set for use with OpenAI models.
The TypeScript version is available as a separate package (@openai/agents) on npm.
The OpenAI Agents SDK competes with several other AI agent frameworks. Each takes a different approach to agent orchestration.
| Feature | OpenAI Agents SDK | LangChain / LangGraph | CrewAI | Claude Agent SDK |
|---|---|---|---|---|
| Primary abstraction | Handoffs and guardrails | Graph-based state machines | Role-based crews | Hooks and subagents |
| Model support | OpenAI-first, 100+ via LiteLLM | Model-agnostic from the start | Model-agnostic | Claude-first |
| Learning curve | Low (minimal abstractions) | Moderate to high | Low to moderate | Low to moderate |
| Multi-agent pattern | Handoff-based delegation | Graph-based control flow | Role-based collaboration | Hierarchical subagents |
| Built-in tracing | Yes (OpenAI dashboard) | Via LangSmith | Via third-party tools | Via Anthropic console |
| Voice agent support | Yes (RealtimeAgent) | No native support | No native support | No native support |
| MCP support | Yes | Yes (via integrations) | Yes (first-class) | Yes (first-class) |
| Hosted tools | Web search, file search, code interpreter, computer use | No hosted tools | No hosted tools | No hosted tools |
| GitHub stars (approx.) | 20,000+ | 47,000+ (LangChain) | 44,000+ | Newer, growing |
| License | MIT | MIT | MIT | MIT |
When to choose OpenAI Agents SDK: Best for teams already using OpenAI models who want minimal abstractions, built-in hosted tools, voice agent support, and a fast path from prototype to production.
When to choose LangGraph: Best for complex, stateful workflows that require graph-based control flow, built-in checkpointing, and model-agnostic design. Has the largest ecosystem and community.
When to choose CrewAI: Best for role-based multi-agent collaboration where agents need distinct personalities and skillsets. Fastest growing for multi-agent use cases.
When to choose Claude Agent SDK: Best for deep Anthropic Claude integration, hooks-based behavioral control, and leveraging the MCP tool ecosystem with local-first design.
The OpenAI Agents SDK has been adopted across a range of applications.
The OpenAI Agents SDK is designed to work with the Responses API, which OpenAI released on the same day (March 11, 2025). The Responses API combines the simplicity of the Chat Completions API with the tool-use capabilities of the Assistants API, providing first-party support for web search, file search, and computer use as built-in tools.
The Responses API is the server-side API that provides built-in tools and model access, while the Agents SDK is the client-side library that orchestrates multi-agent workflows on top of it. The SDK also works with the Chat Completions API for backwards compatibility.
OpenAI has announced plans to deprecate the Assistants API, with a target sunset date in mid-2026, directing developers to migrate to the Responses API and Agents SDK.
Selected releases from the OpenAI Agents SDK:
| Version | Notable changes |
|---|---|
| 0.1.0 | Initial release; added run_context and agent parameters to MCP server tool listing |
| 0.2.0 | Agent replaced with AgentBase in type signatures |
| 0.3.0 | Migrated to gpt-realtime model GA version |
| 0.4.0 | Dropped support for openai package v1.x; v2.x required |
| 0.5.0 | Added SIP protocol support for RealtimeRunner; Python 3.14 compatibility |
| 0.6.0 | Handoff history packaged into single assistant message |
| 0.7.0 | Nested handoff history disabled by default; default reasoning effort for gpt-5.x changed |
| 0.8.0 | Synchronous function tools run on worker threads; configurable MCP tool failure handling |
| 0.9.0 | Dropped Python 3.9 support; narrowed type hint for Agent.as_tool() |
| 0.10.0 | WebSocket transport for Responses API; new streaming examples |
| 0.11.0 | Tool search support with namespaces; computer use tool reached GA |
| 0.12.0 | Retry settings for model API calls via ModelSettings |
| 0.13.0 | Default Realtime model updated to gpt-realtime-1.5; MCP resource management methods |