AutoGen is an open-source programming framework developed by Microsoft Research for building AI agents and enabling cooperation among multiple agents to solve tasks. Originally released in October 2023, AutoGen introduced a conversation-driven approach to multi-agent systems, allowing developers to create customizable agents powered by large language models (LLMs), human inputs, and tools that collaborate through structured conversations. The framework became one of the most popular open-source projects in the agentic AI space, accumulating over 54,000 stars on GitHub and millions of downloads.
AutoGen was one of the earliest frameworks to demonstrate the power of multi-agent conversation for solving complex tasks. Its foundational paper, "AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation," was published as a conference paper at COLM 2024, received the Best Paper award at the LLM Agents Workshop at ICLR 2024, and became one of the most cited works in the agentic AI space. The framework underwent a complete rewrite from version 0.2 to version 0.4, adopting an event-driven, asynchronous architecture released in January 2025. In October 2025, Microsoft announced that AutoGen would be merged with Semantic Kernel into the new Microsoft Agent Framework, with AutoGen entering maintenance mode.
AutoGen originated within FLAML (Fast Library for Automated Machine Learning), an open-source AutoML project at Microsoft Research. The initial AutoGen module was created inside FLAML on March 29, 2023, by researchers Qingyun Wu and Chi Wang, along with collaborators from Penn State University, the University of Washington, Stevens Institute of Technology, and other institutions.
The first version of the AutoGen paper (arXiv:2308.08155) appeared on August 16, 2023, describing a framework for building LLM applications through multi-agent conversations. On October 3, 2023, AutoGen was spun off from FLAML into its own standalone repository on GitHub under the microsoft/autogen organization. The project gained rapid traction: within 35 days of its spinoff, AutoGen was selected into the Open100 list of top 100 open-source achievements, and TheSequence named the AutoGen paper one of the five best AI papers of 2023.
The framework's open-source release on GitHub coincided with a surge of interest in LLM-powered agents following the release of GPT-4 and a growing recognition that single-agent systems had significant limitations for complex, multi-step tasks. The paper has since accumulated over 850 citations on Semantic Scholar, making it one of the most referenced works in the multi-agent AI literature.
The full author list of the foundational paper includes Qingyun Wu, Gagan Bansal, Jieyu Zhang, Yiran Wu, Beibin Li, Erkang Zhu, Li Jiang, Xiaoyun Zhang, Shaokun Zhang, Jiale Liu, Ahmed Hassan Awadallah, Ryen W. White, Doug Burger, and Chi Wang.
The core insight behind AutoGen was that many complex tasks could be decomposed into conversations between specialized agents, each with distinct capabilities and roles. Rather than building a single monolithic agent that tries to handle everything, developers could compose multiple simpler agents that collaborate through natural language dialogue. This conversational approach proved both intuitive for developers and effective for a wide range of applications, from code generation to research synthesis.
In September 2024, some of AutoGen's original creators, including Chi Wang and Qingyun Wu, departed Microsoft. By November 2024, they established a new organization called AG2AI and forked the AutoGen codebase into a project called AG2 (formerly AutoGen). The founders cited two main reasons for the split: the desire to move faster without corporate constraints, and the goal of creating a more neutral space for contributions from various organizations.
AG2 inherited the autogen and pyautogen PyPI packages as well as the original Discord community of over 20,000 members. The project adopted the Apache 2.0 license starting from version 0.3. AG2 maintains backward compatibility with AutoGen 0.2's architecture and API, offering stability for existing users while pursuing its own roadmap under community governance.
This split created significant confusion in the ecosystem. AG2 maintained backward compatibility with AutoGen 0.2's API (releasing as version 0.3.2) and continued developing features on the familiar architecture. Meanwhile, Microsoft pursued its complete rewrite as AutoGen 0.4 under the autogen-core and autogen-agentchat package names. Microsoft's team publicly noted that "the current pyautogen package isn't affiliated with Microsoft AutoGen, and admin access is blocked for us," clarifying the package ownership situation.
The fork resulted in four distinct variants of AutoGen-derived software:
| Variant | Maintainer | PyPI Package(s) | Architecture |
|---|---|---|---|
| AutoGen 0.2 (legacy) | Microsoft (maintenance branch) | autogen-agentchat==0.2.* | Original conversable agent design |
| AutoGen 0.4 | Microsoft | autogen-core, autogen-agentchat==0.4.* | New event-driven, actor-based design |
| AG2 | AG2AI community | autogen, pyautogen, ag2 | Fork of 0.2 with community extensions |
| Microsoft Agent Framework | Microsoft | microsoft-agent-framework | Successor merging AutoGen + Semantic Kernel |
Meanwhile, the Microsoft-maintained microsoft/autogen repository underwent a complete architectural redesign, releasing version 0.4 in January 2025 with a fundamentally different codebase.
In October 2025, Microsoft announced the Microsoft Agent Framework, which merges AutoGen's multi-agent orchestration capabilities with Semantic Kernel's enterprise features. The Agent Framework combines AutoGen's agent abstractions with Semantic Kernel's session-based state management, type safety, middleware, telemetry, and graph-based workflows for multi-agent orchestration.
Microsoft released the Agent Framework in public preview on October 1, 2025, targeting a 1.0 General Availability release by the end of Q1 2026. Following this announcement, AutoGen entered maintenance mode, receiving only bug fixes and security patches, while new feature development shifted to the Microsoft Agent Framework.
AutoGen's architecture has evolved significantly across its versions. The framework provides two main API layers in version 0.4: the Core API for low-level, event-driven agent development, and the AgentChat API for high-level, task-driven multi-agent applications.
AutoGen 0.4 introduced a layered architecture separating concerns across three tiers:
| Layer | Purpose | Key Components |
|---|---|---|
| AutoGen Core | The foundation layer, implementing the actor model for agent communication. Agents communicate through asynchronous messages, enabling distributed and scalable deployments. | Agent runtime, message passing, subscriptions, topic-based routing |
| AutoGen AgentChat | A higher-level API built on Core that provides familiar abstractions for rapid prototyping. Designed to maintain similar abstraction levels to AutoGen 0.2 for easier migration. | Pre-built agent types, team implementations, termination conditions, streaming support |
| Extensions | Advanced runtimes, tools, model clients, and ecosystem integrations that expand the framework's capabilities. | Azure code executors, model provider clients, third-party tool integrations |
The Core layer's actor model provides natural concurrency (agents process messages independently), fault isolation (a failure in one agent does not crash others), and scalability (agents can be distributed across multiple processes or machines). Agents communicate through both event-driven (publish/subscribe) and request/response interaction patterns.
The Core API is the foundation layer of AutoGen 0.4. It provides a scalable, event-driven actor framework for creating agentic workflows. The Core API is built on the Actor model, a computational paradigm where each agent operates independently, manages its own state, and communicates with other agents through asynchronous message passing.
Key components of the Core API include:
| Component | Description |
|---|---|
| Agents | Independent units that handle and produce typed messages in response to events |
| Runtime | Manages message delivery between agents, decoupling transport from agent logic |
| Direct messaging | Functions like remote procedure calls (RPC) for point-to-point communication |
| Topic-based messaging | Implements a publish-subscribe (pub/sub) pattern for broadcasting messages |
| Distributed runtime | Uses gRPC to enable agents to run across multiple processes and machines |
The Core API supports cross-language interoperability, allowing agents written in different programming languages to communicate with one another. At launch, AutoGen 0.4 supported Python and .NET, with additional language support planned.
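To make the actor pattern concrete, here is a minimal direct-messaging sketch against the autogen-core 0.4 API. The Greeting message type and EchoAgent are illustrative names, not part of the library:

```python
import asyncio
from dataclasses import dataclass

from autogen_core import (
    AgentId,
    MessageContext,
    RoutedAgent,
    SingleThreadedAgentRuntime,
    message_handler,
)

@dataclass
class Greeting:  # illustrative message type
    content: str

class EchoAgent(RoutedAgent):
    """Answers a Greeting over direct (RPC-style) messaging."""

    def __init__(self) -> None:
        super().__init__("An agent that echoes greetings")

    @message_handler
    async def on_greeting(self, message: Greeting, ctx: MessageContext) -> Greeting:
        return Greeting(content=f"Echo: {message.content}")

async def main() -> None:
    runtime = SingleThreadedAgentRuntime()
    await EchoAgent.register(runtime, "echo", lambda: EchoAgent())
    runtime.start()  # begin processing messages in the background
    reply = await runtime.send_message(Greeting("hello"), AgentId("echo", "default"))
    print(reply.content)
    await runtime.stop_when_idle()

asyncio.run(main())
```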
The AgentChat API is a higher-level abstraction built on top of the Core API. It provides a task-driven framework designed for rapid prototyping of interactive multi-agent applications. AgentChat offers preset agent types and team configurations that implement common multi-agent design patterns.
The primary agent types in AgentChat include:
| Agent type | Description |
|---|---|
| AssistantAgent | Redesigned with async methods. Uses on_messages() or on_messages_stream() instead of send(). Can handle tool calling and execution in a single agent (no separate executor needed). Accepts a tools parameter for direct function registration. Can be equipped with tools, memory modules, and custom system prompts. |
| UserProxyAgent | Simplified to focus purely on collecting user input. Acts as a proxy for a human user, soliciting input during a conversation to enable human-in-the-loop workflows. |
| BaseChatAgent | Abstract base class for building custom agents. Requires implementing on_messages(), on_reset(), and defining produced_message_types. Replaces the register_reply() pattern from 0.2. |
| CodeExecutorAgent | A dedicated agent for running code blocks, separating code execution concerns from other agent behaviors. Executes code generated by other agents in a sandboxed environment. |
One of the most notable simplifications was tool use. In 0.2, tools required registration with separate caller and executor agents. In 0.4, a single AssistantAgent can both decide to call tools and execute them:
```python
assistant = AssistantAgent(
    name="assistant",
    model_client=OpenAIChatCompletionClient(model="gpt-4o"),
    tools=[get_weather, search_database],
)
```
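The entries in tools can be plain Python (optionally async) functions; their type hints and docstrings are used to build the schema shown to the model. A hedged sketch of what get_weather might look like (the body is an illustrative stub):

```python
async def get_weather(city: str) -> str:
    """Return a short weather report for the given city."""
    # Illustrative stub; a real implementation would call a weather API.
    return f"The weather in {city} is 22 °C and sunny."
```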
Model client configuration also changed from dictionary-based llm_config to direct instantiation of typed client objects, improving type safety and developer experience.
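For comparison, a sketch of the two configuration styles (model names and the API key are placeholders):

```python
# AutoGen 0.2: untyped dictionary configuration
llm_config = {"config_list": [{"model": "gpt-4", "api_key": "<your-key>"}]}

# AutoGen 0.4: direct instantiation of a typed client
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o")
```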
In AutoGen 0.4, agents are organized into "teams" that implement specific collaboration patterns. Teams serve as the fundamental building block in the AgentChat API, replacing the GroupChat concept from version 0.2.
| Team type | Selection Method | Best For |
|---|---|---|
| RoundRobinGroupChat | Agents take turns broadcasting messages to all participants in a sequential, cyclic order. No LLM-based selection involved. | Predictable, sequential workflows with 2-4 agents. Maximum auditability and debugging ease. |
| SelectorGroupChat | An LLM selects which agent should speak next after each message, enabling dynamic turn-taking. Uses a ChatCompletion model to determine the next speaker based on conversation context. Supports optional custom selector_func for state-based routing. | Dynamic workflows where the next step depends on findings. Adaptive orchestration for complex tasks. |
| Swarm | The next speaker is determined by handoff messages from the current speaker, allowing agents to delegate control explicitly. Agents use HandoffMessage objects to transfer tasks. No central planner; agents self-organize through handoff signals. | Workflows where agents know when to delegate, similar to OpenAI Swarm patterns. Decentralized coordination. |
| MagenticOneGroupChat | Managed by a MagenticOneOrchestrator that handles planning, delegation, progress tracking, and replanning. | Complex, open-ended tasks requiring web research, file handling, and code execution with minimal configuration. |
All team types share a common context: every agent in a team can see all messages from other team members. Teams also support configurable termination conditions that determine when a conversation should end. All teams support two execution methods: run(), which executes until a termination condition is met and returns the final result, and run_stream(), which yields messages as they are generated in real time. Teams are stateful objects that preserve context between runs unless explicitly reset with the reset() method.
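A minimal sketch of both execution paths, assuming a team object like those above; Console is the AgentChat helper that renders a message stream to stdout:

```python
from autogen_agentchat.ui import Console

# Run to completion and collect the final result.
result = await team.run(task="Summarize the report.")

# Or stream messages to the console as they are produced.
await Console(team.run_stream(task="Summarize the report."))

# Teams preserve context between runs; reset() clears it.
await team.reset()
```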
AutoGen 0.4 provides several built-in termination conditions for controlling when agent conversations stop:
| Condition | Behavior |
|---|---|
| MaxMessageTermination | Stops after a specified number of messages |
| TextMentionTermination | Stops when a specific text string appears in an agent's response |
| TextMessageTermination | Stops when agents produce a TextMessage |
| HandoffTermination | Stops when an agent requests a handoff to a specific target |
| SourceMatchTermination | Stops when a particular agent responds |
| ExternalTermination | Enables programmatic termination from outside the running team, useful for UI-driven workflows |
Multiple termination conditions can be combined using logical AND and OR operators (for example using the bitwise OR operator |): TextMentionTermination("DONE") | MaxMessageTermination(20) would stop the conversation when either condition was met. Termination conditions are stateful but reset automatically between runs.
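With imports, that combination looks like the following sketch:

```python
from autogen_agentchat.conditions import (
    MaxMessageTermination,
    TextMentionTermination,
)

# Stop when an agent says "DONE" or after 20 messages, whichever comes first.
termination = TextMentionTermination("DONE") | MaxMessageTermination(20)
```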
AutoGen 0.4 added built-in state management capabilities that were absent from version 0.2:
| Feature | Details |
|---|---|
| save_state() / load_state() | Built-in methods on agents and teams for serializing and restoring conversation state. Eliminates manual message history management. |
| Checkpoint and resume | Teams can be paused mid-execution and resumed later from the exact point where they stopped. |
| Streaming support | Real-time streaming of agent responses through run_stream(), enabling responsive user interfaces. |
| Memory | Built-in memory capabilities for agents to maintain context across separate interactions and sessions. |
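A hedged sketch of checkpoint and resume, assuming a team from the earlier examples; the 0.4 documentation describes the saved state as JSON-serializable:

```python
import json

# Serialize the team's conversation state to disk...
state = await team.save_state()
with open("team_state.json", "w") as f:
    json.dump(state, f)

# ...and restore it later, resuming from the same point.
with open("team_state.json") as f:
    state = json.load(f)
await team.load_state(state)
```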
AutoGen 0.4 expanded beyond Python to include .NET support, enabling interoperability between agents built in different programming languages. This was a significant addition for enterprise environments where applications span multiple technology stacks.
AutoGen 0.2, the original version of the framework, introduced the core concepts that made multi-agent conversation systems accessible to developers. While superseded by version 0.4, many of AutoGen 0.2's ideas remain influential in the broader ecosystem, and the AG2 fork continues to maintain this architecture.
The ConversableAgent class was the central abstraction in AutoGen 0.2. It represented a generic agent capable of conversing with other agents through message exchange. After receiving each message, a ConversableAgent would automatically send a reply to the sender unless the message was a termination signal. Developers could customize agent behavior by registering reply functions using the register_reply() method.
AutoGen 0.2 provided two specialized subclasses:
| Agent Type | Description |
|---|---|
| ConversableAgent | The base class for all agents in AutoGen 0.2. Manages message exchange, conversation history, and reply generation through a configurable chain of reply functions. |
| AssistantAgent | A subclass of ConversableAgent designed to act as an AI-powered assistant. Uses LLMs by default to generate responses. Could write and suggest code for the user to execute. Did not require human input or code execution by default. |
| UserProxyAgent | A proxy for human participation that could solicit human input, execute code, or invoke tools. Served as the bridge between human users and the AI agents. Configured with human_input_mode to control when human input was requested. |
The ConversableAgent employed a reply function chain to determine how to respond to incoming messages. By default, the following functions were checked in order:
1. check_termination_and_human_reply: checks whether the conversation should end or whether human input is needed
2. generate_function_call_reply: handles legacy function call responses
3. generate_tool_calls_reply: handles tool call responses from the LLM
4. generate_code_execution_reply: extracts and executes code blocks from messages
5. generate_oai_reply: generates a response using the configured LLM

When a function returned final=False, the next function in the chain was checked. This chain-of-responsibility pattern gave developers fine-grained control over agent behavior.
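A hedged sketch of hooking into this chain in 0.2, assuming an existing ConversableAgent named agent; print_messages is an illustrative custom reply function:

```python
import autogen

def print_messages(recipient, messages, sender, config):
    """Log each incoming message, then fall through to the next reply function."""
    print(f"{sender.name} -> {recipient.name}: {messages[-1].get('content')}")
    return False, None  # final=False: continue down the reply chain

agent.register_reply([autogen.Agent, None], reply_func=print_messages, config=None)
```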
AutoGen 0.2 supported four primary conversation patterns, each suited to different use cases:
| Pattern | Description | Best For |
|---|---|---|
| Two-Agent Chat | The simplest pattern where two agents communicate directly. One agent calls initiate_chat() on another with an initial message. The conversation continues until a termination condition is met. Results are captured in a ChatResult object containing chat history, summary, and token costs. | Simple question-answering, code generation with review, brainstorming between two perspectives |
| Sequential Chat | A sequence of two-agent chats chained together by a carryover mechanism. Summaries from earlier chats are automatically passed as context into subsequent conversations using initiate_chats(). | Multi-step tasks where each stage depends on the previous one, such as research followed by writing followed by editing |
| Group Chat | Multiple agents participate in a shared conversation thread orchestrated by a GroupChatManager. The manager selects which agent should speak next based on configurable strategies: auto (LLM-based selection), round_robin, random, or manual. Supports speaker transition constraints to control which agents can follow others. | Complex collaborative tasks requiring diverse expertise, debates, multi-perspective analysis |
| Nested Chat | Packages an entire multi-agent workflow into a single agent using register_nested_chats(). When triggered by an incoming message, the agent internally runs a sequence of chats and returns the final summary as its response. | Encapsulating complex sub-workflows as reusable components, hierarchical task decomposition |
The Group Chat pattern was AutoGen 0.2's most distinctive feature. Using a GroupChatManager, multiple agents could participate in a shared conversation, with the manager deciding which agent should speak next based on the conversation context. This enabled complex multi-agent workflows where, for example, a researcher agent, a coder agent, and a critic agent could collaborate on a task through natural conversation. The GroupChatManager supported several speaker selection strategies, including round-robin ordering, random selection, and LLM-based selection where an LLM chose the most appropriate next speaker based on conversation context.
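A hedged sketch of the pattern with the 0.2 API, assuming researcher, coder, critic, and user_proxy agents and an llm_config defined elsewhere:

```python
import autogen

groupchat = autogen.GroupChat(
    agents=[researcher, coder, critic],
    messages=[],
    max_round=12,
    speaker_selection_method="auto",  # an LLM chooses the next speaker
)
manager = autogen.GroupChatManager(groupchat=groupchat, llm_config=llm_config)
user_proxy.initiate_chat(manager, message="Compare these two sorting algorithms.")
```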
AutoGen 0.2 implemented tool use through a two-step registration process. A tool (function) had to be registered with at least two agents to be useful: one agent registered with the tool's signature through register_for_llm (allowing it to suggest calling the tool), and another agent registered with the tool's function object through register_for_execution (allowing it to actually run the tool). The autogen.register_function convenience method could register a tool with both agents simultaneously.
Tool functions required type hints for all arguments and return values, as these provided the LLM with information about the tool's expected inputs and outputs. This design separated the "thinking" (deciding to call a tool) from the "doing" (executing the tool), which was useful for safety and auditability but added complexity compared to later frameworks.
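A hedged sketch of the two-sided registration, assuming assistant and user_proxy agents from a 0.2 setup; get_exchange_rate is an illustrative stub:

```python
from typing import Annotated

import autogen

def get_exchange_rate(
    base: Annotated[str, "Base currency symbol"],
    quote: Annotated[str, "Quote currency symbol"],
) -> str:
    return f"1 {base} = 0.92 {quote}"  # illustrative stub

# One call registers the signature with the caller (which may suggest the tool)
# and the implementation with the executor (which actually runs it).
autogen.register_function(
    get_exchange_rate,
    caller=assistant,
    executor=user_proxy,
    description="Look up the exchange rate between two currencies.",
)
```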
A distinctive feature of AutoGen from its earliest versions was built-in support for code execution, one of AutoGen 0.2's most powerful capabilities. In a typical workflow, an AssistantAgent would generate code (often Python), and a UserProxyAgent would automatically extract the code blocks from its messages and execute them in a sandboxed environment, returning the output to the conversation. This created a feedback loop in which the AssistantAgent could debug and refine its code based on execution results. The framework supported multiple execution environments:
| Environment | Description |
|---|---|
| Local execution | Code runs directly on the host machine. Simplest to set up but least secure, as generated code has full access to the local system. |
| Docker containers | Code runs in isolated Docker containers. Starting with version 0.2.8, Docker execution became the default setting for security. Each code block launches a container, executes, and then terminates. |
| Azure Container Apps | Managed, serverless code execution in the cloud. Azure Container Apps Dynamic Sessions provide ephemeral, Hyper-V isolated sandboxed environments that spin up in milliseconds. |
The code_execution_config parameter controlled execution behavior, including the working directory, whether to use Docker, the Docker image to use, and the timeout for execution. The last_n_messages parameter determined how many previous messages to scan for code blocks, with 'auto' scanning all messages since the agent last spoke.
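Putting those parameters together, a hedged 0.2 configuration sketch:

```python
import autogen

user_proxy = autogen.UserProxyAgent(
    "user_proxy",
    human_input_mode="NEVER",
    code_execution_config={
        "work_dir": "coding",       # directory for generated files
        "use_docker": True,         # the default since 0.2.8
        "timeout": 60,              # seconds before execution is aborted
        "last_n_messages": "auto",  # scan messages since the agent last spoke
    },
)
```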
AutoGen 0.2 gained rapid adoption, particularly in academic and research settings. Its conversation-driven approach was intuitive for building prototypes and exploring multi-agent dynamics. However, users encountered limitations in production environments, including challenges with state management, scalability, error handling, and debugging complex multi-agent interactions.
AutoGen 0.4 introduced asynchronous messaging as a core architectural principle. Agents communicate through async messages, supporting both event-driven and request/response interaction patterns. This design enables more scalable systems compared to the synchronous conversation model in version 0.2.
The framework is designed around pluggable components. Developers can swap out or extend individual pieces, including custom agents, tools, memory modules, and model clients, without modifying the framework's core. This modularity makes it possible to build proactive, long-running agents that go beyond simple request-response patterns.
AutoGen 0.4 provides a unified interface for working with different LLM providers through model clients:
| Model client | Provider |
|---|---|
| OpenAIChatCompletionClient | OpenAI models and OpenAI-compatible APIs (including Gemini) |
| AzureOpenAIChatCompletionClient | Azure OpenAI Service models |
| AnthropicChatCompletionClient | Anthropic Claude models (experimental) |
| AzureAIChatCompletionClient | GitHub Models and Azure-hosted models |
| OllamaChatCompletionClient | Local models via Ollama (experimental) |
Different agents within the same team can use different models, enabling cost optimization. For example, a routing agent might use a smaller, cheaper model for deciding which specialist to invoke, while the specialist agents use more capable (and expensive) models for complex reasoning.
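A hedged sketch of that split, assuming billing_agent and support_agent are AssistantAgents built on the more capable client:

```python
from autogen_agentchat.teams import SelectorGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

cheap = OpenAIChatCompletionClient(model="gpt-4o-mini")  # routing only
capable = OpenAIChatCompletionClient(model="gpt-4o")     # specialist reasoning

team = SelectorGroupChat(
    participants=[billing_agent, support_agent],
    model_client=cheap,  # the selector just picks the next speaker
)
```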
AutoGen provides built-in support for human-in-the-loop workflows through the UserProxyAgent. When a team calls the UserProxyAgent during execution, it pauses the conversation and solicits input from a human user before continuing. This allows humans to provide feedback, correct agent behavior, approve actions, or inject domain knowledge at critical points in a workflow.
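In 0.4 this takes one line; a sketch assuming the AgentChat UserProxyAgent, whose input_func falls back to console input:

```python
from autogen_agentchat.agents import UserProxyAgent

# Pauses the run and prompts on the console whenever the team calls this agent.
user = UserProxyAgent("user_proxy", input_func=input)
# Add `user` to a team's participants to enable human-in-the-loop turns.
```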
AutoGen 0.4 includes stronger observability features compared to earlier versions. Developers can view agent action streams in real time, inspect message flows between agents, and use mid-execution controls to pause conversations, redirect agent actions, or adjust team composition while a workflow is running.
AutoGen Studio is a low-code, web-based interface for building, testing, and sharing multi-agent workflows. Originally introduced as a research prototype by Microsoft Research alongside AutoGen 0.2, AutoGen Studio was rebuilt on the AutoGen 0.4 AgentChat API in January 2025, incorporating the new event-driven architecture and providing a significantly improved developer experience.
The tool provides a declarative, JSON-based specification format for defining agents and teams, along with a drag-and-drop visual interface for composing workflows.
| Feature | Description |
|---|---|
| Visual team builder | Drag-and-drop interface for creating agent teams, adding tools, and defining relationships between agents. Components can be visually wired together to define workflows. |
| JSON configuration | Teams can be defined through direct JSON editing for users who prefer text-based configuration or need precise control over parameters. |
| Real-time monitoring | Live visualization of agent message flows with asynchronous, event-driven updates. Shows which agent is speaking and what actions are being taken as they happen. |
| Mid-execution control | Ability to pause conversations, redirect agent actions, and adjust team composition while a workflow is running. Tasks can be seamlessly resumed after intervention. |
| Interactive user input | UserProxyAgent integration allows users to provide input and guidance during team runs in real time through the UI. |
| Message flow visualization | Intuitive interface that maps message paths and dependencies between agents, helping developers understand and debug complex interactions. |
| Agent library and gallery | A library of pre-defined agents that can be composed into teams, plus a gallery of reusable agent components. Supports importing custom agents, tools, and workflows from external component galleries. |
| Export and deployment | Workflows can be exported as JSON files and deployed as APIs. Exported configurations can be wrapped into Dockerfiles for deployment on cloud services like Azure Container Apps. |
AutoGen Studio is described as a prototyping and debugging environment rather than a production-ready application: a tool for exploring multi-agent designs and understanding how agents interact during execution. Microsoft recommends using it for rapid experimentation and then building production applications with the AutoGen framework directly, implementing authentication, security, and deployment features separately.
A dedicated paper on AutoGen Studio, "AutoGen Studio: A No-Code Developer Tool for Building and Debugging Multi-Agent Systems," was published at EMNLP 2024 (arXiv:2408.15247).
Magentic-One is a generalist multi-agent system built on AutoGen for solving complex, open-ended tasks involving web browsing and file manipulation. Released by Microsoft Research in November 2024, it demonstrates how AutoGen's multi-agent architecture can be applied to challenging real-world benchmarks. The paper was authored by Adam Fourney, Gagan Bansal, Hussein Mozannar, Cheng Tan, and 15 other researchers at Microsoft.
Magentic-One uses a hierarchical architecture centered on an Orchestrator agent that coordinates four specialized agents:
| Agent | Role |
|---|---|
| Orchestrator | The lead agent that creates plans, delegates tasks to other agents, tracks progress toward goals, and dynamically revises plans as needed. It decides which agent to invoke at each step and handles error recovery. Uses an outer loop for high-level planning and an inner loop for step-by-step execution. |
| WebSurfer | Handles browser-based tasks such as navigating websites, clicking links, filling forms, and extracting information from web pages. Built on Playwright for browser automation. |
| FileSurfer | Manages file-related operations including reading documents, navigating directories, and processing local files in various formats. |
| Coder | Writes and analyzes code to create solutions, generate scripts, and process data. Focuses on code generation and analysis without direct execution capability. |
| ComputerTerminal | Executes code and shell commands, performing system-level operations and running programs generated by the Coder. Provides the runtime environment for code execution. |
The Orchestrator uses a structured planning approach: it creates an initial plan based on the task description, assigns sub-tasks to specialized agents, monitors their progress, and replans when agents encounter obstacles or produce unexpected results. This hierarchical delegation pattern is a key differentiator from flat group chat approaches, as it provides more structured control over complex workflows.
Magentic-One achieved competitive performance on several multi-agent benchmarks when evaluated with GPT-4o and o1-preview as its backbone models:
| Benchmark | Score | Notes |
|---|---|---|
| GAIA | ~38% task completion | Statistically competitive with state-of-the-art methods at time of publication |
| AssistantBench | ~27.7% accuracy | Competitive performance on assistant-oriented tasks |
| WebArena (validation) | 35.1% (148/422 tasks) | Web navigation and interaction benchmark |
| WebArena (test) | 30.5% (119/390 tasks) | Slightly lower on the held-out test set; comparable to most SOTA methods except WebPilot and Jace.AI |
| WebArena (overall) | 32.8% | Results statistically comparable to previous state-of-the-art methods at the time of release |
Its modular design means that individual agents can be upgraded independently (for example, replacing the underlying LLM for the Coder agent with a newer model) without restructuring the entire system. The paper demonstrated that Magentic-One achieved these results "without modification to core agent capabilities or to how they collaborate," showing progress toward truly generalist agentic systems. The system was published as a research paper (arXiv:2411.04468) and released as an open-source package within the AutoGen repository.
When AutoGen 0.4 shipped in January 2025, Magentic-One was ported from the lower-level autogen-core library to the higher-level autogen-agentchat API, making it available as the MagenticOneGroupChat team type for easier use by developers.
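A hedged sketch of invoking it through AgentChat, assuming participant agents (for example a web surfer and a coder) are constructed separately:

```python
from autogen_agentchat.teams import MagenticOneGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

client = OpenAIChatCompletionClient(model="gpt-4o")
team = MagenticOneGroupChat(participants=[web_surfer, coder], model_client=client)
result = await team.run(task="Find and summarize the latest AutoGen release notes.")
```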
AutoGen 0.4's core architecture is based on the actor model, a well-established pattern in distributed systems where each agent is an independent "actor" that communicates with other actors exclusively through asynchronous messages. This design provides:
| Property | Benefit |
|---|---|
| Natural concurrency | Agents can process messages independently and in parallel |
| Fault isolation | A failure in one agent does not crash others |
| Scalability | Agents can be distributed across multiple processes or machines |
| Location transparency | Agents communicate the same way whether local or distributed |
The Core layer supports two communication patterns: direct messaging (request/response) and topic-based publish/subscribe for event-driven workflows. Agents register with a runtime that handles message routing, serialization, and lifecycle management.
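Complementing the direct-messaging sketch earlier, here is a hedged pub/sub sketch with autogen-core; TaskCreated is an illustrative event type:

```python
from dataclasses import dataclass

from autogen_core import (
    DefaultTopicId,
    MessageContext,
    RoutedAgent,
    default_subscription,
    message_handler,
)

@dataclass
class TaskCreated:  # illustrative event type
    description: str

@default_subscription
class Worker(RoutedAgent):
    """Receives every TaskCreated event published to the default topic."""

    def __init__(self) -> None:
        super().__init__("A worker agent")

    @message_handler
    async def on_task(self, message: TaskCreated, ctx: MessageContext) -> None:
        print(f"Worker picked up: {message.description}")

# Publisher side, with a runtime set up as in the earlier Core example:
# await runtime.publish_message(TaskCreated("index the corpus"), DefaultTopicId())
```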
AutoGen supports a wide range of LLM providers through its model client abstraction:
| Provider | Details |
|---|---|
| OpenAI | GPT-4o, GPT-4, GPT-3.5 Turbo, o1, o3 series |
| Azure OpenAI | Enterprise-managed OpenAI models with Azure security and compliance |
| Anthropic | Claude models (Claude 3.5, Claude 3) |
| Gemini models | |
| Local models | Ollama, LM Studio, and other local inference servers via OpenAI-compatible APIs |
AutoGen provides multiple options for executing code generated by agents, with increasing levels of isolation and security:
| Environment | Security Level | Description |
|---|---|---|
| Local execution | Low | Code runs directly on the host machine. Simplest to set up but provides no isolation from the host system. |
| Docker containers | Medium | Code runs in isolated Docker containers. Default since AutoGen 0.2.8. Each execution launches a container, runs the code, and terminates. |
| Azure Container Apps Dynamic Sessions | High | Managed, serverless, Hyper-V isolated sandboxed environments on Azure. Spin up in milliseconds. Provide a Jupyter environment with pre-installed packages. Recommended for production deployments. |
The CodeExecutorAgent in AutoGen 0.4 provides a dedicated agent type for code execution, separating this concern from the AssistantAgent that generates code. This separation improves security by allowing different trust and isolation levels for code generation versus code execution.
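A hedged sketch wiring the Docker executor to a CodeExecutorAgent, assuming Docker is running locally:

```python
from autogen_agentchat.agents import CodeExecutorAgent
from autogen_ext.code_executors.docker import DockerCommandLineCodeExecutor

executor = DockerCommandLineCodeExecutor(work_dir="coding")
await executor.start()  # launches the container

# Pair with a code-writing AssistantAgent in a team; this agent only executes.
code_agent = CodeExecutorAgent("code_executor", code_executor=executor)

await executor.stop()  # tear the container down when finished
```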
AutoGen integrates with a broad ecosystem of tools and services:
| Integration | Description |
|---|---|
| Web browsing | Playwright-based agents for navigating websites, filling forms, and extracting content |
| File handling | Reading, writing, and processing documents in various formats |
| Database access | Connecting to databases for querying and data manipulation |
| REST APIs | Calling external services and APIs |
| LangChain tools | Interoperability with LangChain's tool ecosystem |
| Azure services | Native integration with Azure OpenAI, Azure Container Apps, and other Microsoft cloud services |
The Extensions layer provides pre-built integrations with popular services and can be extended with custom tools written as Python functions with type annotations.
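A hedged sketch of such a custom tool; lookup_order is an illustrative stub, and FunctionTool derives the LLM-facing schema from the type hints and docstring:

```python
from autogen_core.tools import FunctionTool

async def lookup_order(order_id: str) -> str:
    """Return the status of an order (illustrative stub)."""
    return f"Order {order_id}: shipped"

order_tool = FunctionTool(lookup_order, description="Look up an order's status.")
```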
In October 2025, Microsoft announced the Microsoft Agent Framework, a new open-source platform that merges AutoGen's multi-agent orchestration capabilities with Semantic Kernel's enterprise-grade features. This announcement represented the most significant change in AutoGen's trajectory since its creation.
Microsoft had maintained two parallel agent-building frameworks: AutoGen (focused on multi-agent research and prototyping) and Semantic Kernel (focused on enterprise AI integration with robust state management, telemetry, and middleware). The existence of two overlapping frameworks created confusion for developers about which to use. The Microsoft Agent Framework resolved this by combining the strengths of both into a single, unified SDK.
The Microsoft Agent Framework combines AutoGen's simple agent abstractions (like ChatAgent) with Semantic Kernel's enterprise features, including session-based state management, type safety, filters, telemetry, and extensive model and embedding support. Beyond merging the two, Agent Framework introduces graph-based workflows for explicit multi-agent orchestration and a robust state management system for long-running and human-in-the-loop scenarios.
Key migration mappings from AutoGen to the Microsoft Agent Framework:
| AutoGen Concept | Microsoft Agent Framework Equivalent |
|---|---|
| AssistantAgent | ChatAgent |
| GroupChat / Teams | Workflow orchestrations (graph-based) |
| Tool registration (tools parameter) | Tool abstractions with enhanced typing |
| Message passing | Enhanced messaging with checkpointing and durability |
| autogen-core actor model | Integrated agent runtime with session management |
| Model clients | Expanded model and embedding provider support |
| Date | Milestone |
|---|---|
| October 1, 2025 | Microsoft Agent Framework enters public preview |
| October 2025 | AutoGen and Semantic Kernel enter maintenance mode |
| February 19, 2026 | Release Candidate 1.0 published for both Python and .NET |
| Q1 2026 (end) | General Availability 1.0 targeted |
| Q2 2026 | Process Framework GA planned |
Both AutoGen and Semantic Kernel continue to receive bug fixes and security patches but no new feature investments. Microsoft recommends that teams migrate to the Agent Framework within 6 to 12 months. A detailed migration guide is available on Microsoft Learn.
AutoGen has been applied across a variety of domains. The original paper and subsequent research demonstrated effectiveness in several areas:
| Domain | Application |
|---|---|
| Mathematics | Multi-agent conversations for solving math problems, with agents debating solutions and verifying each other's work |
| Code generation | Automated code writing, execution, and debugging through iterative agent collaboration |
| Retrieval-augmented generation | RetrieveChat combines retrieval-augmented agents with code generation for question answering over custom documents |
| Question answering | Multi-agent systems that decompose complex questions and synthesize answers from multiple perspectives |
| Supply chain optimization | Agents collaborating to model and solve complex optimization problems |
| Online decision-making | Agents operating in interactive text-based environments, making sequential decisions |
| Research assistance | Multi-agent teams that perform literature review, data analysis, and report generation. Agents can divide work by searching different databases, then collaboratively summarize and cross-reference results. |
| Software development | Coder and reviewer agents that write, test, and refine code collaboratively. The coder generates solutions while the reviewer identifies bugs, suggests improvements, and verifies correctness. |
| Data analysis | Agent teams that clean data, run statistical analyses, generate visualizations, and produce reports. Code execution capabilities allow agents to work directly with datasets. |
| Customer service | Conversational agents with specialized knowledge domains handling customer inquiries. Different agents can handle billing, technical support, and product questions, with a router agent directing customers to the right specialist. |
| Education | Tutor agents that adapt to student needs and provide personalized instruction. Multiple agents can take on roles such as instructor, quiz master, and study partner. |
| Web automation | Magentic-One-style teams that navigate websites, extract information, fill forms, and complete online tasks autonomously. |
| Content creation | Writer, editor, and fact-checker agents that collaborate to produce articles, reports, and documentation with built-in quality control. |
| Financial analysis | Agent teams that gather market data, analyze trends, generate forecasts, and produce investment summaries using code execution for quantitative analysis. |
The framework's conversational design also enables interactive retrieval in RetrieveChat: when retrieved context lacks needed information, an LLM-based assistant can request additional retrieval attempts rather than terminating the conversation, allowing iterative refinement of search results.
AutoGen (and its successor, the Microsoft Agent Framework) competes primarily with CrewAI and LangGraph (part of the LangChain ecosystem) in the multi-agent framework space.
| Dimension | AutoGen / Microsoft Agent Framework | CrewAI | LangGraph |
|---|---|---|---|
| Core paradigm | Conversation-driven agents; evolving to graph-based workflows | Role-based teams with YAML configuration | Graph-based state machines |
| Architecture | Layered (Core + AgentChat + Extensions); agents collaborate through structured dialogue | Crew-Task-Agent hierarchy; agents modeled as team members with specific responsibilities | Nodes and edges with typed state; agent interactions defined as nodes in a directed graph |
| Design philosophy | Natural language interactions with dynamic role adaptation | Organizational metaphor with clear role assignments | Stateful workflows with explicit control flow |
| Execution model | Asynchronous, event-driven message passing | Sequential, hierarchical, or parallel processes | Directed graph with conditional branching |
| Human-in-the-loop | First-class support; configurable at any point in conversation; built-in UserProxyAgent | Supported through agent delegation and agent configuration | Supported through interrupt nodes and breakpoints |
| Code execution | Built-in Docker and Azure sandbox execution | Supported through tool integrations | Supported through custom nodes |
| State management | save_state/load_state on agents and teams | Task-level state within crew execution | First-class graph state with persistence |
| Visual builder / Low-code tooling | AutoGen Studio (drag-and-drop interface) | CrewAI Enterprise dashboard; YAML configuration files | LangGraph Studio for visualization |
| Enterprise backing | Microsoft (integrated with Azure, Semantic Kernel) | Independent startup (CrewAI Inc.) | LangChain Inc. |
| Language support | Python, .NET | Python | Python, JavaScript/TypeScript |
| Framework | Primary Strengths | Best Suited For |
|---|---|---|
| AutoGen | Diverse conversation patterns (round-robin, selector, swarm, magentic-one). Deep Microsoft ecosystem integration. No-code Studio option. Cross-language support (.NET). Transitioning to enterprise-grade Agent Framework. Group decision-making, debate scenarios, flexible conversation patterns. | Conversational tasks, research prototyping, enterprise applications on Azure, scenarios requiring multi-party agent discussion and debate |
| CrewAI | Intuitive role-based syntax readable by non-engineers. YAML configuration. Fastest setup time. Growing Agent-to-Agent (A2A) protocol support. Quick setup for business workflow automation, intuitive role-based modeling. | Structured business workflows, role-based task delegation, teams where readability and fast prototyping matter |
| LangGraph | Explicit graph-based state machines. Durable execution. Built-in retry logic and error recovery. Tight integration with LangSmith for observability. Fine-grained state management, complex conditional logic, production durability. | Complex decision pipelines, stateful applications requiring persistence, workflows with many conditional branches and loops |
CrewAI is generally considered the most beginner-friendly of the three, requiring minimal boilerplate code and offering YAML-based configuration files. AutoGen falls in the middle: its two API layers (Core and AgentChat) offer different abstraction levels, and its Pythonic agent classes are straightforward to use but require understanding the framework's agent and team abstractions. LangGraph has the steepest learning curve because developers need to understand graph concepts such as nodes, edges, and state schemas.
The choice between frameworks depends on the specific use case. AutoGen is particularly well suited for conversational multi-agent systems where agents need to engage in flexible, back-and-forth dialogue. CrewAI works well for structured business workflows where roles are clearly defined. LangGraph is a strong choice for applications requiring precise state management and complex branching logic, especially for teams already using LangChain.
| Framework | Status as of Early 2026 |
|---|---|
| AutoGen | Maintenance mode. Bug fixes and security patches only. New features developed in Microsoft Agent Framework. |
| CrewAI | Active development. Continues to add features and expand its ecosystem. |
| LangGraph | Active development. Integrated with LangSmith for tracing, evaluation, and monitoring. |
| Microsoft Agent Framework | RC 1.0 published February 2026. GA targeted end of Q1 2026. Combines AutoGen and Semantic Kernel. |
AutoGen 0.4 requires Python 3.10 or later. The framework can be installed using pip:
```bash
pip install -U "autogen-agentchat" "autogen-ext[openai]"
```
A basic example using the AgentChat API:
```python
import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.messages import TextMessage
from autogen_core import CancellationToken
from autogen_ext.models.openai import OpenAIChatCompletionClient

async def main():
    model_client = OpenAIChatCompletionClient(model="gpt-4o")
    agent = AssistantAgent(
        name="assistant",
        model_client=model_client,
        system_message="You are a helpful assistant.",
    )
    response = await agent.on_messages(
        [TextMessage(content="What is AutoGen?", source="user")],
        cancellation_token=CancellationToken(),
    )
    print(response.chat_message.content)

asyncio.run(main())
```
For team-based workflows, agents can be composed into teams:
```python
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import RoundRobinGroupChat

# agent_1 and agent_2 are AssistantAgent instances defined as above.
team = RoundRobinGroupChat(
    participants=[agent_1, agent_2],
    termination_condition=MaxMessageTermination(max_messages=10),
)
result = await team.run(task="Analyze this dataset and write a summary.")
```
AutoGen has had a significant impact on the agentic AI landscape since its release, building one of the largest communities in the multi-agent AI space. Key metrics as of early 2026:
| Metric | Value |
|---|---|
| GitHub stars | Over 54,600 |
| Monthly downloads | Over 856,000 |
| GitHub commits | Over 3,776 |
| Contributors | Over 559 |
| Releases | 98 |
| Issues resolved | Over 2,488 |
| Research paper citations | Over 850 (Semantic Scholar) |
The AutoGen paper has been widely cited in academic research on multi-agent systems, LLM agents, and agentic workflows. The framework's concepts, particularly the idea of multi-agent conversation as a programming paradigm, influenced the design of subsequent agent frameworks and contributed to broader industry interest in multi-agent AI systems.
The framework has been particularly popular in academic and research settings, where its conversation-driven approach and built-in code execution make it well-suited for experimental and exploratory work. Enterprise adoption has been growing, particularly among organizations already using Microsoft's cloud platform, and the transition to the Microsoft Agent Framework is expected to accelerate this trend. AutoGen's concepts, including conversable agents, group chat, and human-in-the-loop patterns, have influenced virtually every multi-agent framework that followed. Its pioneering role in demonstrating that complex tasks could be solved through structured multi-agent conversation helped establish an entire category of AI tooling.
AutoGen was recognized by multiple awards and accolades, including the Best Paper award at the LLM Agents Workshop at ICLR 2024, selection into the Open100 top 100 open-source achievements, and being named one of the five best AI papers of 2023 by TheSequence.
As of early 2026, the AutoGen ecosystem exists in several forms:
- Microsoft AutoGen: the microsoft/autogen repository, now in maintenance mode, receiving bug fixes and security patches only. New development has shifted to the Microsoft Agent Framework.
- AG2: the community fork maintained by AG2AI under the autogen, pyautogen, and ag2 package names, continuing the 0.2 architecture under community governance.
- Microsoft Agent Framework: the successor platform, combining AutoGen's orchestration concepts with Semantic Kernel's enterprise features.

Magentic-One continues to serve as a reference implementation and has been updated to work with the Microsoft Agent Framework. AutoGen Studio has been adapted to the new framework, maintaining its visual building capabilities while gaining access to the broader feature set. The Magentic orchestration pattern is available as a first-class orchestration option in the Agent Framework.
The multi-agent framework landscape remains highly competitive, with CrewAI, LangGraph, OpenAI's Agents SDK, Google's Agent Development Kit, and Amazon's Strands Agents all competing for developer adoption. AutoGen's legacy as one of the pioneering frameworks in this space is secure, and its future continues through the Microsoft Agent Framework ecosystem.