Open Interpreter
Last reviewed
May 20, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 ยท 4,407 words
Improve this article
Add missing citations, update stale details, or suggest a clearer explanation.
Last reviewed
May 20, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 ยท 4,407 words
Add missing citations, update stale details, or suggest a clearer explanation.
Open Interpreter is an open-source desktop agent, distributed as a Python command-line tool and library, that lets a large language model write and execute code locally on the user's machine. Created by Killian Lucas and first released to PyPI in mid-2023, the project provides a ChatGPT-like terminal interface in which the model can run Python, JavaScript, Shell, and several other languages with explicit user confirmation before each code block runs.[^1][^2] It is licensed under the GNU Affero General Public License v3.0 and is one of the most-starred AI agent projects on GitHub, with the main repository accumulating tens of thousands of stars in its first months.[^1][^3] Open Interpreter was conceived as an unrestricted local alternative to OpenAI's hosted ChatGPT Code Interpreter (now Advanced Data Analysis), removing limits on file size, runtime, packages, and network access.[^4][^5]
| Field | Value |
|---|---|
| Type | Desktop AI agent / code-execution loop |
| Creator | Killian Lucas |
| Organization | Open Interpreter (open-source project) |
| First public release | July 2023 (PyPI), launch announced September 6, 2023[^2][^6] |
| Language | Python (98% of repository)[^1] |
| License | AGPL-3.0 (with commercial license available) |
| Repository | github.com/OpenInterpreter/open-interpreter |
| Default model | GPT-4o via the OpenAI API[^7] |
| Supported runtimes | Python, JavaScript, Shell, AppleScript, HTML, R, PowerShell, Ruby, Java, React[^8] |
| Notable subprojects | The 01 (voice interface for desktops, mobile, and ESP32) |
| Documentation | docs.openinterpreter.com |
The project was started by Killian Lucas in mid-2023, motivated by the limitations of OpenAI's hosted Code Interpreter product. OpenAI had released ChatGPT Code Interpreter (later renamed Advanced Data Analysis) in July 2023, providing ChatGPT with a sandboxed Python environment for data analysis tasks.[^4] The hosted product was constrained: no general internet access, a fixed 100 MB upload limit, a 120-second runtime ceiling per cell, and a curated set of pre-installed libraries.[^5] Lucas built Open Interpreter to remove those constraints by running the loop locally on a user's own machine, where the agent could access the file system, the network, and any library a developer could install.[^5]
The first PyPI release of the open-interpreter package appeared in July 2023, with very early versions in the 0.0.x series being published within days of the initial commit.[^2] A wider public launch followed on September 6, 2023, when Lucas posted a Show HN thread titled "Open Interpreter, CodeLlama in your terminal, executing code" on Hacker News and a launch tweet from his personal account.[^6][^9] Within days the repository became the top-trending repository on GitHub, and a ThursdAI interview noted it reached roughly 23,000 stars in its first week.[^10] Coverage on Simon Willison's blog and elsewhere positioned it as "the open-source ChatGPT Code Interpreter for the terminal," highlighting its capacity to summarize PDFs, plot datasets, and control a local browser through generated Python.[^9][^7]
The first significant architectural revision was the September 26, 2023 Generator Update, which rebuilt the core message-and-execution loop on top of Python generators so that the package could stream tokens, observations, and execution results, making it easier to embed the agent inside other applications.[^11] On October 17, 2023, the Local II update added first-class support for users routing requests through local model servers such as Ollama, LM Studio, LlamaFile, and Jan.ai, and broadened the catalog of providers that the agent could call.[^11] On November 15, 2023, the Vision I update introduced multimodal input so that screenshots, image attachments, and visual feedback could be passed to vision-capable models such as GPT-4V.[^11]
On January 5, 2024, the team published "The New Computer Update I," which the blog described as the most significant rewrite since version 0.1.0. The update introduced the Computer API, "a standard interface between language models and computers," along with the OS mode flag --os that lets the model graphically control a desktop.[^12] The Computer API exposed methods such as computer.display.view() for taking screenshots, computer.mouse.click() and computer.mouse.move() for interaction with on-screen elements, and computer.clipboard.view() for clipboard access.[^12] A hosted preview of the API briefly ran at api.openinterpreter.com. The update also separated the Computer module from the core, allowing the execution backend to be used independently of the agent loop, and introduced LMC (Language Model Computer) Messages, a message format with a dedicated computer role distinct from user, assistant, and system.[^12] A follow-on New Computer Update II on March 10, 2024 expanded the Computer API, accelerated launch speed roughly fivefold according to the changelog, and bundled a smaller GUI-understanding local vision model.[^11]
On March 22, 2024, Lucas and the team announced the 01 Developer Preview, an open-source platform "for the next generation of AI devices." The flagship 01 Light was described as "a portable voice interface that controls your home computer. It can see your screen, use your apps, and learn new skills," and was offered at a US$99 price point with build-your-own schematics available at openinterpreter.com/01.[^13][^14] The companion 01 repository contained server, mobile, ESP32, and desktop client code, with the codebase roughly evenly split between Python, C++, and Swift.[^15]
On June 18, 2024, Open Interpreter released Local III, which the team framed as their largest step toward purely offline agents. Local III introduced a free hosted endpoint serving Llama 3-70B available with interpreter --model i, recommended Codestral and Llama 3 as the most reliable open models for local use, used the Moondream tiny vision model to render local images as descriptions plus OCR for text-only LLMs, and added an open-source Point model so local OS-mode agents could click icons identified by name rather than fixed coordinates.[^16] The team stated they would collect opt-in conversations with the hosted i endpoint, strip personally identifying material, and use them to train an open model and release the dataset.[^16]
On September 9, 2024, in a blog post titled "It should have been an app," Open Interpreter announced that it was canceling production of the 01 Light hardware, refunding all pre-orders, and pivoting to a free iOS and Android app that delivered the same push-to-talk voice control over a user's Mac, Windows, or Linux machine.[^17] The team, described as five people without prior hardware experience, concluded that the ESP32-class hardware could not match the performance, security, and global availability of modern smartphones, and that their core value lay in software.[^17] The 01 codebase remained AGPL-3.0 open source, complete schematics and design files were published, and third-party manufacturers, including a team called Human Devices, began producing compatible developer kits.[^17][^15]
The October 24, 2024 release line (versions 0.4.0 through 0.4.2) added a redesigned interpreter --os mode powered by Anthropic Computer Use tools, plus screenpipe and dynamic-tools demos.[^3] Version 0.4.3 followed on October 26, 2024 as the latest stable PyPI release at the time of writing, with Python 3.9 through 3.12 supported.[^2] The team has publicly previewed a more substantial 1.0 rewrite on the development branch, but no 1.0 stable tag had been published to PyPI as of late 2024 in the project's release history.[^3][^2]
Open Interpreter wraps a chat-style model around a code-execution sandbox. At each turn the user types a natural-language instruction; the model, which has been instructed via a system message that it controls a real machine, decides whether to reply in text or to emit a fenced code block tagged with a language identifier. Internally, the package equips the model with what its docs describe as an exec() function that accepts a language plus a code string.[^1] When the model emits code, the agent intercepts the block, prints the proposed code with syntax highlighting, and (unless auto_run is enabled) asks the human "Would you like to run this code? (y/N)" before sending it to the matching interpreter. Standard output, errors, and any returned values stream back into the conversation as the result of a tool call, which the model can then read, summarize, or act on with another code block. This loop is structurally a ReAct-style observe-think-act cycle specialized for code.[^7][^18]
Because the execution backend runs locally in the same process or in a subprocess tied to the user's working directory, the agent has access to whatever the user has access to: installed Python packages, system shells, the local file system, and the network. This is what gives Open Interpreter its expansive feature surface relative to a hosted code-execution sandbox, and it is also the source of its safety risks.[^4][^5]
The execution backend ships with built-in support for multiple interpreters and shells. The README and documentation list Python, JavaScript, Shell, AppleScript, HTML, R, and PowerShell among the core targets, with community-contributed languages including Ruby and Java also referenced in the user guide.[^1][^8] Each language is implemented as a class that knows how to spawn or attach to the relevant runtime, stream output, and report exit status; the documentation notes that the language list is intentionally extensible and developers can append their own custom languages without modifying the agent core.[^8] The Python backend in particular runs inside the same interpreter process by default so that users can pre-define functions, log into services, or set environment variables before the agent starts generating code, and the model's variables and the user's variables share a single namespace.[^8]
The messaging layer extends OpenAI's standard chat role schema with an additional computer role used to represent code execution events and their outputs. This LMC (Language Model Computer) format, introduced with the New Computer Update I, allows the agent to round-trip arbitrary execution traces through any model that follows OpenAI's chat completions schema.[^12] The Computer module is exposed as a Python API rather than a single function call: methods are grouped under computer.display, computer.mouse, computer.keyboard, computer.clipboard, plus Mac-only helpers under computer.mail, computer.sms, computer.contacts, and computer.calendar.[^19] The display submodule provides screenshot capture and the centre-of-screen coordinate; mouse and keyboard methods accept either pixel coordinates, target text strings (resolved via OCR), or named on-screen icons; clipboard methods read and write text.[^19] On macOS, the Mail, SMS, Contacts, and Calendar helpers wrap AppleScript to retrieve unread counts, send messages, look up contacts, and create or delete calendar events.[^19]
Open Interpreter is model-agnostic. Under the hood it uses LiteLLM to translate calls to a single unified API into whichever provider's chat completion endpoint is configured, which gives it support for OpenAI API models, Anthropic API models such as Claude, Azure OpenAI, Mistral, Hugging Face, and many others, plus any locally hosted OpenAI-compatible endpoint exposed by Ollama, LM Studio, LlamaFile, or Jan.ai.[^7][^20] The default model is GPT-4o; older releases used GPT-4 and offered GPT-3.5 turbo as a cheaper fallback, while local-only operation is provided via interpreter --local.[^7][^20] Documentation specifically calls out that even local vision-capable models such as LLaVA via LlamaFile, LM Studio, or Jan.ai can be wired up to OS mode, although smaller open models perform substantially worse at the OCR and click-grounding stages than frontier vision models.[^21]
Activated by the --os flag, OS mode upgrades the agent from a code-only interface to a full visual desktop controller. In this mode the model is provided with the Computer API and instructed to navigate the desktop graphically: it can request a screenshot, see the current display, decide what to click, type, or hotkey, and verify results by taking another screenshot.[^21][^12] The documentation explicitly warns that OS mode is "highly experimental" and is most reliable on Mac, where Spotlight and AppleScript shortcuts let the agent pick efficient paths such as opening a URL with query parameters rather than clicking through a UI.[^21] The October 2024 0.4.0 release added an alternative OS mode powered by Anthropic's Computer Use tool family, giving Claude models a dedicated tool surface for desktop automation that runs alongside the original implementation.[^3][^22]
Open Interpreter executes arbitrary, model-generated code on the user's machine, so the project takes an explicit, layered approach to safety while disclaiming any guarantee of security.[^23] Its documentation lists five complementary mitigations.
LLM alignment. The baseline assumption is that a well-aligned frontier model will refuse to produce destructive shell commands such as rm -rf / and will warn the user before suggesting irreversible operations. The team recommends sticking with frontier-aligned models when working on sensitive systems, since locally hosted, weakly aligned open models are noted as more likely to comply with dangerous instructions.[^24]
Confirm-before-run. By default, every code block the model proposes is paused at a confirmation prompt that displays the code and asks the user to type y to run, n to reject, or to add follow-up instructions. The flag --auto-run (or the equivalent auto_run = True in the Python API) bypasses this gate and is explicitly flagged in the docs as "use with extreme caution."[^23][^7]
Safe mode. An experimental opt-in safe mode, installable via pip install open-interpreter[safe], integrates Semgrep to statically scan generated code for known dangerous patterns before execution and supports three settings: off, ask, and auto. The documentation also recommends instructing the model via the system prompt to scan PyPI and npm packages with Guarddog before installing them, adding a supply-chain dependency audit step.[^25]
Sandboxing. For higher-risk workloads the docs describe two isolation paths. The first is an experimental Docker integration that runs the execution backend inside a container so that any destructive command is confined. The second is an E2B integration that substitutes a hosted, ephemeral cloud sandbox for the local Python kernel. Both are presented as optional layers that complement rather than replace user confirmation.[^26]
User vigilance. The safety documentation is blunt that none of these mitigations is a guarantee. The maintainers state explicitly that Open Interpreter "is not responsible for any damage caused by using the package" and that the package's safety features "provide no guarantees of safety or security," recommending that users review every code block before granting execution.[^23]
Open Interpreter's behaviour is configured via YAML profile files, with any file named default.yaml loaded automatically and others selectable via interpreter --profile myprofile.yaml.[^27] A profile can set the LLM model, temperature, system message, custom_instructions text appended to the system prompt, auto_run, safe_mode, the active language list, and any other documented setting.[^27] The documented recommendation is to use custom_instructions rather than overriding the system message directly, so that core upgrades in future releases still apply.[^27] Profiles double as shareable units: a YAML file (or, in newer development branches, a Python file per a 2024 RFC) encapsulates a complete persona, model choice, and tool configuration that a user can hand to a teammate.[^28] The Python API mirrors the same configuration surface; an embedded interpreter object can be parameterised in code and then used via interpreter.chat() for a synchronous conversation, with streaming variants for tighter integrations such as the FastAPI server example shipped in the repository.[^1]
The 01 (openinterpreter/01) is the project's voice-first sibling, presented in the README as "the #1 open-source voice interface for desktop, mobile, and ESP32 chips."[^15] It pairs a server (a Light Server tuned for low-power ESP32 hardware, and a LiveKit Server for higher-resource clients) with clients for Android, iOS, ESP32, and desktops. The voice loop transcribes speech to text via a speech-to-text engine such as Whisper, routes the transcribed instruction through Open Interpreter, and synthesizes the spoken reply using a text-to-speech engine.[^15] The hardware 01 Light, briefly sold in 2024, was the canonical reference design before the project's pivot away from manufacturing, after which the open-source schematics remained available for community builds.[^17][^15]
After the September 2024 hardware cancellation, the team launched a free 01 App for iOS and Android that exposes the same voice interface on existing smartphones, controlling the user's Mac, Windows, or Linux machine remotely via push-to-talk.[^17] The app reuses the open-source 01 server stack and was positioned as a replacement for the cancelled hardware that addressed power, security, and global-availability gaps the team did not believe they could solve at a US$99 hardware price point.[^17]
A long tail of community forks and derivative agents has emerged around Open Interpreter, including mirrors and reimplementations on PyPI such as open-code-interpreter, packaged Docker images, and integrations into orchestration frameworks. The main repository tracks several thousand forks alongside its star count, and the project sits as one of the larger AGPL-licensed AI agents on GitHub.[^1]
Reported usage patterns of Open Interpreter cluster into a few recurring areas. The dominant pattern is local data wrangling, where the agent is used as an interactive analyst that converts file formats, plots datasets, cleans CSVs, summarizes PDFs, and runs visual exploratory analyses, much as ChatGPT Code Interpreter does but without the upload limit.[^5][^29] A second pattern is system automation: the model is asked to find the largest files on disk, batch-rename images, set up a development environment, or chain together shell utilities, with the user gating each generated command.[^7] A third pattern, enabled by OS mode and the Computer API, is desktop GUI automation, where the agent opens applications, fills out forms, scrapes screen state, and combines visual interaction with shell or Python steps; this is presented as the most experimental of the three.[^21] In the Windows context, Open Interpreter has been described as one of the more versatile general agents because it can drive a browser, switch to PowerShell, and then open Excel within a single session, combining UI control and code execution in ways pure vision agents typically cannot.[^30]
Beyond individual end users, the project has been embedded into developer-facing infrastructure: the documentation includes a FastAPI server example for serving Open Interpreter behind a REST endpoint, a Docker recipe for containerised deployment, and guides for running multiple concurrent instances.[^1][^26]
The most prominent critique is security exposure. Because Open Interpreter is fundamentally a loop that executes model-authored code on a real machine, a sufficiently misaligned model or a successful prompt injection can in principle execute any command the user could.[^9][^23] Simon Willison, after running the tool, characterised this directly: "code is run directly on your machine, [so] there are all sorts of ways things could go wrong if you don't carefully review the generated code before hitting 'y'."[^7] The safety documentation itself acknowledges that Safe Mode, Docker, and E2B reduce but do not eliminate this risk, and that users must treat every confirmation prompt as load-bearing.[^23]
A second limitation is codebase awareness. Comparative writeups note that Open Interpreter has no built-in project understanding: each session starts without knowledge of the user's repository structure, import graph, or conventions, in contrast to coding-focused agents such as Aider or Claude Code that maintain a project map and integrate with git.[^31] As a result, Open Interpreter is often described as a better general-purpose computer assistant than a software engineering assistant: it can script, automate, and analyse, but it does not refactor large codebases or open pull requests by design.[^31]
Third, the project is not v1.0 at the time of writing. The PyPI history runs through the 0.4.x series, and the release page on GitHub shows mostly pre-release tags, with the team publicly noting that v1.0 is in development on the unstable branch.[^2][^3] Combined with the September 2024 hardware cancellation, this reflects the genuine difficulty of stabilising a desktop agent that touches hardware, voice, vision, and arbitrary code execution simultaneously.[^17]
Fourth, OS mode in particular is documented by the maintainers as "highly experimental," and reliability is uneven across operating systems. The macOS-only Mail, SMS, Calendar, and Contacts helpers are convenience wrappers around AppleScript and have no equivalent on Windows or Linux, and OCR-based clicking can degrade when scaling, fonts, or theming differ from what the underlying vision model expects.[^21][^19]
Finally, the AGPL-3.0 license combined with a separate commercial-license offering can be a friction point for downstream integration. Embedding Open Interpreter inside a closed-source product or a network-exposed service usually triggers the AGPL's source-availability obligations, prompting some commercial users to either rebuild equivalent functionality, license alternative agents, or negotiate a commercial license from the project's maintainers.[^1]
The table below contrasts Open Interpreter with adjacent agentic systems that are commonly mentioned alongside it. Each row's wikilinks are deduped only within that row.
| Tool | Scope | Editing model | Code execution | Codebase awareness | Default model lock-in |
|---|---|---|---|---|---|
| Open Interpreter | General desktop automation and code execution | Generates code that runs as a side effect | Local exec via Python, JS, shell, AppleScript and more, confirm-before-run | None built in | Provider-agnostic via LiteLLM[^7][^20] |
| OpenAI ChatGPT Code Interpreter / Advanced Data Analysis | Hosted data analysis | Edits files inside hosted sandbox | Hosted Python in a 100 MB, 120-second sandbox without internet | None | Locked to OpenAI hosted models[^4][^5] |
| Claude Code | Software engineering agent in the terminal | Multi-file edits with planning, runs tests | Permission-gated shell execution | Indexes the repository and reasons about cross-file changes | Locked to Anthropic models[^31] |
| Cursor (code editor) | AI-native IDE | Inline edits across files with chat side panel | Runs commands through the editor | IDE-level project context | Multiple providers, polished default[^31] |
| Aider | Git-integrated pair programmer in the terminal | Conversational, incremental edits committed to git | Runs tests and linters after edits | Tracks files added to the chat and git history | Provider-agnostic[^31] |
The differences are also philosophical. Open Interpreter optimises for breadth: any task that can be expressed as code on a user's computer is in scope, from converting a video file to clicking through a settings panel.[^21] Claude Code, Cursor, and Aider, in contrast, optimise for depth on source code, integrating with project structure, git, and IDE tooling to make multi-file software engineering safer and more predictable.[^31] OpenAI's hosted Code Interpreter sits at a third corner of the design space, sacrificing breadth and depth for a managed, sandboxed environment that avoids the local-execution risk surface entirely.[^4][^5]
Open Interpreter sits in the broader family of code-executing AI agents that emerged after ChatGPT plugins and Code Interpreter normalised the idea of LLMs invoking real tools. Adjacent projects in this lineage include Auto-GPT and other autonomous task-runner systems, LangChain agents that wrap tool use around foundation models, and Anthropic Computer Use, which generalises the OS-mode idea to a first-class API on the model provider's side. On the AI coding agent axis, peers and competitors include Claude Code, Aider, Cline (AI coding agent), GitHub Copilot, Codex (OpenAI) and its newer OpenAI Codex CLI, and Cursor (code editor); on the voice-and-device axis the 01 sits in the same conceptual neighborhood as the Rabbit R1 and other voice-first ambient-AI hardware experiments.[^15][^31]
The Computer API and OS mode also overlap conceptually with research on computer use and computer-use agent systems that began to appear in 2024 from major labs, where a vision-capable model is given a screenshot tool and a mouse/keyboard tool and asked to drive an operating system. Open Interpreter's contribution to that lineage was to ship an open-source implementation with a uniform Python API and to integrate the resulting clicks and keystrokes with the same agent that already executed shell and Python on the host machine.[^12][^22]