# OpenRouter

> Source: https://aiwiki.ai/wiki/openrouter
> Updated: 2026-06-21
> Categories: Artificial Intelligence, Developer Tools, Large Language Models
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

[![Openrouter logo1.jpeg](https://qqcb8dyk5bp2il4c.public.blob.vercel-storage.com/images/openrouter_logo1.jpeg)](/wiki/file_openrouter_logo1_jpeg)

**OpenRouter** is a unified API gateway and marketplace that routes a single, [OpenAI](/wiki/openai)-compatible request across more than 400 [large language models](/wiki/large_language_model) (LLMs) and other AI models from over 60 providers, automatically selecting hosts for cost, speed, and reliability while consolidating billing into one account.[1][2][43] Founded in early 2023 by Alex Atallah, co-founder and former CTO of the NFT marketplace [OpenSea](/wiki/opensea), and engineer Louis Vichy, the platform processes roughly 25 trillion tokens per week (about 100 trillion per month) for more than 8 million developers, and in May 2026 raised a $113 million Series B led by Alphabet's CapitalG at a post-money valuation of approximately $1.3 billion.[3][4][53][54] Often described as an "LLM router" or "API aggregator," OpenRouter sits between application developers and model providers, handling request routing, automatic failover, load balancing, and billing through a single API endpoint compatible with the [OpenAI](/wiki/openai) Chat API specification.[5]

## Overview

OpenRouter functions as an intermediary service and the first [LLM](/wiki/llm) marketplace, normalizing access to various AI models through a consistent API schema compatible with [OpenAI](/wiki/openai)'s Chat API.[5] This allows developers to switch between different [LLM](/wiki/llm) providers without changing their code implementation, addressing the problem of "API sprawl" where developers would otherwise need to maintain separate integrations for each model provider.[3][5]

The platform operates as a remote-first company headquartered in New York City, with additional offices in San Francisco, California.[2][6] As of mid-2026, the company serves more than 8 million developers worldwide, with more than 250,000 applications built on the platform and roughly 25 trillion tokens processed each week, equivalent to about 100 trillion tokens per month and a pace of over a quadrillion tokens per year.[44][53][54]

The service supports a growing catalog of over 400 models from 60+ providers including [OpenAI](/wiki/openai), [Anthropic](/wiki/anthropic), Google, [Mistral AI](/wiki/mistral_ai), [Meta](/wiki/meta), [DeepSeek](/wiki/deepseek), [xAI](/wiki/xai), [MiniMax](/wiki/minimax), [Moonshot AI](/wiki/moonshot_ai), [Zhipu AI](/wiki/zhipu_ai), and various open-source implementations.[1][7] Beyond text generation, the catalog now spans image generation, audio, embeddings, rerankers, and (as of April 2026) video models from providers like Google, OpenAI, ByteDance, Alibaba, and Kuaishou.[43] The platform adds approximately 15 to 25 milliseconds of edge latency to end-to-end inference for most requests while providing automatic failover, load balancing, and unified billing.[2][8][30]

A core value proposition of OpenRouter is vendor neutrality. Developers can experiment with models from competing providers side by side without managing separate accounts, API keys, or billing relationships. This has made the platform especially popular among indie developers, AI startups, and teams building agentic workflows that need to call different models for different subtasks within a single pipeline.

## History

OpenRouter was founded in February/March 2023 by Alex Atallah, shortly after witnessing the emergence of open-source LLMs like [Meta](/wiki/meta)'s [LLaMA](/wiki/llama).[3] The initial inspiration came after Atallah observed models like Stanford's Alpaca, which demonstrated that smaller teams could create competitive AI models with minimal resources. This suggested a future ecosystem with numerous specialized models, potentially requiring a marketplace to effectively navigate them.[3]

Atallah, who previously co-founded the $14 billion NFT marketplace [OpenSea](/wiki/opensea) and served as Director of Product & Engineering at Kaggle, partnered with engineer Louis Vichy and one of his collaborators from the browser extension framework Plasmo to launch OpenRouter.[3][9][10] Before building OpenRouter, Atallah also created Window AI, a browser extension that connects LLMs to the web, which served as an early experiment in providing unified access to multiple language models.[31]

In its early stages, OpenRouter processed approximately 10 trillion tokens annually. By mid-2025, it had scaled to over 100 trillion tokens per year, representing a 10x growth.[7] By February 2026, weekly token consumption on the platform reached approximately 12.1 trillion tokens, a 12.7-fold year-over-year increase.[29] By May 2026, weekly volume had grown to 25 trillion tokens, roughly 100 trillion per month and a 5-fold increase from the 5 trillion tokens processed per week six months earlier.[53][54] A key milestone was serving as the exclusive launch partner for [OpenAI](/wiki/openai)'s first coding model, [GPT-4](/wiki/gpt-4).1. The model initially appeared on the platform under the codename "Quasar Alpha" in April 2025, and was later revealed to be a stealth endpoint for an early version of GPT-4.1.[32][33]

### Funding

OpenRouter has raised three rounds of venture funding since 2023. In June 2025, it announced back-to-back seed and Series A rounds totaling $40 million, reaching a post-money valuation of approximately $547 million.[10][11][12][53] On May 26, 2026, the company announced a $113 million Series B led by CapitalG, Alphabet's independent growth fund, at a post-money valuation of approximately $1.3 billion, more than doubling its valuation in under a year.[53][54][55] Participants in the Series B included NVentures (the venture arm of [NVIDIA](/wiki/nvidia)), ServiceNow Ventures, MongoDB Ventures, Snowflake Ventures, Databricks Ventures, AMP PBC, and Pace Capital, alongside existing investors Andreessen Horowitz and Menlo Ventures.[53][55] OpenRouter said it intends to use the new funding to expand its routing, governance, and optimization capabilities as enterprises continue to deploy and scale AI.[54]

| Round | Date | Amount | Lead Investor | Other Participants | Post-Money Valuation |
| --- | --- | --- | --- | --- | --- |
| Seed | June 2025 | $12.5 million | Andreessen Horowitz | Sequoia Capital, Soma Capital | - |
| Series A | June 2025 | $28 million | Menlo Ventures | Sequoia Capital, Transpose Platform Management | ~$547 million |
| Series B | May 2026 | $113 million | CapitalG | NVentures (NVIDIA), ServiceNow Ventures, MongoDB Ventures, Snowflake Ventures, Databricks Ventures, AMP PBC, Pace Capital, a16z, Menlo Ventures | ~$1.3 billion |
| **Cumulative** |  | **~$153.5 million** |  |  |  |

Monthly customer spending through the platform grew from $800,000 in October 2024 to approximately $8 million in May 2025, a ten-fold increase in seven months.[7] By late 2025, annualized inference spend routed through the platform exceeded $100 million, up from roughly $10 million in late 2024.[10] By early 2026, third-party equity research from Sacra estimated OpenRouter's annualized revenue at approximately $50 million, up from about $19 million at the end of 2025 and roughly $5 million in May 2025, reflecting the company's roughly 5% take rate on inference spend.[47][56]

### Timeline of Key Events

| Date | Event |
| --- | --- |
| February/March 2023 | OpenRouter founded by Alex Atallah and Louis Vichy |
| 2023 | Early growth phase; platform processes ~10 trillion tokens annually |
| October 2024 | Monthly customer spending reaches $800,000 |
| April 2025 | Quasar Alpha (stealth GPT-4.1) launched exclusively on OpenRouter |
| May 2025 | Monthly customer spending reaches ~$8 million |
| June 2025 | $40 million raised in combined Seed and Series A at ~$547 million valuation |
| October 2025 | Provider Variance (Exacto) feature launched to address performance differences across providers |
| December 2025 | Response Healing feature introduced; State of AI report published with a16z; ranked #1 on Brex's fastest-growing AI infrastructure list |
| January 2026 | Fast LLM prioritization, auto router customization, and SDK skill loading released |
| February 2026 | Benchmarks added to model pages; free model router launched; weekly token volume reaches 12.1 trillion |
| March 2026 | Auto Exacto adaptive quality routing released |
| April 2026 | Workspaces (Project Environments), enhanced Auto Exacto with 5-minute re-evaluation, and video generation API launched (Sora 2 Pro, Veo 3.1, Seedance 2.0, Wan 2.7, Kling Video O1) |
| May 2026 | $113 million Series B led by CapitalG at ~$1.3 billion valuation; weekly volume reaches 25 trillion tokens |

## Founders and Leadership

### Alex Atallah

Alex Atallah is the CEO and co-founder of OpenRouter. He is a Stanford University alumnus and also studied cybersecurity at the University of Oxford.[31][34] Before founding OpenRouter, Atallah had a notable career in technology:

| Role | Organization | Period | Notes |
| --- | --- | --- | --- |
| Cybersecurity Engineer | Palantir | Early career | Built cybersecurity products |
| CTO | hostess.fm | Pre-2014 | Music startup acquired by Beatport in 2014 |
| Director of Product & Engineering | Kaggle | Pre-2018 | Data science competition platform (acquired by Google) |
| Co-founder and CTO | [OpenSea](/wiki/opensea) | 2018-2022 | NFT marketplace reaching $14 billion valuation |
| Creator | Window AI | 2023 | Browser extension connecting LLMs to the web |
| CEO and Co-founder | OpenRouter | 2023-present | LLM router and marketplace |

Atallah stepped down from OpenSea in July 2022, stating he wanted to "build something from zero to one," while remaining on the company's board.[34] His experience building marketplace infrastructure at OpenSea directly informed the design of OpenRouter as a model marketplace. Announcing the Series B in May 2026, Atallah framed multi-model routing as the core of the business: "Running inference at scale is fundamentally a multi-model problem. The era of picking a single model is over. Success now depends on continuously routing across a changing market."[54]

### Other Key Figures

- **Louis Vichy**: Co-founder and Engineer. Previously collaborated with Atallah on the Plasmo browser extension framework.[2][3]
- **Chris Clark**: Chief Operating Officer. Oversees business operations and growth strategy.[7]

### Investors

Notable investors include:[28][7][53]

- Andreessen Horowitz (a16z), Seed lead
- Menlo Ventures, Series A lead
- CapitalG (Alphabet), Series B lead
- NVentures (NVIDIA)
- ServiceNow Ventures, MongoDB Ventures, Snowflake Ventures, Databricks Ventures
- Sequoia Capital
- Soma Capital
- Transpose Platform Management
- Y Combinator

## Technical Implementation

### API Architecture

OpenRouter provides a single API endpoint at `https://openrouter.ai/api/v1` that implements the [OpenAI](/wiki/openai) API specification for `/completions` and `/chat/completions` endpoints.[13] The platform normalizes request/response schemas across providers to reduce per-vendor integration work, while still allowing provider-specific options to pass through when needed.[14] Because it follows the OpenAI SDK format, most applications that already use the OpenAI Python or TypeScript SDK can switch to OpenRouter by changing only the base URL and API key.

Key technical features include:

- Bearer token authentication with OAuth PKCE support for user delegation[15]
- GitHub secret scanning partnership for exposed key detection[15]
- SSE streaming for real-time response handling[13]
- Normalized token counting based on the GPT-4o tokenizer for consistency[13]
- Edge infrastructure that minimizes latency by running close to users globally[30]

### Supported Parameters

The API supports comprehensive parameters for model configuration:[14]

| Parameter | Description | Range/Values |
| --- | --- | --- |
| temperature | Controls response randomness | 0-2 |
| max_tokens | Maximum response length | Model-dependent |
| top_p | Nucleus sampling threshold | 0-1 |
| frequency_penalty | Reduces repetition of frequent tokens | -2 to 2 |
| presence_penalty | Reduces repetition of any used tokens | -2 to 2 |
| stream | Enables SSE streaming | true/false |
| tools | Function calling configuration | JSON schema |
| response_format | Enforces structured output | JSON schema |

### Provider Routing

OpenRouter employs intelligent routing algorithms that consider multiple factors when directing requests to providers.[16] The default load balancing strategy operates in three tiers:

1. **Stability Filter**: Providers without significant outages in the last 30 seconds receive priority.
2. **Price-Weighted Selection**: Among stable providers, the system selects based on the inverse square of the price. For example, a provider costing $1/million tokens is 9x more likely to be selected than one costing $3/million tokens.
3. **Fallback Hierarchy**: Remaining providers serve as backups if primary options fail.

Developers can customize routing through the `provider` object in API requests:

| Configuration | Purpose |
| --- | --- |
| `order` | Specify preferred providers in sequence |
| `sort` | Prioritize by price, throughput, or latency |
| `allow_fallbacks` | Enable or disable backup providers |
| `only` / `ignore` | Whitelist or blacklist specific providers |
| `max_price` | Set cost ceiling per request |
| `data_collection` | Set to "deny" to exclude providers that log training data |
| `zdr` | Set to true for zero data retention enforcement |
| `require_parameters` | Route only to providers supporting all request parameters |

#### Model Variants and Routing Modes

| Routing Method | Identifier Suffix | Description | Use Case |
| --- | --- | --- | --- |
| Default | (none) | Load balances across providers, prioritizing price | General purpose |
| Nitro | :nitro | Optimizes for throughput and response speed | Time-sensitive applications |
| Floor | :floor | Prioritizes the lowest cost options | Budget-conscious deployments |
| Online | :online | Includes web search results via Exa.ai | Real-time information needs |
| Free | :free | Routes to free model variants only | Experimentation and learning |
| Auto | openrouter/auto | AI-powered model selection using NotDiamond | Automatic optimization |
| Custom | User-defined | Specific provider preferences | Enterprise requirements |

The system automatically falls back to alternative providers when the primary endpoint returns errors or exceeds latency thresholds, improving overall reliability to maintain uptime.[16][17]

#### Provider Variance and Exacto

In October 2025, OpenRouter introduced Provider Variance (Exacto), a feature that addresses the reality that the same model can perform differently across different hosting providers due to differences in quantization, hardware, and serving configurations.[35] Exacto measures provider-level quality for each model and routes requests to the providers that deliver the best results. The March 2026 update, Auto Exacto, extended this further by automatically selecting tool-calling providers for new models on the platform.[36] In April 2026, OpenRouter announced an enhanced version of Auto Exacto that re-evaluates providers every five minutes across throughput, tool-call telemetry, and benchmark scores, switching the default provider as conditions change. For requests that include tools, Auto Exacto is enabled by default.[48]

### Automatic Fallback and Load Balancing

One of OpenRouter's core differentiators is its automatic failover system. When a request to the primary provider fails (due to rate limits, outages, or timeouts), the platform transparently retries the request against an alternative provider hosting the same model. This happens without any code changes on the developer's side. Key properties of the fallback system include:

- Billing only applies for the successful model run, not for failed attempts.
- Partition-aware sorting allows endpoints to be sorted globally across all models rather than always trying the primary model first.
- Performance thresholds deprioritize (rather than exclude) non-compliant providers, so requests still execute even if no provider meets the ideal criteria.
- The `max_price` field differs from other thresholds: it prevents request execution entirely if pricing requirements cannot be met.

## Features and Functionality

### Model Catalog

OpenRouter provides access to over 400 models across multiple tiers and providers:[1][7]

| Provider | Example Models | Use Cases | Pricing Tier |
| --- | --- | --- | --- |
| [OpenAI](/wiki/openai) | GPT-5 Chat, GPT-4o, GPT-4 Turbo, GPT-4o Mini, o1-preview | General purpose, coding, reasoning | Premium to Budget |
| [Anthropic](/wiki/anthropic) | Claude Opus 4, Claude 3.5 Sonnet, Claude 3 Haiku | General reasoning, coding, multilingual | Premium to Budget |
| Google | Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash | Multimodal, research, creative tasks | Mid-tier to Budget |
| [Mistral AI](/wiki/mistral_ai) | Mistral Large 2, Mistral Nemo, Codestral | Coding, multilingual, efficiency | Mid-tier |
| [Meta](/wiki/meta) | Llama 3.1 (405B, 70B, 8B), Llama 3.2 | Open-source, text generation | Free to Budget |
| [DeepSeek](/wiki/deepseek) | DeepSeek Coder V2, DeepSeek R1 | Coding, specialized tasks | Free tier available |
| [xAI](/wiki/xai) | Grok | General AI, X platform integration | Premium |
| MiniMax | MiniMax M2.5 | Coding, general purpose | Budget |
| Moonshot AI | Kimi K2, Kimi K2.5 | Coding, reasoning | Budget to Mid-tier |
| Zhipu AI | GLM-5 | General purpose, Chinese language | Budget |
| Perplexity AI | Search-enhanced models | Research, knowledge retrieval | Mid-tier |
| Others | Various [Hugging Face](/wiki/hugging_face) models, community fine-tunes | Specialized tasks, community models | Various |

### Free Models Tier

OpenRouter maintains a selection of completely free models that developers can use at no cost, making the platform accessible for experimentation, learning, and prototyping.[17][20] The free tier operates under the following conditions:

| Condition | Limit |
| --- | --- |
| Users with fewer than 10 credits | 50 free model requests per day |
| Users with 10 or more credits | 1,000 free model requests per day |
| Free model variants (`:free` suffix) | 20 requests per minute rate limit |
| Free Models Router (`openrouter/free`) | Automatically selects a compatible free model |

Free models include variants from [DeepSeek](/wiki/deepseek), [Meta](/wiki/meta) Llama, Devstral Small, and other open-source models. The Free Models Router, introduced in February 2026, automatically selects a free model at random from available options, intelligently filtering for models that support the features the request requires (such as image understanding, [tool use](/wiki/tool_use), and [structured outputs](/wiki/structured_output)).[37]

### OAuth PKCE for Third-Party Applications

OpenRouter supports OAuth 2.0 with Proof Key for Code Exchange (PKCE), enabling third-party applications to authenticate users through a secure single sign-on (SSO) experience.[15] This is particularly important for applications like browser extensions, IDE plugins, and coding assistants that need to access LLMs on behalf of individual users without handling API keys directly.

The OAuth PKCE flow works as follows:

1. **Authorization Request**: The application redirects the user to OpenRouter's `/auth` endpoint with a `callback_url`, an optional `code_challenge` (a base64-encoded SHA-256 hash of a random verifier), and a `code_challenge_method` (typically `S256`).
2. **User Consent**: The user logs in to OpenRouter and authorizes the application.
3. **Code Exchange**: After redirect, the application exchanges the authorization code and the original `code_verifier` via a POST request to `https://openrouter.ai/api/v1/auth/keys`, receiving a user-controlled API key.
4. **API Access**: The obtained key is then used to authenticate subsequent requests to OpenRouter's endpoints.

This approach means that each user pays for their own model usage through their OpenRouter account, freeing the application developer from needing to subsidize API costs. Applications such as Cline, Continue, and various VS Code extensions use this flow to connect users to OpenRouter.[15][22]

### Multimodal Support

The platform supports multimodal inputs where the underlying models allow it:[1]

- Image processing and analysis with vision-capable models
- PDF document parsing with OCR capabilities
- Audio input processing for transcription and analysis
- Structured data extraction from documents
- Embeddings and rerankers for retrieval-augmented generation pipelines

#### Image and Audio Generation

OpenRouter exposes image generation models (including variants from OpenAI, Google, and ByteDance) and audio generation/transcription models through the same chat-completions style interface, with media artifacts returned as URLs or base64 payloads. Pricing is normalized into per-image, per-second, or per-token units depending on the modality, and routing rules (Exacto, ZDR, BYOK) carry over from text models.[43]

#### Video Generation

On April 22, 2026, OpenRouter launched a unified video generation API that brings text-to-video and image-to-video models under the same gateway as its text models.[43][49] The launch supported Sora 2 Pro from [OpenAI](/wiki/openai), Veo 3.1 from Google, Seedance 2.0 and 1.5 from ByteDance, Wan 2.7 and 2.6 from Alibaba, and (added a few days later) Kuaishou's Kling Video O1, with more models scheduled to follow.[43][50]

| Model | Provider | Notable Capability |
| --- | --- | --- |
| Sora 2 Pro | [OpenAI](/wiki/openai) | High-fidelity text-to-video with audio |
| Veo 3.1 | Google | Long-duration cinematic generations |
| Seedance 2.0 / 1.5 | ByteDance | Text- and image-to-video, low cost per second |
| Wan 2.7 / 2.6 | Alibaba | Open-weights lineage with strong motion fidelity |
| Kling Video O1 | Kuaishou | Image-to-video with reference frames |

Video APIs across providers traditionally use different request shapes, parameter names, and billing units. OpenRouter's gateway normalizes these into a single asynchronous job interface: clients submit a prompt and receive a job ID, then poll or stream the result when the video is ready. Common parameters such as resolution, duration, aspect ratio, audio generation, frame images, and reference images are exposed in a consistent shape across every supported model.[43] Co-founder Alex Atallah reported in April 2026 that across image, audio, and video models the platform had served over 100 million media generations with roughly 50% month-over-month growth.[50]

### Web Search Integration

OpenRouter offers web search augmentation through integration with Exa.ai. Appending `:online` to any model identifier enables real-time web search capabilities, with retrieved information injected into the model context with proper citations. This feature is priced at $4 per 1,000 search results.[18]

### Response Healing

Introduced in December 2025, Response Healing is a feature that automatically corrects malformed JSON responses from LLMs before they reach the application.[38] This is particularly valuable for agentic workflows and applications that depend on [structured outputs](/wiki/structured_output), where a single malformed response can break an entire pipeline. According to OpenRouter, Response Healing reduces structured output defects by over 80%.

### Workspaces (Project Environments)

In April 2026, OpenRouter introduced Workspaces, a feature that lets accounts organize projects into separate environments, each with its own API keys, routing defaults, guardrails, and observability dashboards.[51] Typical use cases include separating development, staging, and production traffic, isolating experiments from billable workloads, and giving each customer or internal team its own keys, spend caps, alerts, and activity logs.

Workspaces can be created from the dashboard via the workspace picker or programmatically through the management API. Each workspace has independent credit balances, rate limits, and provider preferences, and members can be invited with role-based permissions. The feature is part of OpenRouter's broader push into enterprise infrastructure, alongside SAML single sign-on, in-region routing, and data residency controls offered on Enterprise plans.[51][52]

### Advanced Capabilities

- **Structured Outputs**: JSON schema enforcement via `response_format` parameter for predictable parsing[14]
- **Tool/Function Calling**: Support for parallel tool calls with `tools` and `tool_choice` parameters[14]
- **Context Caching**: Optimization for repeated prompts to reduce costs and latency[1]
- **[Reasoning](/wiki/reasoning) Tokens**: Reveals model thinking processes for complex tasks with configurable token budgets[1]
- **Community Leaderboards**: Public rankings showing model usage statistics, performance metrics, and user votes[7][8]
- **Benchmark Scores**: As of February 2026, every model page displays industry-standard benchmark scores covering programming, math, science, and long-context reasoning[37]
- **Effective Pricing Tab**: Model pages show actual per-provider pricing including tiered pricing, providing full cost transparency[37]

### Model Comparison Tool

OpenRouter provides an interactive model comparison tool at `openrouter.ai/compare` where developers can compare AI models side by side across key metrics including price per token, context length, latency, uptime, and throughput.[39] This allows developers to make informed decisions about which models to use for specific tasks before writing any code.

### Analytics and Observability

OpenRouter provides comprehensive analytics dashboards showing:[13]

- [Token](/wiki/token) consumption and costs across models
- Latency metrics and performance trends
- Error rates and failure analysis
- Usage patterns by application, team, or API key
- Real-time monitoring and status updates

The platform also offers programmatic access to usage data via the `/api/v1/generation` endpoint for integration with custom monitoring systems.[13]

## Rankings and the LLM Leaderboard

OpenRouter maintains a public LLM leaderboard at `openrouter.ai/rankings` that tracks model popularity and performance based on real usage data from millions of developers.[8][7] Unlike synthetic benchmarks, OpenRouter's rankings reflect actual production usage patterns, making them a valuable signal for which models developers are choosing in practice.

The leaderboard gained notable visibility when [Andrej Karpathy](/wiki/andrej_karpathy) highlighted OpenRouter's LLM rankings during a talk at Y Combinator Startup School on June 17, 2025.[7] The rankings have become an informal industry barometer for model adoption trends.

### State of AI Report

In December 2025, OpenRouter partnered with Andreessen Horowitz (a16z) to publish the "State of AI" report, an empirical study analyzing metadata from over 100 trillion tokens processed through the platform.[40] Key findings included:

| Finding | Detail |
| --- | --- |
| Agentic inference | The fastest-growing behavior on OpenRouter; developers building multi-step workflows rather than single prompts |
| Programming dominance | Programming grew from 11% to over 50% of total platform usage throughout 2025 |
| Open-source momentum | Models like [DeepSeek](/wiki/deepseek) R1 and Kimi K2 gaining market share through cost efficiency |
| Model personality effect | User retention correlates more strongly with model "personality" than with benchmark rankings |
| Breakthrough switching | New model capabilities trigger provider switches with low rates of switching back |
| Chinese model growth | By February 2026, Chinese-developed models commanded 61% of total token consumption on the platform[29] |

The report emphasized that competitive advantage in AI is shifting from pure accuracy metrics to orchestration, control, and reliable agent operation.

## Pricing and Billing

### Cost Structure

OpenRouter uses a credit-based prepaid system with pass-through pricing, meaning model costs shown in the catalog match what providers charge directly.[17][7][30] The platform does not apply hidden markups on inference pricing.

| Component | Details |
| --- | --- |
| Model Costs | Direct per-token charges from underlying providers, billed separately for input and output tokens |
| Platform Fee | Approximately 5% of inference costs as primary revenue[7] |
| Credit Purchase Fee (Card) | 5.5% fee (minimum $0.80) for credit card purchases[17] |
| Credit Purchase Fee (Crypto) | 5.0% for USDC cryptocurrency payments[17] |
| Monthly Fee | None; no subscriptions or minimum commitments[30] |

Example model pricing (as of early 2026):[1][8][29]

| Model | Input (per million tokens) | Output (per million tokens) |
| --- | --- | --- |
| Claude Opus 4 | $5.00 | $25.00 |
| GPT-5 Chat | $1.25 | $10.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| GPT-4o | $2.50 | $10.00 |
| Gemini 2.5 Flash | $0.15 | $0.60 |
| Mistral Large 2 | $2.00 | $6.00 |
| MiniMax M2.5 | $0.30 | $1.10 |
| GPT-4o Mini | $0.15 | $0.60 |
| DeepSeek Coder V2 | $0.27 | $1.10 |
| Llama 3.1 8B | Free (limited) | Free (limited) |

Credits are purchased in bundles (for example $10 for 10 credits) with no expiration date, and volume discounts are available for enterprise customers.[8]

### BYOK (Bring Your Own Keys)

For enterprise users with existing provider relationships, OpenRouter supports BYOK:[19]

- First 1 million BYOK requests per month are free
- Subsequent requests incur a 5% fee of the normal OpenRouter cost
- Support for keys from Azure, AWS Bedrock, Google Vertex AI, and other providers
- Automatic fallback to OpenRouter credits if user keys encounter issues

## Privacy and Security

### Data Retention

OpenRouter's default privacy configuration includes:[17]

- [Prompt](/wiki/prompt) and response content storage disabled by default
- Only request metadata (timestamps, model used, token counts, latency) retained
- Opt-in prompt logging available, sometimes with associated discounts for model improvement

### Zero Data Retention (ZDR) Routing

Users can configure requests to route only to providers with verified zero data retention policies. This feature may limit model availability and potentially affect latency or pricing but provides enhanced privacy guarantees for sensitive applications.[1] In October 2025, OpenRouter published an analysis examining whether implicit caching qualifies as zero data retention across providers, bringing greater transparency to the nuances of ZDR claims.[35]

### Compliance

As of July 2025, OpenRouter has achieved SOC 2 Type I compliance and maintains a public trust portal detailing security practices and compliance status.[21] The platform implements Cloudflare DDoS protection and rate limiting for security.[20]

## Integration and Ecosystem

### SDK and Framework Support

OpenRouter provides official support for numerous SDKs and frameworks:[22]

- [OpenAI](/wiki/openai) SDK (Python and TypeScript): drop-in compatible by changing the base URL
- [LangChain](/wiki/langchain) (Python and JavaScript)
- Vercel AI SDK
- PydanticAI
- [LlamaIndex](/wiki/llamaindex)
- Langfuse for observability
- Continue (VS Code extension)
- Cline (AI coding assistant)
- Zapier for workflow automation
- Home Assistant for home automation[23]
- IntelliJ IDEA Plugin (with OAuth PKCE support)[41]

### Third-Party Integrations

- Cloudflare AI Gateway for additional routing and caching[24]
- NovelCrafter for creative writing applications
- Various IDE extensions
- NVIDIA NeMo Data Designer for license-safe synthetic data generation (December 2025 partnership)[42]

### Developer Tools

The platform offers various tools for developers:[25]

- Interactive Request Builder for generating API requests
- Public leaderboards showing model usage statistics
- Real-time monitoring and analytics dashboards
- GitHub repository with example code and integrations[26]
- Discord community for developer support[8]

## Competition and Alternatives

OpenRouter operates in the rapidly growing LLM gateway and API aggregation market. As the market has matured, several alternatives have emerged, each with different strengths.[6][28]

### Direct API Access

The most straightforward alternative to OpenRouter is using model providers' APIs directly. Providers like [OpenAI](/wiki/openai), [Anthropic](/wiki/anthropic), Google, and [Mistral AI](/wiki/mistral_ai) all offer their own API endpoints. Direct access avoids any intermediary overhead or margin, but requires developers to maintain separate integrations, billing accounts, and failover logic for each provider. For teams that rely on only one or two models, direct access may be simpler. For teams working across many models, the integration burden grows quickly.

### LLM Gateway Competitors

| Platform | Type | Models | Key Differentiator | Pricing |
| --- | --- | --- | --- | --- |
| **OpenRouter** | Managed SaaS | 400+ | Largest model marketplace; unified billing; pass-through pricing | 5% platform fee; pay-as-you-go |
| **Portkey** | Managed SaaS | 1,600+ | Production observability, guardrails, governance, and [prompt engineering](/wiki/prompt_engineering) management | Free plan; $49/month Pro |
| **LiteLLM** | Open-source / Self-hosted | 100+ | Full infrastructure control; self-hosted with no data leaving your servers | Free (open-source); custom enterprise |
| **Eden AI** | Managed SaaS | 500+ | Combines LLM routing with OCR, translation, speech, moderation, and other specialized AI | Pay-as-you-go |
| **Helicone** | Managed SaaS | 100+ | Observability-focused with session tracking and prompt management | Free hobby tier; $79/month Pro |
| **Requesty** | Managed SaaS | 400+ | Lightweight multi-provider routing with similar pass-through pricing model | Free plan; 5% markup |
| **Kong AI Gateway** | Self-hosted / Enterprise | Varies | Enterprise governance policies and existing Kong API gateway ecosystem integration | Custom enterprise pricing |
| **Vercel AI Gateway** | Managed SaaS | Varies | Optimized for the Next.js and Vercel ecosystem with edge deployment | $5 monthly credits; usage-based |

### Key Differentiators

OpenRouter's primary competitive advantages include its large model catalog, the simplicity of its unified billing system, its established developer community of over 8 million users, and its reputation as a neutral marketplace without allegiance to any single model provider.[53] Its public leaderboard and usage data also create a network effect: more developers using the platform generates more data about model quality, which attracts more developers.

LiteLLM appeals to teams that need self-hosted infrastructure for compliance or data sovereignty reasons, since all API keys stay on the user's own servers and requests go directly to providers. Portkey targets production engineering teams that need deeper observability, guardrails, and governance controls beyond what OpenRouter provides.

## 2025-2026 Developments

The period from late 2025 through mid-2026 saw rapid evolution for both OpenRouter and the broader LLM ecosystem:

- **Agentic [Inference](/wiki/inference) Growth**: The State of AI report highlighted agentic workflows as the fastest-growing use pattern on OpenRouter, with developers increasingly building systems where models act in extended, multi-step sequences rather than responding to single prompts.[40]
- **Chinese Model Surge**: By February 2026, Chinese-developed AI models (MiniMax M2.5, Moonshot's Kimi K2.5, Zhipu AI's GLM-5) accounted for 61% of total token consumption on the platform. MiniMax M2.5 alone consumed 2.45 trillion tokens in a single week, driven partly by promotional free access from developer tools like Kilo Code and Cline.[29]
- **Response Healing**: Launched in December 2025, this feature automatically corrects malformed structured outputs, reducing defects by over 80%.[38]
- **[Benchmarks](/wiki/benchmarks) on Model Pages**: As of February 2026, every model page displays standardized benchmark scores for programming, math, science, and long-context reasoning tasks.[37]
- **Free Models Router**: Introduced in February 2026, `openrouter/free` provides a zero-cost entry point that intelligently selects compatible free models based on request requirements.[37]
- **Auto Exacto**: Released in March 2026, this feature extends the Exacto quality routing system by automatically selecting optimal tool-calling providers for newly added models.[36]
- **Series B Financing**: In May 2026, OpenRouter raised a $113 million Series B led by Alphabet's CapitalG at an approximately $1.3 billion valuation, with participation from NVentures, ServiceNow, MongoDB, Snowflake, and Databricks venture arms.[53][54]
- **Platform Scale**: By May 2026, OpenRouter processed roughly 25 trillion tokens per week, about 100 trillion per month, and served more than 8 million developers globally, with over 50% of usage originating from outside the United States.[53][54]

## Business Model and Market Position

### Revenue and Growth

According to company statements and third-party reporting, OpenRouter's business model centers on:[7][47]

- A roughly 5% cut of inference costs as the primary revenue source
- Annualized revenue of approximately $50 million as of early 2026, up from about $19 million at the end of 2025[47][56]
- Projected $25 billion AI inference market in 2025
- $15 billion addressable market from third-party applications

Co-founder Chris Clark stated: "We believe that inference costs will eclipse salaries as the dominant operating expense for most knowledge-based companies over the next five to 10 years."[7]

The platform has facilitated significant market transparency in AI model pricing and performance. Market share data from August 2025 reveals Google (22.5%) and [Anthropic](/wiki/anthropic) (22.3%) as leading providers, followed by [OpenAI](/wiki/openai) and various open-source providers.[27]

### User Base

As of mid-2026, OpenRouter serves:[44][53][54]

- Over 8 million developers worldwide
- More than 250,000 applications built on the platform
- Roughly 25 trillion tokens processed per week, about 100 trillion per month
- Notable integrations including Cline, Continue, and native Visual Studio Code support
- Over 50% of usage from outside the United States

The platform has received endorsements from prominent figures in the AI community, including Andrej Karpathy, who highlighted its LLM rankings during a talk at Y Combinator Startup School on June 17, 2025.[7]

## Principles and Philosophy

OpenRouter's stated principles emphasize a multi-model and multi-provider future for AI development.[3] The company highlights several key value propositions:

- Model diversity and choice for developers
- Price transparency without hidden markups on inference costs
- Reliability through automatic failover and load balancing
- Simplified integration across multiple providers
- Support for both commercial and open-source models
- Democratization of AI access through unified interfaces
- Vendor neutrality, allowing developers to avoid lock-in to any single provider

## See Also

- API gateway
- [Large language model](/wiki/large_language_model)
- [OpenAI](/wiki/openai)
- [Anthropic](/wiki/anthropic)
- Google AI
- [AI agents](/wiki/ai_agents)
- [LangChain](/wiki/langchain)
- [Prompt engineering](/wiki/prompt_engineering)
- [Tool use](/wiki/tool_use)
- [Structured output](/wiki/structured_output)
- [MLOps](/wiki/mlops)

## References

[1] OpenRouter Models Documentation. https://openrouter.ai/docs/guides/overview/models

[2] OpenRouter. Puter Developer Encyclopedia. https://developer.puter.com/encyclopedia/openrouter/

[3] Limitless: An AI Podcast. "OpenRouter: The Only AI Tool You'll Ever Need | Founder Alex Atallah." Bankless. https://limitless.bankless.com/episodes/openrouter-the-only-ai-tool-youll-ever-need-founder-alex-atallah/transcript

[4] OpenRouter official website. https://openrouter.ai/

[5] OpenRouter FAQ. https://openrouter.ai/docs/faq

[6] "Top 7 OpenRouter Alternatives in 2026." Eden AI. https://www.edenai.co/post/best-alternatives-to-openrouter

[7] "Investing in OpenRouter, the One API for All AI." Menlo Ventures. https://menlovc.com/perspective/investing-in-openrouter-the-one-api-for-all-ai/

[8] OpenRouter Pricing. https://openrouter.ai/pricing

[9] Alex Atallah. Crunchbase. https://www.crunchbase.com/person/alex-atallah-912d

[10] "OpenSea co-founder Alex Atallah raises $40 million for AI startup OpenRouter." The Block. https://www.theblock.co/post/360093/opensea-co-founder-alex-atallah-raises-40-million-for-ai-startup-openrouter

[11] "OpenSea co-founder Alex Atallah raises $40 million for AI startup OpenRouter." RootData. https://www.rootdata.com/news/127113

[12] "OpenSea co-founder Alex Atallah raises $40 million for AI startup OpenRouter." Coinhub Exchange. https://coinhubexchange.com/opensea-co-founder-alex-atallah-raises-40-million-for-ai-startup-openrouter/

[13] OpenRouter API Authentication Documentation. https://openrouter.ai/docs/api/reference/authentication

[14] OpenRouter Supported Parameters. https://openrouter.ai/docs/guides/overview/models

[15] OpenRouter OAuth PKCE Documentation. https://openrouter.ai/docs/guides/overview/auth/oauth

[16] OpenRouter Provider Routing Documentation. https://openrouter.ai/docs/guides/routing/provider-selection

[17] OpenRouter FAQ and Pricing. https://openrouter.ai/docs/faq

[18] OpenRouter Web Search (Online) Documentation. https://openrouter.ai/docs/guides/routing/model-variants/free

[19] OpenRouter BYOK Documentation. https://openrouter.ai/docs/faq

[20] OpenRouter Free Tier Documentation. https://openrouter.ai/docs/guides/routing/routers/free-models-router

[21] OpenRouter Trust Portal. https://openrouter.ai/

[22] OpenRouter SDK Documentation. https://openrouter.ai/docs

[23] OpenRouter Home Assistant Integration. https://openrouter.ai/docs

[24] Cloudflare AI Gateway. https://openrouter.ai/docs

[25] OpenRouter Developer Tools. https://openrouter.ai/docs

[26] OpenRouter on GitHub. https://github.com/OpenRouterTeam

[27] OpenRouter Market Share Data (August 2025). https://openrouter.ai/rankings

[28] OpenRouter Company Information. https://openrouter.ai/

[29] "Chinese AI Models Hit 61% Market Share On OpenRouter." Dataconomy. February 25, 2026. https://dataconomy.com/2026/02/25/chinese-ai-models-hit-61-market-share-on-openrouter/

[30] "OpenRouter API Pricing 2026." ZenMux. https://zenmux.ai/blog/openrouter-api-pricing-2026-full-breakdown-of-rates-tiers-and-usage-costs

[31] Alex Atallah personal website. https://alexatallah.com/

[32] "Quasar Alpha and Optimus Alpha Reveal." OpenRouter Announcements. https://openrouter.ai/announcements/quasar-alpha-and-optimus-alpha-reveal

[33] "Quasar Alpha: The Mysterious New Model Likely From OpenAI." 16x Prompt. https://prompt.16x.engineer/blog/quasar-alpha-openai-stealth-model

[34] "OpenSea Co-Founder Alex Atallah Resigns To Focus On New Ventures." NFT Evening. https://nftevening.com/open-sea-cofounder-alex-atallah-resigns-to-focus-on-new-ventures/

[35] "Provider Variance: Exacto" and "Implicit Caching Analysis." OpenRouter Announcements (October 2025). https://openrouter.ai/announcements

[36] "Auto Exacto: Adaptive Quality Routing." OpenRouter Announcements (March 2026). https://openrouter.ai/announcements

[37] "February Release Spotlight." OpenRouter Announcements (February 2026). https://openrouter.ai/announcements/february-release-spotlight

[38] "Response Healing." OpenRouter Announcements (December 2025). https://openrouter.ai/announcements

[39] OpenRouter Model Comparison Tool. https://openrouter.ai/compare

[40] "State of AI: An Empirical 100 Trillion Token Study with OpenRouter." Andreessen Horowitz. https://a16z.com/state-of-ai/

[41] OpenRouter IntelliJ Plugin. JetBrains Marketplace. https://plugins.jetbrains.com/plugin/28520-openrouter

[42] "Distillable Models and NeMo Data Designer." OpenRouter Announcements (December 2025). https://openrouter.ai/announcements

[43] "Announcing Video Generation." OpenRouter Announcements, April 22, 2026. https://openrouter.ai/announcements/video-generation

[44] "OpenRouter Review 2026." AI Agents List. https://aiagentslist.com/agents/openrouter

[45] Stephanie Palazzolo, Julia Hornstein, Kevin McLaughlin. "OpenRouter in Talks to Raise $120M at $1.3B Valuation Led by CapitalG." The Information, April 2026. https://www.theinformation.com/

[46] "OpenRouter Helps Companies Pick the Best AI for the Job, and Could Be Worth $1.3 Billion." Inc. Magazine, April 2026. https://www.inc.com/ben-sherry/openrouter-helps-companies-pick-the-best-ai-for-the-job-and-could-be-worth-1-3-billion/91325983

[47] "OpenRouter Revenue, Valuation and Funding." Sacra Equity Research, 2026. https://sacra.com/c/openrouter/

[48] "Auto Exacto Enhanced Routing Update." OpenRouter Announcements, April 15, 2026. https://openrouter.ai/announcements

[49] "OpenRouter Launches Video Generation API, Integrating Sora 2, Veo 3.1, and Seedance." KuCoin News, April 22, 2026. https://www.kucoin.com/news/flash/openrouter-launches-video-generation-api-integrating-sora-2-veo-3-1-seedance

[50] Alex Atallah on X (formerly Twitter). Post on media generation growth, April 22, 2026. https://x.com/alexatallah/status/2044500778086228278

[51] "Introducing Workspaces." OpenRouter Announcements, April 2026. https://openrouter.ai/announcements/introducing-workspaces

[52] OpenRouter Enterprise. https://openrouter.ai/enterprise

[53] "OpenRouter Raises $113M Series B." OpenRouter Announcements, May 26, 2026. https://openrouter.ai/announcements/series-b

[54] "OpenRouter Raises $113 Million CapitalG-led Series B as Weekly Volume Explodes to 25T Tokens." Business Wire, May 26, 2026. https://www.businesswire.com/news/home/20260526953416/en/OpenRouter-Raises-$113-Million-CapitalG-led-Series-B-as-Weekly-Volume-Explodes-to-25T-Tokens

[55] Marina Temkin. "OpenRouter more than doubles valuation to $1.3B in a year." TechCrunch, May 26, 2026. https://techcrunch.com/2026/05/26/openrouter-more-than-doubles-valuation-to-1-3b-in-a-year/

[56] "OpenRouter at $100M GMV." Sacra Research, 2026. https://sacra.com/research/openrouter-100m-gmv/

## External Links

- [OpenRouter Documentation](https://openrouter.ai/docs)
- [OpenRouter on GitHub](https://github.com/OpenRouterTeam)
- [OpenRouter on X (Twitter)](https://x.com/openrouterai)
- [OpenRouter Discord Community](https://openrouter.ai/discord)
- [OpenRouter LLM Rankings](https://openrouter.ai/rankings)
- [OpenRouter Model Comparison](https://openrouter.ai/compare)
- [State of AI Report (a16z + OpenRouter)](https://a16z.com/state-of-ai/)
