OpenRouter is a unified API gateway platform and marketplace that provides developers with access to over 400 large language models (LLMs) from multiple providers through a single, standardized interface.[1][2] Founded in early 2023 by Alex Atallah, co-founder and former CTO of OpenSea, and engineer Louis Vichy, the platform aims to simplify the integration and optimization of AI models while providing price transparency, reliability, and consolidated billing.[3][4] Often described as an "LLM router" or "API aggregator," OpenRouter sits between application developers and model providers, handling request routing, automatic failover, load balancing, and billing through a single API endpoint compatible with the OpenAI Chat API specification.[5]
OpenRouter functions as an intermediary service and one of the first LLM marketplaces, normalizing access to various AI models through a consistent API schema compatible with OpenAI's Chat API.[5] This allows developers to switch between LLM providers without changing their application code, addressing the problem of "API sprawl," in which developers would otherwise need to maintain separate integrations for each model provider.[3][5]
The platform operates as a remote-first company headquartered in New York City, with additional offices in San Francisco, California.[2][6] As of early 2026, the company has approximately 8 employees and serves over 5 million developers worldwide.[7][29]
The service supports a growing catalog of over 400 models from 60+ providers including OpenAI, Anthropic, Google, Mistral AI, Meta, DeepSeek, xAI, MiniMax, Moonshot AI, Zhipu AI, and various open-source implementations.[1][7] The platform adds approximately 15 to 25 milliseconds of edge latency to end-to-end inference for most requests while providing automatic failover, load balancing, and unified billing.[2][8][30]
A core value proposition of OpenRouter is vendor neutrality. Developers can experiment with models from competing providers side by side without managing separate accounts, API keys, or billing relationships. This has made the platform especially popular among indie developers, AI startups, and teams building agentic workflows that need to call different models for different subtasks within a single pipeline.
OpenRouter was founded in February/March 2023 by Alex Atallah, shortly after witnessing the emergence of open-source LLMs like Meta's LLaMA.[3] The initial inspiration came after Atallah observed models like Stanford's Alpaca, which demonstrated that smaller teams could create competitive AI models with minimal resources. This suggested a future ecosystem with numerous specialized models, potentially requiring a marketplace to effectively navigate them.[3]
Atallah, who previously co-founded the $14 billion NFT marketplace OpenSea and served as Director of Product & Engineering at Kaggle, partnered with engineer Louis Vichy and one of his collaborators from the browser extension framework Plasmo to launch OpenRouter.[3][9][10] Before building OpenRouter, Atallah also created Window AI, a browser extension that connects LLMs to the web, which served as an early experiment in providing unified access to multiple language models.[31]
In its early stages, OpenRouter processed approximately 10 trillion tokens annually. By mid-2025, it had scaled to over 100 trillion tokens per year, a tenfold increase.[7] By February 2026, weekly token consumption on the platform reached approximately 12.1 trillion tokens, a 12.7-fold year-over-year increase.[29] A key milestone was serving as the exclusive launch partner for OpenAI's coding-focused GPT-4.1 model. The model initially appeared on the platform under the codename "Quasar Alpha" in April 2025 and was later revealed to be a stealth endpoint for an early version of GPT-4.1.[32][33]
In June 2025, OpenRouter announced back-to-back seed and Series A rounds totaling $40.5 million, achieving a valuation of approximately $500 million:[10][11][12]
| Round | Date | Amount | Lead Investor | Other Participants | Valuation |
|---|---|---|---|---|---|
| Seed | June 2025 | $12.5 million | Andreessen Horowitz | Sequoia Capital, Soma Capital | - |
| Series A | June 2025 | $28 million | Menlo Ventures | Sequoia Capital, Transpose Platform Management | ~$500 million |
| Total | | $40.5 million | | | |
Monthly customer spending through the platform grew from $800,000 in October 2024 to approximately $8 million in May 2025, a ten-fold increase in seven months.[7] By late 2025, annualized inference spend routed through the platform exceeded $100 million, up from roughly $10 million in late 2024.[10]
| Date | Event |
|---|---|
| February/March 2023 | OpenRouter founded by Alex Atallah and Louis Vichy |
| 2023 | Early growth phase; platform processes ~10 trillion tokens annually |
| October 2024 | Monthly customer spending reaches $800,000 |
| April 2025 | Quasar Alpha (stealth GPT-4.1) launched exclusively on OpenRouter |
| May 2025 | Monthly customer spending reaches ~$8 million |
| June 2025 | $40.5 million raised in combined Seed and Series A at ~$500 million valuation |
| October 2025 | Provider Variance (Exacto) feature launched to address performance differences across providers |
| December 2025 | Response Healing feature introduced; State of AI report published with a16z; ranked #1 on Brex's fastest-growing AI infrastructure list |
| January 2026 | Fast LLM prioritization, auto router customization, and SDK skill loading released |
| February 2026 | Benchmarks added to model pages; free model router launched; weekly token volume reaches 12.1 trillion |
| March 2026 | Auto Exacto adaptive quality routing released |
Alex Atallah is the CEO and co-founder of OpenRouter. He is a Stanford University alumnus and also studied cybersecurity at the University of Oxford.[31][34] Before founding OpenRouter, Atallah had a notable career in technology:
| Role | Organization | Period | Notes |
|---|---|---|---|
| Cybersecurity Engineer | Palantir | Early career | Built cybersecurity products |
| CTO | hostess.fm | Pre-2014 | Music startup acquired by Beatport in 2014 |
| Director of Product & Engineering | Kaggle | Pre-2018 | Data science competition platform (acquired by Google) |
| Co-founder and CTO | OpenSea | 2018-2022 | NFT marketplace reaching $14 billion valuation |
| Creator | Window AI | 2023 | Browser extension connecting LLMs to the web |
| CEO and Co-founder | OpenRouter | 2023-present | LLM router and marketplace |
Atallah stepped down from OpenSea in July 2022, stating he wanted to "build something from zero to one," while remaining on the company's board.[34] His experience building marketplace infrastructure at OpenSea directly informed the design of OpenRouter as a model marketplace.
Notable investors include Andreessen Horowitz, Menlo Ventures, Sequoia Capital, Soma Capital, and Transpose Platform Management.[28][7]
OpenRouter provides a single API endpoint at https://openrouter.ai/api/v1 that implements the OpenAI API specification for /completions and /chat/completions endpoints.[13] The platform normalizes request/response schemas across providers to reduce per-vendor integration work, while still allowing provider-specific options to pass through when needed.[14] Because it follows the OpenAI SDK format, most applications that already use the OpenAI Python or TypeScript SDK can switch to OpenRouter by changing only the base URL and API key.
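Because the endpoint follows the OpenAI schema, a request can be assembled with nothing beyond the standard library. The sketch below is illustrative (the model identifier and API key are placeholders); it builds a chat completion request against the documented base URL:

```python
import json
import urllib.request

BASE_URL = "https://openrouter.ai/api/v1"

def make_chat_request(api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-compatible chat completion request for OpenRouter."""
    body = json.dumps({
        "model": model,  # OpenRouter uses "provider/model" identifiers
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = make_chat_request("sk-or-...", "openai/gpt-4o", "Hello")
# Actually sending the request requires a valid API key:
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```

Applications already using the OpenAI Python or TypeScript SDK achieve the same thing by pointing the SDK's base URL at the endpoint above and supplying an OpenRouter key.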
The API supports a comprehensive set of parameters for model configuration:[14]
| Parameter | Description | Range/Values |
|---|---|---|
| temperature | Controls response randomness | 0-2 |
| max_tokens | Maximum response length | Model-dependent |
| top_p | Nucleus sampling threshold | 0-1 |
| frequency_penalty | Reduces repetition of frequent tokens | -2 to 2 |
| presence_penalty | Reduces repetition of any used tokens | -2 to 2 |
| stream | Enables SSE streaming | true/false |
| tools | Function calling configuration | JSON schema |
| response_format | Enforces structured output | JSON schema |
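The numeric ranges in the table can be checked client-side before a request is sent. A minimal sketch (the helper name is illustrative, not part of any SDK):

```python
def validate_sampling_params(params: dict) -> list:
    """Return a list of violations against the documented parameter ranges."""
    ranges = {
        "temperature": (0.0, 2.0),         # response randomness
        "top_p": (0.0, 1.0),               # nucleus sampling threshold
        "frequency_penalty": (-2.0, 2.0),  # penalize frequent tokens
        "presence_penalty": (-2.0, 2.0),   # penalize any previously used tokens
    }
    errors = []
    for name, (lo, hi) in ranges.items():
        if name in params and not lo <= params[name] <= hi:
            errors.append(f"{name}={params[name]} outside [{lo}, {hi}]")
    return errors

print(validate_sampling_params({"temperature": 0.7, "top_p": 0.9}))  # []
print(validate_sampling_params({"temperature": 3.0}))
```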
OpenRouter employs intelligent routing algorithms that consider multiple factors when directing requests to providers.[16] The default strategy load balances across providers, prioritizing price.
Developers can customize routing through the provider object in API requests:
| Configuration | Purpose |
|---|---|
| order | Specify preferred providers in sequence |
| sort | Prioritize by price, throughput, or latency |
| allow_fallbacks | Enable or disable backup providers |
| only / ignore | Whitelist or blacklist specific providers |
| max_price | Set cost ceiling per request |
| data_collection | Set to "deny" to exclude providers that log training data |
| zdr | Set to true for zero data retention enforcement |
| require_parameters | Route only to providers supporting all request parameters |
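These options are supplied as a provider object alongside the usual chat parameters. A hedged sketch of such a request body (the model and provider names are illustrative; field names follow the table above, but exact semantics should be checked against the current API reference):

```python
import json

# Request body combining routing preferences with a standard chat payload.
payload = {
    "model": "meta-llama/llama-3.1-70b-instruct",
    "messages": [{"role": "user", "content": "Summarize this document."}],
    "provider": {
        "order": ["deepinfra", "together"],  # preferred providers, in sequence
        "allow_fallbacks": True,             # permit backups beyond the order list
        "sort": "price",                     # prioritize the cheapest endpoint
        "data_collection": "deny",           # exclude providers that log training data
        "require_parameters": True,          # only providers supporting all params
    },
}
print(json.dumps(payload, indent=2))
```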
| Routing Method | Identifier Suffix | Description | Use Case |
|---|---|---|---|
| Default | (none) | Load balances across providers, prioritizing price | General purpose |
| Nitro | :nitro | Optimizes for throughput and response speed | Time-sensitive applications |
| Floor | :floor | Prioritizes the lowest cost options | Budget-conscious deployments |
| Online | :online | Includes web search results via Exa.ai | Real-time information needs |
| Free | :free | Routes to free model variants only | Experimentation and learning |
| Auto | openrouter/auto | AI-powered model selection using NotDiamond | Automatic optimization |
| Custom | User-defined | Specific provider preferences | Enterprise requirements |
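Variant routing is selected purely through the model identifier string, so switching strategies requires no other code changes. A tiny helper (the model IDs are illustrative; the suffixes come from the table above):

```python
def with_variant(model_id: str, variant: str = "") -> str:
    """Append a routing-variant suffix such as 'nitro' or 'floor' to a model ID."""
    return f"{model_id}:{variant}" if variant else model_id

print(with_variant("anthropic/claude-3.5-sonnet", "nitro"))  # throughput-optimized
print(with_variant("meta-llama/llama-3.1-8b-instruct", "free"))
```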
The system automatically falls back to alternative providers when the primary endpoint returns errors or exceeds latency thresholds, improving overall reliability and uptime.[16][17]
In October 2025, OpenRouter introduced Provider Variance (Exacto), a feature that addresses the reality that the same model can perform differently across different hosting providers due to differences in quantization, hardware, and serving configurations.[35] Exacto measures provider-level quality for each model and routes requests to the providers that deliver the best results. The March 2026 update, Auto Exacto, extended this further by automatically selecting tool-calling providers for new models on the platform.[36]
One of OpenRouter's core differentiators is its automatic failover system. When a request to the primary provider fails (due to rate limits, outages, or timeouts), the platform transparently retries the request against an alternative provider hosting the same model. This happens without any code changes on the developer's side. Key properties of the fallback system include:
- The max_price field differs from other thresholds: it prevents request execution entirely if pricing requirements cannot be met.

OpenRouter provides access to over 400 models across multiple tiers and providers:[1][7]
| Provider | Example Models | Use Cases | Pricing Tier |
|---|---|---|---|
| OpenAI | GPT-5 Chat, GPT-4o, GPT-4 Turbo, GPT-4o Mini, o1-preview | General purpose, coding, reasoning | Premium to Budget |
| Anthropic | Claude Opus 4, Claude 3.5 Sonnet, Claude 3 Haiku | General reasoning, coding, multilingual | Premium to Budget |
| Gemini 2.5 Pro, Gemini 2.5 Flash, Gemini 2.0 Flash | Multimodal, research, creative tasks | Mid-tier to Budget | |
| Mistral AI | Mistral Large 2, Mistral Nemo, Codestral | Coding, multilingual, efficiency | Mid-tier |
| Meta | Llama 3.1 (405B, 70B, 8B), Llama 3.2 | Open-source, text generation | Free to Budget |
| DeepSeek | DeepSeek Coder V2, DeepSeek R1 | Coding, specialized tasks | Free tier available |
| xAI | Grok | General AI, X platform integration | Premium |
| MiniMax | MiniMax M2.5 | Coding, general purpose | Budget |
| Moonshot AI | Kimi K2, Kimi K2.5 | Coding, reasoning | Budget to Mid-tier |
| Zhipu AI | GLM-5 | General purpose, Chinese language | Budget |
| Perplexity AI | Search-enhanced models | Research, knowledge retrieval | Mid-tier |
| Others | Various Hugging Face models, community fine-tunes | Specialized tasks, community models | Various |
OpenRouter maintains a selection of completely free models that developers can use at no cost, making the platform accessible for experimentation, learning, and prototyping.[17][20] The free tier operates under the following conditions:
| Condition | Limit |
|---|---|
| Users with fewer than 10 credits | 50 free model requests per day |
| Users with 10 or more credits | 1,000 free model requests per day |
| Free model variants (:free suffix) | 20 requests per minute rate limit |
| Free Models Router (openrouter/free) | Automatically selects a compatible free model |
Free models include variants from DeepSeek, Meta Llama, Devstral Small, and other open-source models. The Free Models Router, introduced in February 2026, automatically selects a free model at random from available options, intelligently filtering for models that support the features the request requires (such as image understanding, tool use, and structured outputs).[37]
OpenRouter supports OAuth 2.0 with Proof Key for Code Exchange (PKCE), enabling third-party applications to authenticate users through a secure single sign-on (SSO) experience.[15] This is particularly important for applications like browser extensions, IDE plugins, and coding assistants that need to access LLMs on behalf of individual users without handling API keys directly.
The OAuth PKCE flow works as follows:
1. The application redirects the user to the /auth endpoint with a callback_url, an optional code_challenge (a base64-encoded SHA-256 hash of a random verifier), and a code_challenge_method (typically S256).
2. After the user approves access, OpenRouter redirects back to the callback_url with an authorization code.
3. The application exchanges the authorization code and the code_verifier via a POST request to https://openrouter.ai/api/v1/auth/keys, receiving a user-controlled API key.

This approach means that each user pays for their own model usage through their OpenRouter account, freeing the application developer from needing to subsidize API costs. Applications such as Cline, Continue, and various VS Code extensions use this flow to connect users to OpenRouter.[15][22]
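The verifier/challenge pair for step 1 can be generated with the standard library. A sketch assuming the S256 method (the callback URL is a placeholder; parameter names follow the flow described above):

```python
import base64
import hashlib
import secrets
from urllib.parse import urlencode

def pkce_pair():
    """Generate a PKCE code_verifier and its S256 code_challenge."""
    verifier = secrets.token_urlsafe(32)  # high-entropy random string
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    # Base64url-encode the SHA-256 digest, dropping the '=' padding per RFC 7636.
    challenge = base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")
    return verifier, challenge

verifier, challenge = pkce_pair()
auth_url = "https://openrouter.ai/auth?" + urlencode({
    "callback_url": "https://example.app/callback",  # illustrative callback
    "code_challenge": challenge,
    "code_challenge_method": "S256",
})
# The verifier stays secret on the client and is later POSTed to
# /api/v1/auth/keys together with the authorization code.
```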
The platform supports multimodal inputs where the underlying models allow it.[1]
OpenRouter offers web search augmentation through integration with Exa.ai. Appending :online to any model identifier enables real-time web search capabilities, with retrieved information injected into the model context with proper citations. This feature is priced at $4 per 1,000 search results.[18]
Introduced in December 2025, Response Healing is a feature that automatically corrects malformed JSON responses from LLMs before they reach the application.[38] This is particularly valuable for agentic workflows and applications that depend on structured outputs, where a single malformed response can break an entire pipeline. According to OpenRouter, Response Healing reduces structured output defects by over 80%.
Structured output and function calling support includes:

- The response_format parameter for predictable parsing[14]
- The tools and tool_choice parameters[14]

OpenRouter provides an interactive model comparison tool at openrouter.ai/compare where developers can compare AI models side by side across key metrics including price per token, context length, latency, uptime, and throughput.[39] This allows developers to make informed decisions about which models to use for specific tasks before writing any code.
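The response_format parameter mentioned above follows the OpenAI JSON-schema convention. A sketch of a schema-constrained request body (the model and schema are made-up examples):

```python
# Request body asking the model to emit JSON matching a schema.
# The response_format shape follows the OpenAI convention that OpenRouter
# mirrors; the "weather" schema itself is purely illustrative.
payload = {
    "model": "openai/gpt-4o",
    "messages": [{"role": "user", "content": "Extract the city and temperature."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "weather",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "temperature_c": {"type": "number"},
                },
                "required": ["city", "temperature_c"],
                "additionalProperties": False,
            },
        },
    },
}
```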
OpenRouter provides comprehensive analytics dashboards showing:[13]
The platform also offers programmatic access to usage data via the /api/v1/generation endpoint for integration with custom monitoring systems.[13]
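A lookup against that endpoint could be sketched as follows; note the id query parameter and the example generation identifier are assumptions for illustration, so the exact shape should be confirmed against the API reference:

```python
import urllib.request
from urllib.parse import urlencode

def generation_stats_request(api_key: str, generation_id: str) -> urllib.request.Request:
    """Build a lookup for per-request usage data (query parameter name assumed)."""
    query = urlencode({"id": generation_id})
    return urllib.request.Request(
        f"https://openrouter.ai/api/v1/generation?{query}",
        headers={"Authorization": f"Bearer {api_key}"},
    )

req = generation_stats_request("sk-or-...", "gen-12345")
# Executing the request would return token counts and cost for that generation:
# with urllib.request.urlopen(req) as resp: ...
```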
OpenRouter maintains a public LLM leaderboard at openrouter.ai/rankings that tracks model popularity and performance based on real usage data from millions of developers.[8][7] Unlike synthetic benchmarks, OpenRouter's rankings reflect actual production usage patterns, making them a valuable signal for which models developers are choosing in practice.
The leaderboard gained notable visibility when Andrej Karpathy highlighted OpenRouter's LLM rankings during a talk at Y Combinator Startup School on June 17, 2025.[7] The rankings have become an informal industry barometer for model adoption trends.
In December 2025, OpenRouter partnered with Andreessen Horowitz (a16z) to publish the "State of AI" report, an empirical study analyzing metadata from over 100 trillion tokens processed through the platform.[40] Key findings included:
| Finding | Detail |
|---|---|
| Agentic inference | The fastest-growing behavior on OpenRouter; developers building multi-step workflows rather than single prompts |
| Programming dominance | Programming grew from 11% to over 50% of total platform usage throughout 2025 |
| Open-source momentum | Models like DeepSeek R1 and Kimi K2 gaining market share through cost efficiency |
| Model personality effect | User retention correlates more strongly with model "personality" than with benchmark rankings |
| Breakthrough switching | New model capabilities trigger provider switches with low rates of switching back |
| Chinese model growth | By February 2026, Chinese-developed models commanded 61% of total token consumption on the platform[29] |
The report emphasized that competitive advantage in AI is shifting from pure accuracy metrics to orchestration, control, and reliable agent operation.
OpenRouter uses a credit-based prepaid system with pass-through pricing, meaning model costs shown in the catalog match what providers charge directly.[17][7][30] The platform does not apply hidden markups on inference pricing.
| Component | Details |
|---|---|
| Model Costs | Direct per-token charges from underlying providers, billed separately for input and output tokens |
| Platform Fee | Approximately 5% of inference costs as primary revenue[7] |
| Credit Purchase Fee (Card) | 5.5% fee (minimum $0.80) for credit card purchases[17] |
| Credit Purchase Fee (Crypto) | 5.0% for USDC cryptocurrency payments[17] |
| Monthly Fee | None; no subscriptions or minimum commitments[30] |
Example model pricing (as of early 2026):[1][8][29]
| Model | Input (per million tokens) | Output (per million tokens) |
|---|---|---|
| Claude Opus 4 | $5.00 | $25.00 |
| GPT-5 Chat | $1.25 | $10.00 |
| Claude 3.5 Sonnet | $3.00 | $15.00 |
| GPT-4o | $2.50 | $10.00 |
| Gemini 2.5 Flash | $0.15 | $0.60 |
| Mistral Large 2 | $2.00 | $6.00 |
| MiniMax M2.5 | $0.30 | $1.10 |
| GPT-4o Mini | $0.15 | $0.60 |
| DeepSeek Coder V2 | $0.27 | $1.10 |
| Llama 3.1 8B | Free (limited) | Free (limited) |
Credits are purchased in bundles (for example $10 for 10 credits) with no expiration date, and volume discounts are available for enterprise customers.[8]
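With pass-through per-token pricing, cost estimation is a simple linear calculation. A sketch using the GPT-4o rates from the table above:

```python
def estimate_cost(input_tokens: int, output_tokens: int,
                  input_price_per_m: float, output_price_per_m: float) -> float:
    """Estimate USD cost of a request given per-million-token rates."""
    return (input_tokens / 1_000_000 * input_price_per_m
            + output_tokens / 1_000_000 * output_price_per_m)

# GPT-4o at $2.50 input / $10.00 output per million tokens:
print(round(estimate_cost(50_000, 10_000, 2.50, 10.00), 4))  # 0.225
```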
For enterprise users with existing provider relationships, OpenRouter supports bring-your-own-key (BYOK) access, allowing customers to route requests through their own provider API keys.[19]
OpenRouter applies a default privacy configuration to all requests.[17]
Users can configure requests to route only to providers with verified zero data retention policies. This feature may limit model availability and potentially affect latency or pricing but provides enhanced privacy guarantees for sensitive applications.[1] In October 2025, OpenRouter published an analysis examining whether implicit caching qualifies as zero data retention across providers, bringing greater transparency to the nuances of ZDR claims.[35]
As of July 2025, OpenRouter has achieved SOC 2 Type I compliance and maintains a public trust portal detailing security practices and compliance status.[21] The platform implements Cloudflare DDoS protection and rate limiting for security.[20]
OpenRouter provides official support for numerous SDKs and frameworks.[22] The platform also offers a range of supporting tools for developers.[25]
OpenRouter operates in the rapidly growing LLM gateway and API aggregation market. As the market has matured, several alternatives have emerged, each with different strengths.[6][28]
The most straightforward alternative to OpenRouter is using model providers' APIs directly. Providers like OpenAI, Anthropic, Google, and Mistral AI all offer their own API endpoints. Direct access avoids any intermediary overhead or margin, but requires developers to maintain separate integrations, billing accounts, and failover logic for each provider. For teams that rely on only one or two models, direct access may be simpler. For teams working across many models, the integration burden grows quickly.
| Platform | Type | Models | Key Differentiator | Pricing |
|---|---|---|---|---|
| OpenRouter | Managed SaaS | 400+ | Largest model marketplace; unified billing; pass-through pricing | 5% platform fee; pay-as-you-go |
| Portkey | Managed SaaS | 1,600+ | Production observability, guardrails, governance, and prompt engineering management | Free plan; $49/month Pro |
| LiteLLM | Open-source / Self-hosted | 100+ | Full infrastructure control; self-hosted with no data leaving your servers | Free (open-source); custom enterprise |
| Eden AI | Managed SaaS | 500+ | Combines LLM routing with OCR, translation, speech, moderation, and other specialized AI | Pay-as-you-go |
| Helicone | Managed SaaS | 100+ | Observability-focused with session tracking and prompt management | Free hobby tier; $79/month Pro |
| Requesty | Managed SaaS | 400+ | Lightweight multi-provider routing with similar pass-through pricing model | Free plan; 5% markup |
| Kong AI Gateway | Self-hosted / Enterprise | Varies | Enterprise governance policies and existing Kong API gateway ecosystem integration | Custom enterprise pricing |
| Vercel AI Gateway | Managed SaaS | Varies | Optimized for the Next.js and Vercel ecosystem with edge deployment | $5 monthly credits; usage-based |
OpenRouter's primary competitive advantages include its large model catalog, the simplicity of its unified billing system, its established developer community of over 5 million users, and its reputation as a neutral marketplace without allegiance to any single model provider. Its public leaderboard and usage data also create a network effect: more developers using the platform generates more data about model quality, which attracts more developers.
LiteLLM appeals to teams that need self-hosted infrastructure for compliance or data sovereignty reasons, since all API keys stay on the user's own servers and requests go directly to providers. Portkey targets production engineering teams that need deeper observability, guardrails, and governance controls beyond what OpenRouter provides.
The period from late 2025 through early 2026 saw rapid evolution for both OpenRouter and the broader LLM ecosystem:
- The free model router openrouter/free provides a zero-cost entry point that intelligently selects compatible free models based on request requirements.[37]

According to company statements, OpenRouter's business model centers on a platform fee of approximately 5% of inference costs.[7]
Co-founder Chris Clark stated: "We believe that inference costs will eclipse salaries as the dominant operating expense for most knowledge-based companies over the next five to 10 years."[7]
The platform has facilitated significant market transparency in AI model pricing and performance. Market share data from August 2025 reveals Google (22.5%) and Anthropic (22.3%) as leading providers, followed by OpenAI and various open-source providers.[27]
As of early 2026, OpenRouter serves over 5 million developers worldwide.[7][29]
The platform has received endorsements from prominent figures in the AI community, including Andrej Karpathy, who highlighted its LLM rankings during a talk at Y Combinator Startup School on June 17, 2025.[7]
OpenRouter's stated principles emphasize a multi-model and multi-provider future for AI development.[3] The company highlights vendor neutrality, price transparency, and reliability as its key value propositions.