# Command R

> Source: https://aiwiki.ai/wiki/command_r
> Updated: 2026-06-23
> Categories: AI Companies, Enterprise AI, Large Language Models, Natural Language Processing
> License: CC BY 4.0 (https://creativecommons.org/licenses/by/4.0/)
> From AI Wiki (https://aiwiki.ai), the free encyclopedia of artificial intelligence. Reuse freely with attribution to "AI Wiki (aiwiki.ai)".

**Command R** is a family of enterprise [large language models](/wiki/large_language_model) from [Cohere](/wiki/cohere), launched in March 2024 and built specifically for [retrieval-augmented generation](/wiki/retrieval_augmented_generation) (RAG), multi-step [tool use](/wiki/tool_use), and grounded text generation with verifiable inline citations. Every model in the family has a 128,000-token context window, and the original releases were optimized for 10 languages. Its members are Command R (35 billion parameters), [Command R+](/wiki/command_r_plus) (104 billion parameters), and Command R7B (7 billion parameters), with refreshed checkpoints released in August 2024 and an Arabic-optimized R7B variant in February 2025.[^1][^2][^3] In March 2025 Cohere positioned [Command A](/wiki/command_a) (111 billion parameters, 256K context) as the successor product line, having already described Command R7B as the "smallest, fastest, and final model in our R family."[^4][^5][^23]

All open-weight releases in the family are distributed via [Hugging Face](/wiki/hugging_face) under the Creative Commons Attribution-NonCommercial 4.0 (CC-BY-NC 4.0) license for research and non-commercial use, while commercial access is offered through Cohere's hosted API and cloud platform partners. This article covers the Command R family of large language models. For the broader successor lineup including Command A, Command A Vision, Command A Reasoning, and Command A Translate, see [Command A](/wiki/command_a).

## Background: Cohere

Cohere was founded in 2019 by Aidan Gomez, Ivan Zhang, and [Nick Frosst](/wiki/nick_frosst), all University of Toronto alumni. While interning at [Google Brain](/wiki/google_brain), Gomez was one of the eight co-authors of the 2017 paper "[Attention Is All You Need](/wiki/attention)," which introduced the [Transformer](/wiki/transformer) architecture.[^6] Frosst also worked as a researcher at Google Brain before co-founding Cohere, and Zhang had previously collaborated with Gomez on research at FOR.ai. The company is headquartered in Toronto and San Francisco, with offices in Montreal, London, New York, Paris, and Seoul.[^7]

Unlike AI labs that target consumer chatbots, Cohere has pursued an enterprise-first strategy since its earliest days. Its models are available through Cohere's own API as well as through [Amazon Web Services](/wiki/amazon_web_services) (Bedrock, SageMaker), [Microsoft Azure](/wiki/azure_openai) (Azure AI Foundry), [Google Cloud](/wiki/google_cloud_terms) (Vertex AI), and Oracle Cloud Infrastructure (OCI [Generative AI](/wiki/generative_ai)).[^8] This cloud-agnostic deployment, combined with support for virtual private cloud (VPC) and on-premises deployments, distinguishes Cohere in the enterprise AI market. By September 2025 Cohere had raised approximately $1.6 billion in total funding, with the latest extension bringing its valuation to $7 billion alongside a deepened partnership with AMD.[^9]

Before the R series, Cohere offered general-purpose "Command" and "Command Light" models that lacked the specialized RAG, tool use, and citation capabilities of the R family. The Command R series represented a deliberate pivot toward optimizing models for specific enterprise workflows rather than competing on general chat benchmarks alone.

## When was Command R released?

Command R was announced on March 11, 2024, as Cohere's first model purpose-built for enterprise RAG workflows and tool use at scale.[^1][^10] Cohere's model card describes it tersely as "a research release of a 35 billion parameter highly performant generative model."[^2] With 35 billion parameters and a 128,000-token [context window](/wiki/context_window), Command R represented a significant shift in Cohere's model strategy: rather than chasing general-purpose chat benchmarks, the model was optimized for high-precision retrieval, grounded generation, and low-latency production deployments. Cohere framed it as part of an emerging "scalable" category of models that "balance high efficiency with strong accuracy, enabling companies to move beyond proof of concept and into production."[^1][^13]

### Architecture and training

Command R uses an optimized autoregressive [transformer](/wiki/transformer) decoder-only architecture. The model was pretrained on a large multilingual corpus and then refined through supervised [fine-tuning](/wiki/fine_tuning) (SFT) and preference training to align its behavior with human expectations for helpfulness and safety.[^2] It accepts text input and produces text output, using a proprietary chat format with special tokens delineating turns, system prompts, and tool interactions. The model implements [grouped query attention](/wiki/grouped_query_attention) (GQA) to reduce key-value cache memory at long context lengths, which is particularly important for RAG workloads that fill the 128K window with retrieved passages.[^11]
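
To make the memory argument for grouped query attention concrete, the key-value cache size can be estimated from a handful of hyperparameters. The sketch below uses illustrative values (40 layers, 64 query heads, 8 key-value heads, head dimension 128, 2-byte values). These numbers are assumptions chosen for the arithmetic, not Command R's published configuration.

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_value: int = 2) -> int:
    """Bytes needed to cache keys and values for one sequence."""
    # Factor of 2: one tensor for keys and one for values in every layer.
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_value


# Hypothetical hyperparameters, for illustration only.
LAYERS, QUERY_HEADS, KV_HEADS, HEAD_DIM = 40, 64, 8, 128
CONTEXT = 128_000  # the full context window, in tokens

# Standard multi-head attention: every query head keeps its own K/V.
mha = kv_cache_bytes(LAYERS, QUERY_HEADS, HEAD_DIM, CONTEXT)
# Grouped query attention: the 64 query heads share 8 K/V heads.
gqa = kv_cache_bytes(LAYERS, KV_HEADS, HEAD_DIM, CONTEXT)

print(f"MHA cache: {mha / 1e9:.1f} GB")
print(f"GQA cache: {gqa / 1e9:.1f} GB")
print(f"reduction: {mha // gqa}x")
```

Under these assumptions the cache shrinks by the ratio of query heads to key-value heads, which is why the saving grows in absolute terms as the context fills up.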

### How many languages does Command R support?

Command R was optimized for strong performance across 10 key languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. An additional 13 languages were included in the pretraining data with lower optimization priority: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.[^2]

Cohere's tokenizer plays an important role in multilingual performance. Unlike many tokenizers that are heavily English-centric, Cohere designed its tokenizer for cross-lingual efficiency. In a comparison published by Sebastian Ruder, the Cohere tokenizer produced roughly 1.67 times fewer tokens than [OpenAI](/wiki/openai)'s tokenizer for equivalent Japanese text.[^12] This efficiency directly reduces costs (since API pricing is per token) and allows more content to fit within the context window for non-English users.
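
As a rough illustration of what a 1.67x token reduction means in practice, the following sketch converts a token count under a baseline tokenizer into the count under a more efficient one. The 100,000-token document is a hypothetical input, and the ratio is applied uniformly, which real text will only approximate.

```python
import math

RATIO = 1.67  # ratio reported for Japanese text in the comparison cited above


def tokens_with_efficient_tokenizer(baseline_tokens: int, ratio: float = RATIO) -> int:
    """Estimate the token count when the tokenizer emits `ratio` times fewer tokens."""
    return math.ceil(baseline_tokens / ratio)


baseline = 100_000  # hypothetical document size in baseline-tokenizer tokens
efficient = tokens_with_efficient_tokenizer(baseline)
savings = 1 - efficient / baseline

print(f"{efficient} tokens, about {savings:.0%} fewer")
```

With per-token pricing, the same fraction carries over to the bill, and the freed tokens become room for additional retrieved passages.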

### Key capabilities

Command R introduced several capabilities that became hallmarks of the entire model family:[^2][^13]

- **Grounded generation with citations.** The model can generate responses that include fine-grained, inline citations referencing specific passages from provided source documents. This feature is central to enterprise RAG use cases where traceability and auditability matter.
- **Single-step tool use (function calling).** Given a list of available tools and their schemas, Command R selects the appropriate tool and generates the correct parameters in JSON. The model includes a built-in `directly_answer` tool that allows it to abstain from calling external tools when the query can be answered from its own knowledge.
- **Multi-step tool use (agents).** The model supports iterative Action, Observation, and Reflection cycles, allowing it to chain multiple tool calls across several steps to accomplish complex tasks.
- **Code generation.** Command R handles writing, explaining, and translating code across programming languages. Cohere recommends low temperature or greedy decoding settings for code generation tasks.

### Quantization

To support deployment on consumer-grade hardware, Cohere released quantized versions of Command R on Hugging Face. An 8-bit quantized version using bitsandbytes is available in the main repository, and a separate 4-bit quantized version is hosted at `CohereLabs/c4ai-command-r-v01-4bit`.[^2] These quantized versions make it practical to run the 35-billion-parameter model on hardware with limited GPU memory at the cost of some precision.
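
The effect of quantization on memory can be approximated with simple arithmetic. This sketch counts weight storage only; it ignores activations, the key-value cache, and the small overhead that quantization formats add for scales and zero points, so real requirements will be somewhat higher.

```python
PARAMS = 35_000_000_000  # Command R parameter count from the model card


def weight_memory_gb(num_params: int, bits_per_param: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


for label, bits in [("16-bit", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS, bits):.1f} GB")
```

Halving the bits per parameter halves the weight footprint, which is what moves a 35-billion-parameter model from multi-GPU territory toward a single large accelerator.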

### Launch pricing

At release, Command R was priced at $0.50 per million input tokens and $1.50 per million output tokens via Cohere's hosted API, with the August 2024 refresh reducing prices to $0.15/$0.60.[^14]

## How does Command R+ differ from Command R?

Command R+ was released on April 4, 2024 (first on Microsoft Azure, then via Cohere's API and other clouds) as Cohere's flagship large language model.[^15][^16] At 104 billion parameters with a 128,000-token context window, it offered stronger reasoning, improved multilingual performance, and better results on tool use benchmarks compared to the smaller Command R.[^3] Cohere positioned it as "the market leader in the emerging scalable category" of business-grade models, with the model card emphasizing that "the tool use in this model generation enables multi-step tool use which allows the model to combine multiple tools over multiple steps to accomplish difficult tasks."[^3][^15]

### Performance

On the Hugging Face Open LLM Leaderboard, Command R+ reported the following scores:[^3]

| Benchmark | Command R+ | DBRX Instruct | Mixtral 8x7B |
|---|---|---|---|
| ARC-Challenge | 70.99 | 68.9 | 70.1 |
| HellaSwag | 88.6 | 89.0 | 87.6 |
| [MMLU](/wiki/mmlu) | 75.7 | 73.7 | 71.4 |
| TruthfulQA | 56.3 | 66.9 | 65.0 |
| [Winogrande](/wiki/winogrande) | 85.4 | 81.8 | 81.1 |
| [GSM8K](/wiki/gsm8k) | 70.7 | 66.9 | 61.1 |
| **Average** | **74.6** | **74.5** | **72.7** |

Beyond standard academic benchmarks, Cohere highlighted Command R+'s performance on enterprise-relevant tasks. According to Cohere's internal evaluations, Command R+ outperformed [GPT-4](/wiki/gpt-4) Turbo on the ToolTalk (Hard) benchmark for conversational tool use and on the Berkeley Function Calling Leaderboard (BFCL) for single-turn function calling.[^15] In RAG citation fidelity, Command R+ surpassed GPT-4 Turbo in human evaluation. On multi-hop question answering benchmarks like HotpotQA, Bamboogle, and StrategyQA, it outperformed [Claude](/wiki/claude) 3 Sonnet and [Mistral](/wiki/mistral) Large.

On the [Chatbot Arena](/wiki/lmsys_chatbot_arena) leaderboard captured on April 9, 2024, Command R+ ranked sixth overall and was the highest-ranked non-proprietary model at the time, outperforming some earlier versions of GPT-4.[^17] For translation tasks evaluated on FLoRES and WMT23, Command R+ was competitive with GPT-4 Turbo across French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese. In the Chinese Chatbot Arena, Command R+ ranked third behind GPT-4 and Claude 3 Opus, both of which cost two to three times more.[^12]

Command R+ was named one of TIME Magazine's Best Inventions of 2024.[^18]

### Architecture

Like Command R, Command R+ uses an optimized autoregressive decoder-only [transformer](/wiki/transformer) trained with supervised fine-tuning and preference training, and it also relies on grouped query attention.[^11] It supports the same 10 primary languages and 13 additional pretraining languages as Command R, and its training data extended through February 2023, which Cohere recommends pairing with RAG to handle later facts.[^19] A 4-bit quantized version was released at `CohereLabs/c4ai-command-r-plus-4bit` for deployment on more constrained hardware.[^3]

### Launch pricing

At release, Command R+ was priced at $3.00 per million input tokens and $15.00 per million output tokens via Cohere's hosted API.[^15] The August 2024 refresh later lowered these to $2.50/$10.00, matching GPT-4o.[^14]

## What changed in the August 2024 refresh?

On August 30, 2024, Cohere released updated versions of both models: `command-r-08-2024` and `command-r-plus-08-2024`.[^20][^21] The refreshed Command R delivered roughly 50% higher throughput and 20% lower latencies while cutting the required hardware footprint by half compared to the previous version. The refreshed Command R+ delivered roughly 50% higher throughput and 25% lower latencies on the same hardware footprint. Key improvements applied to both refreshed models included:[^21]

- Better tool selection and decision-making about when to use tools versus answering directly
- Improved instruction following from system prompts (preambles)
- Enhanced multilingual RAG with better search query generation in the user's language
- Improved structured data analysis and creation from natural language instructions
- Greater robustness to non-semantic prompt changes such as extra whitespace or newlines
- Higher citation quality, with the option to disable citations for certain RAG workflows
- Introduction of Safety Modes (beta) for controlling model safety behavior at STRICT or CONTEXTUAL levels

API pricing for Command R+ dropped from $3.00 to $2.50 per million input tokens and from $15.00 to $10.00 per million output tokens with the refresh.[^14]

## Command R7B (December 2024)

Command R7B was announced on December 13, 2024, as the smallest and fastest model in the Command R family.[^22][^23] At 7 billion parameters (8 billion in BF16 weights on disk) with a 128,000-token context window and a maximum output of 4,000 tokens, it was designed for high-throughput, latency-sensitive applications like chatbots and code assistants, and for on-device inference scenarios where larger models are impractical.[^24] The model's knowledge cutoff date is June 1, 2024.

### Architecture

Command R7B introduced an architectural refinement shared with the later Command A model. The transformer interleaves three layers of sliding window [attention](/wiki/attention) (window size of 4,096 tokens) with [Rotary Position Embedding](/wiki/rotary_position_embedding) ([RoPE](/wiki/positional_encoding)) and one layer of global attention without positional embeddings. This hybrid attention design balances the efficiency of local attention with the ability to attend to distant context when needed.[^24]
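
The interleaving described above can be sketched as a layer schedule plus a visibility rule. The 3:1 ratio and the 4,096-token window come from the model card; the layer count of 8 and the exact boundary convention (the window counts the current token) are assumptions made for this illustration.

```python
from typing import List


def layer_pattern(num_layers: int, local_per_global: int = 3) -> List[str]:
    """Repeat `local_per_global` sliding-window layers followed by one global layer."""
    period = local_per_global + 1
    return ["global" if (i + 1) % period == 0 else "sliding_window"
            for i in range(num_layers)]


def can_attend(query_pos: int, key_pos: int, kind: str, window: int = 4096) -> bool:
    """Causal visibility rule for a single (query, key) pair of token positions."""
    if key_pos > query_pos:
        return False  # causal masking: a token never looks ahead
    if kind == "global":
        return True   # full causal attention over the whole prefix
    return query_pos - key_pos < window  # only the most recent `window` tokens


pattern = layer_pattern(8)  # hypothetical depth, chosen to show two full periods
print(pattern)
print(can_attend(5000, 0, "sliding_window"))  # token 0 is outside the local window
print(can_attend(5000, 0, "global"))          # the global layer still sees it
```

In this arrangement most layers pay only for a fixed-size window, while the periodic global layers carry information across the full context.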

### Language support

Command R7B expanded fully supported languages to 23: the original 10 plus Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.[^24]

### Performance

On the Hugging Face Open LLM Leaderboard (v2), Command R7B ranked first on average among similarly sized open-weight models:[^24]

| Benchmark | Command R7B | Gemma 2 IT 9B | [Llama](/wiki/llama) 3.1 8B | Qwen 2.5 7B |
|---|---|---|---|---|
| IFEval | 77.9 | 74.4 | 78.6 | 75.85 |
| BBH | 36.1 | 42.1 | 29.9 | 34.89 |
| MATH (Hard) | 26.4 | 0.2 | 19.3 | 0.0 |
| GPQA | 7.7 | 14.8 | 2.4 | 5.48 |
| MUSR | 11.6 | 9.74 | 8.41 | 8.45 |
| MMLU-Pro | 28.5 | 32.0 | 30.7 | 36.52 |
| **Average** | **31.4** | **28.9** | **28.2** | **26.87** |

Command R7B scored notably well on [MATH](/wiki/math) (Hard), achieving 26.4 compared to near-zero scores from [Gemma](/wiki/gemma) 2 IT 9B and [Qwen](/wiki/qwen) 2.5 7B. Cohere described Command R7B as "the smallest, fastest, and final model in our R family," marking the conclusion of the R product line.[^23] API pricing was set at $0.0375 per million input tokens and $0.15 per million output tokens.[^14]
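
The averages in the table above are plain means over the six benchmarks and can be reproduced directly from the listed scores. The snippet is only a consistency check on the published numbers, not a re-run of the evaluations.

```python
# Scores in table order: IFEval, BBH, MATH (Hard), GPQA, MUSR, MMLU-Pro.
scores = {
    "Command R7B":   [77.9, 36.1, 26.4, 7.7, 11.6, 28.5],
    "Gemma 2 IT 9B": [74.4, 42.1, 0.2, 14.8, 9.74, 32.0],
    "Llama 3.1 8B":  [78.6, 29.9, 19.3, 2.4, 8.41, 30.7],
    "Qwen 2.5 7B":   [75.85, 34.89, 0.0, 5.48, 8.45, 36.52],
}

averages = {model: sum(vals) / len(vals) for model, vals in scores.items()}
best = max(averages, key=averages.get)

for model, avg in averages.items():
    print(f"{model}: {avg:.2f}")
print(f"highest average: {best}")
```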

## Command R7B Arabic (February 2025)

On February 27, 2025, Cohere released Command R7B Arabic (`c4ai-command-r7b-arabic-02-2025`), an 8-billion-parameter variant optimized for Modern Standard Arabic in addition to English, with the same 128,000-token context window as the base R7B model.[^25][^26] The model targets enterprises in the MENA region and emphasizes instruction following, length control, RAG, and minimized code-switching between Arabic and English. Like other R-series releases, the weights were published on Hugging Face under CC-BY-NC 4.0.

## How do grounded generation and citations work?

One of the most distinctive features of the Command R family is its built-in support for grounded generation with citations. Unlike many competing models that generate text without indicating where specific claims originate, Command R models are trained to produce fine-grained citations alongside their output. This capability is not a post-processing step or a plugin; it is trained directly into the model weights.[^2][^3]

### How it works

The grounded generation pipeline operates in conjunction with RAG. When a user submits a query:

1. The model generates an optimal set of search queries based on the user's input. Depending on the complexity of the request, it may generate zero, one, or multiple search queries.
2. A retrieval tool fetches relevant document passages (chunks) from an external data source. Cohere recommends document chunks of 100 to 400 words for optimal results.
3. The model generates a response using the retrieved passages, inserting inline citations that map specific spans of the response text to specific passages in the source documents.
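
The chunk-size guidance in step 2 can be approximated with a simple word-based splitter. This is a minimal sketch, not Cohere's preprocessing: the function name, the 300-word target, and the merge rule for short tails are all assumptions, and production pipelines usually also respect sentence or section boundaries.

```python
from typing import List


def chunk_words(text: str, target: int = 300, min_words: int = 100) -> List[str]:
    """Split text into consecutive chunks of about `target` words each.

    A trailing chunk shorter than `min_words` is merged into the previous one,
    so with the defaults every chunk has between 100 and 399 words whenever the
    input has at least 100 words.
    """
    words = text.split()
    chunks = [words[i:i + target] for i in range(0, len(words), target)]
    if len(chunks) > 1 and len(chunks[-1]) < min_words:
        tail = chunks.pop()
        chunks[-1].extend(tail)
    return [" ".join(chunk) for chunk in chunks]


document = " ".join(["word"] * 650)  # placeholder text with 650 words
sizes = [len(chunk.split()) for chunk in chunk_words(document)]
print(sizes)
```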

Cohere offers two citation modes:[^2]

| Mode | Description | Use Case |
|---|---|---|
| Accurate | The model first generates the complete response, then produces citations mapped to specific segments of the text | Applications where citation precision matters most |
| Fast | Citations are emitted inline as the response is produced, injected at the moment the model references a source | Streaming applications where low latency is important |

The citation system enables users and automated systems to verify claims, trace information back to source documents, and identify when the model may be generating content not grounded in the provided sources. This is especially valuable in regulated industries like finance, healthcare, and legal services where factual accuracy is non-negotiable. For enterprises, inline citations reduce the risk that a [hallucination](/wiki/hallucination) goes unnoticed, because every claim in the output can be audited against its source.
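
One way to picture a span-level citation is as a character range in the response paired with the identifiers of the supporting documents. The `Citation` class, the `cite` and `annotate` helpers, and the document IDs below are hypothetical names for illustration; they are not the response schema of Cohere's API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Citation:
    start: int               # character offset where the cited span begins
    end: int                 # character offset one past the end of the span
    document_ids: List[str]  # documents that support the span


def cite(text: str, phrase: str, document_ids: List[str]) -> Citation:
    """Build a citation for the first occurrence of `phrase` in `text`."""
    start = text.index(phrase)  # raises ValueError if the phrase is absent
    return Citation(start, start + len(phrase), document_ids)


def annotate(text: str, citations: List[Citation]) -> str:
    """Insert a bracketed list of document IDs after each cited span."""
    out = text
    # Work right to left so earlier offsets stay valid after each insertion.
    for c in sorted(citations, key=lambda c: c.end, reverse=True):
        marker = " [" + ", ".join(c.document_ids) + "]"
        out = out[:c.end] + marker + out[c.end:]
    return out


answer = "Command R has 35 billion parameters and a 128K context window."
citations = [
    cite(answer, "35 billion parameters", ["doc_0"]),
    cite(answer, "128K context window", ["doc_0", "doc_1"]),
]
print(annotate(answer, citations))
```

Any part of the response that falls outside every cited range is, by construction, a candidate for review as ungrounded text.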

## Multi-Step Tool Use

Command R models support multi-step tool use, allowing them to function as autonomous agents that plan and execute sequences of actions using multiple external tools. These capabilities were trained into the models through a mixture of supervised fine-tuning and preference fine-tuning using a specific prompt template.[^2]

### Single-step versus multi-step

In single-step tool use, the model receives a user query along with a list of available tools (defined by their names, descriptions, and parameter schemas). The model selects the appropriate tool and generates the required parameters in JSON; the calling application then executes the tool and passes the result back. This covers straightforward function-calling scenarios such as looking up information from a database or calling a calculator.
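
The application side of a single-step call can be sketched as a small dispatcher: parse the JSON the model produced, check it against the declared tools, and run the matching function. The tool registry, the `tool_name` and `parameters` keys, and the hard-coded `model_output` string are assumptions standing in for a real model response.

```python
import json
from typing import Any, Dict

# Hypothetical tool registry: description, expected parameters, implementation.
TOOLS: Dict[str, Dict[str, Any]] = {
    "calculator": {
        "description": "Apply a basic arithmetic operation to two numbers.",
        "parameters": {"a": float, "b": float, "op": str},
        "fn": lambda a, b, op: {"add": a + b, "mul": a * b}[op],
    },
}


def dispatch(tool_call_json: str) -> Any:
    """Validate a JSON tool call and execute the corresponding function."""
    call = json.loads(tool_call_json)
    spec = TOOLS.get(call["tool_name"])
    if spec is None:
        raise ValueError(f"unknown tool: {call['tool_name']}")
    params = call["parameters"]
    missing = set(spec["parameters"]) - set(params)
    if missing:
        raise ValueError(f"missing parameters: {sorted(missing)}")
    return spec["fn"](**params)


# Stand-in for the model's output; a real system would receive this from the LLM.
model_output = '{"tool_name": "calculator", "parameters": {"a": 6, "b": 7, "op": "mul"}}'
result = dispatch(model_output)
print(result)
```

Validating before executing matters because the parameters are generated text: a missing or misspelled field should surface as a clear error rather than a crash inside the tool.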

Multi-step tool use extends this by allowing the model to perform several inference cycles in a loop:

1. **Action.** The model decides which tool(s) to call and generates the parameters.
2. **Observation.** The tool returns its results to the model.
3. **Reflection.** The model evaluates the results and decides whether to call another tool, retry a failed call, or generate a final response.

This cycle repeats until the model determines it has enough information to answer the user's question. The model can call multiple tools in parallel when the calls are independent, and it can self-correct when a tool call fails, making multiple attempts to accomplish the task. The ability to recover from errors and retry with different parameters increases the overall success rate of agentic workflows.

For example, if a user asks "What is the current temperature in the capital of Brazil?", the model first calls a geographic lookup tool to determine that the capital of Brazil is Brasilia, then calls a weather API to retrieve the temperature in Brasilia, and finally combines both results into a coherent answer.
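
The Brazil example can be traced through a toy agent loop. Everything here is a stand-in: `scripted_model` replaces the language model with fixed decisions, and both tools return canned values (the temperature is made up), so the sketch shows only the control flow of the Action, Observation, and Reflection cycle.

```python
from typing import Callable, Dict, List, Optional, Tuple


def capital_of(country: str) -> str:
    return {"Brazil": "Brasilia"}[country]  # canned geographic lookup


def current_temperature(city: str) -> str:
    return {"Brasilia": "27 C"}[city]  # canned value, not real weather data


TOOLS: Dict[str, Callable[[str], str]] = {
    "capital_of": capital_of,
    "current_temperature": current_temperature,
}


def scripted_model(question: str, observations: List[str]) -> Optional[Tuple[str, str]]:
    """Stand-in for the LLM: return (tool_name, argument), or None when ready to answer."""
    if not observations:
        return ("capital_of", "Brazil")
    if len(observations) == 1:
        return ("current_temperature", observations[0])
    return None


def run_agent(question: str, max_steps: int = 5) -> str:
    observations: List[str] = []
    for _ in range(max_steps):
        action = scripted_model(question, observations)  # Action
        if action is None:                               # Reflection: enough information
            break
        name, argument = action
        observations.append(TOOLS[name](argument))       # Observation
    if len(observations) != 2:
        raise RuntimeError("agent stopped before gathering both facts")
    city, temperature = observations
    return f"The current temperature in {city} is {temperature}."


print(run_agent("What is the current temperature in the capital of Brazil?"))
```

The `max_steps` bound is a common safeguard in agent loops: it guarantees termination even if the model keeps requesting tools.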

## Model Comparison

The following table summarizes the key specifications and pricing across the Command R family and the immediate successor:

| Model | Release Date | Parameters | Context Window | Input Price ($/1M tokens) | Output Price ($/1M tokens) | Open Weights |
|---|---|---|---|---|---|---|
| Command R | March 11, 2024 | 35B | 128K | $0.15 (08-2024 refresh) | $0.60 (08-2024 refresh) | Yes (CC-BY-NC) |
| Command R+ | April 4, 2024 | 104B | 128K | $2.50 (08-2024 refresh) | $10.00 (08-2024 refresh) | Yes (CC-BY-NC) |
| Command R7B | December 13, 2024 | 7B | 128K | $0.0375 | $0.15 | Yes (CC-BY-NC) |
| Command R7B Arabic | February 27, 2025 | 8B | 128K | n/a (research) | n/a (research) | Yes (CC-BY-NC) |
| Command A (successor) | March 13, 2025 | 111B | 256K | $2.50 | $10.00 | Yes (CC-BY-NC) |

Sources: Cohere model cards and pricing pages.[^2][^3][^14][^24][^25]
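
To compare the models on a concrete workload, the per-million-token prices from the table can be turned into a per-request cost. The request size below (100,000 input tokens, 1,000 output tokens) is a hypothetical long-context RAG call, and the dictionary keys are display names rather than API model identifiers.

```python
# USD per 1M tokens as (input, output), taken from the comparison table.
PRICES = {
    "Command R": (0.15, 0.60),
    "Command R+": (2.50, 10.00),
    "Command R7B": (0.0375, 0.15),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of a single request at the listed per-token prices."""
    input_price, output_price = PRICES[model]
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000


for model in PRICES:
    print(f"{model}: ${request_cost(model, 100_000, 1_000):.4f}")
```

For retrieval-heavy requests like this one, input tokens dominate the bill, so the input price is usually the figure to compare first.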

## Is Command R open source?

All Command R family models are released as open-weight research releases under the Creative Commons Attribution-NonCommercial 4.0 (CC-BY-NC 4.0) license, along with Cohere's Acceptable Use Policy.[^2][^3][^24] This means:

- **Research and non-commercial use** is permitted freely. Researchers, students, and hobbyists can download the weights from Hugging Face and run the models locally.
- **Commercial use** of the open weights is not permitted under this license. Organizations that want to use the models commercially must either access them through Cohere's API (with standard per-token pricing) or negotiate a separate commercial licensing agreement.
- **Cloud deployments** through partners like AWS Bedrock, Azure, Google Cloud, and Oracle Cloud come with their own commercial terms.

This licensing strategy allows Cohere to benefit from community feedback and academic research while maintaining control over commercial revenue. It contrasts with broadly permissive open-weight releases like [Meta](/wiki/meta_ai)'s Llama family (under the Llama Community License) and with fully proprietary models such as OpenAI's GPT-4 and [Anthropic](/wiki/anthropic)'s Claude. The distinction between "open weights" and "open source" is important: while the model weights are publicly available, the training data, training code, and full reproduction recipe are not released.

## Successors: Command A and the Command A Family

In March 2025, Cohere released [Command A](/wiki/command_a) (`c4ai-command-a-03-2025`) as the successor to the Command R series and the company's most capable model at the time of release.[^4][^5] With 111 billion parameters and a 256,000-token context window (double the context length of Command R+), it represented a generational leap in both performance and efficiency. A 55-page technical report co-authored by 228 Cohere contributors was published on arXiv as 2504.00698 in April 2025.[^27]

Key characteristics of Command A:

- Hybrid attention architecture combining sliding window attention (window size 4,096) with global attention layers, scaled up from the Command R7B design.[^28]
- Decentralized training pipeline with self-refinement and model merging.[^27]
- Hardware efficiency: despite 111 billion parameters the model runs on just two GPUs (A100 or H100), with Cohere reporting 150% higher throughput than Command R+ 08-2024.[^5]
- Token streaming for 100K-context requests reaching 73 tokens per second, compared with 38 tokens per second for [GPT-4o](/wiki/gpt4) and 32 tokens per second for [DeepSeek](/wiki/deepseek)-V3 according to Cohere's benchmarks.[^28]
- 23-language support (matching the R7B set) with improved handling of Arabic dialects.[^5]
- Open weights on Hugging Face under CC-BY-NC 4.0.[^29]

Cohere subsequently extended the Command A family with specialized variants:

| Model | Release | Context | Description |
|---|---|---|---|
| Command A Vision | July 31, 2025 | 128K | 112B-parameter multimodal model combining a SigLIP2 vision encoder with the Command A text tower; supports up to 20 images per request[^30] |
| Command A Reasoning | August 21, 2025 | 256K | 111B-parameter reasoning model designed to "think" before responding for customer service and complex enterprise tasks[^31] |
| Command A Translate | August 28, 2025 | 16K (8K in + 8K out) | 111B-parameter translation model covering 23 languages, with an agentic "Deep Translation" multi-step refinement workflow[^32] |

These variants share architectural lineage with Command A and are out of scope for this article. For details, see [Command A](/wiki/command_a).

## Enterprise Deployment Options

Cohere offers several deployment tiers for the Command R (and successor) model families:[^8]

| Deployment Option | Description |
|---|---|
| Cohere API (SaaS) | Managed API with per-token pricing; simplest integration path |
| Cloud AI Platforms | Access through AWS Bedrock/SageMaker, Azure AI Foundry, Google Vertex AI, Oracle OCI |
| Virtual Private Cloud (VPC) | Models deployed within the customer's own cloud environment for data isolation |
| On-Premises | Full deployment on customer-owned hardware for maximum control |

This flexibility is a core part of Cohere's enterprise pitch. Organizations in highly regulated industries (banking, healthcare, government) often require that data never leaves their controlled environment, and Cohere's deployment model accommodates this requirement. In July 2025, Cohere announced a partnership with Bell Canada to provide AI services to government and enterprise customers, with Bell Canada deploying Cohere's technology on its own data center infrastructure.

## Complementary Products

The Command R models work in conjunction with other Cohere products to form a complete enterprise AI stack:

- **Embed models.** Cohere's embedding models (Embed v4.0, Embed English v3.0, Embed Multilingual v3.0) convert text, images, and PDFs into dense vector representations. These [embeddings](/wiki/embeddings) power semantic search and retrieval pipelines that feed documents into Command R's RAG workflows.
- **Rerank models.** Cohere's reranking models (Rerank v3.5 and the v4.0 Pro/Fast generation) take a set of retrieved documents and reorder them by relevance to the query. Using a reranker between retrieval and generation improves the quality of RAG outputs by prioritizing the most relevant passages.
- **Aya models.** The Aya series (Aya Expanse 8B/32B in October 2024 and Aya Vision in March 2025) provides multilingual text generation and multimodal capabilities, with particular emphasis on underserved languages.
- **North platform.** In January 2025 Cohere launched North, a turnkey AI workspace that packages Command models, retrieval, and integrations into a ready-to-deploy enterprise application; early-access customers include RBC, Dell, LG, Ensemble Health Partners, and Palantir.

## Legacy and Current Status

As of 2026, Command A and its specialized variants are Cohere's flagship enterprise offerings, with the Command R family officially in maintenance and superseded for most new deployments. The Cohere blog and changelog explicitly describe Command R7B as the final model in the R series, and Command A as the next generation.[^23][^4]

On September 15, 2025, Cohere formally deprecated the original launch checkpoints `command-r-03-2024` and `command-r-plus-04-2024` (along with the legacy `command` and `command-light` models), directing users to migrate to the August 2024 refreshes (`command-r-08-2024`, `command-r-plus-08-2024`) or to `command-a-03-2025`, which Cohere describes as "our strongest model across all the domains for which the legacy models were used."[^33] The retired checkpoints remained available to existing users through a deprecation window, with third-party platforms reporting an April 2026 removal date.[^33][^34]
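
For codebases that still reference the retired checkpoints, the migration path described above can be captured in a small lookup. The `resolve_model` helper is a hypothetical convenience, not part of any Cohere SDK, and it covers only the two R-series identifiers named in the deprecation notice.

```python
# Deprecated checkpoint -> August 2024 refresh, per the deprecation notice.
REPLACEMENTS = {
    "command-r-03-2024": "command-r-08-2024",
    "command-r-plus-04-2024": "command-r-plus-08-2024",
}


def resolve_model(model_id: str, prefer_command_a: bool = False) -> str:
    """Map a deprecated model ID to a supported one; pass other IDs through."""
    if model_id in REPLACEMENTS:
        return "command-a-03-2025" if prefer_command_a else REPLACEMENTS[model_id]
    return model_id


print(resolve_model("command-r-03-2024"))
print(resolve_model("command-r-plus-04-2024", prefer_command_a=True))
print(resolve_model("command-r-08-2024"))
```

Routing every model reference through one function like this keeps a future deprecation to a one-line change.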

Despite the supersession, Command R and Command R+ retain significance for several reasons:

- **First mainstream "RAG-first" LLM family.** While most LLM providers added RAG capabilities as an afterthought, the original Command R was architected and trained specifically for retrieval-augmented workflows with built-in citation generation, helping normalize grounded generation as a product feature across the industry.
- **Open-weight enterprise model.** The CC-BY-NC release allowed Command R+ to become one of the most-downloaded open-weight LLMs of 2024 and the highest-ranked non-proprietary entry on the LMSYS Chatbot Arena leaderboard for several weeks following its launch.[^17]
- **Influence on architecture.** The hybrid sliding-window + global attention design introduced with Command R7B was carried forward into Command A and became part of Cohere's published architectural identity.[^27]
- **Multilingual tokenization.** The Cohere tokenizer's efficiency on non-English text remains a frequently cited advantage and is now standard across the Command A family.[^12]

For practical deployments today, Cohere generally recommends Command A (or Command A Reasoning for advanced agentic workflows, Command A Vision for multimodal use cases, and Command A Translate for translation), but the August 2024 Command R refreshes continue to be available via API, and the open weights for the whole R family remain downloadable on Hugging Face for projects that prefer their smaller footprint or that have already standardized on them.

## See also

- [Cohere](/wiki/cohere) - Company background and broader product portfolio
- [Command R+](/wiki/command_r_plus) - Detailed article on the 104B flagship of the R series
- [Command A](/wiki/command_a) - Successor family of models from March 2025 onward
- [Retrieval-augmented generation](/wiki/retrieval_augmented_generation) - The technique that the Command R family was designed around
- [Tool use](/wiki/tool_use) - Function calling and agentic loops
- [Grouped query attention](/wiki/grouped_query_attention) - Attention optimization used in Command R and Command R+
- [Hugging Face](/wiki/hugging_face) - Primary distribution channel for the open-weight releases
- [Large language model](/wiki/large_language_model) - General overview of LLM architectures
- [Transformer](/wiki/transformer) - Underlying architecture introduced in the 2017 "Attention Is All You Need" paper co-authored by Cohere CEO Aidan Gomez

## References

[^1]: Gomez, Aidan. "Command R: Retrieval-Augmented Generation at Production Scale." Cohere Blog, March 11, 2024. https://cohere.com/blog/command-r

[^2]: Cohere Labs. "c4ai-command-r-v01 (Model Card)." Hugging Face. https://huggingface.co/CohereLabs/c4ai-command-r-v01

[^3]: Cohere Labs. "c4ai-command-r-plus (Model Card)." Hugging Face. https://huggingface.co/CohereLabs/c4ai-command-r-plus

[^4]: Cohere. "Announcing Command A." Cohere Changelog, March 13, 2025. https://docs.cohere.com/v2/changelog/command-a

[^5]: Cohere. "Introducing Command A: Max performance, minimal compute." Cohere Blog, March 13, 2025. https://cohere.com/blog/command-a

[^6]: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L., and Polosukhin, I. "Attention Is All You Need." *Advances in Neural Information Processing Systems*, 30 (2017). https://arxiv.org/abs/1706.03762

[^7]: Wikipedia contributors. "Cohere." Wikipedia. https://en.wikipedia.org/wiki/Cohere

[^8]: Cohere. "Deployment Options." Cohere. https://cohere.com/deployment-options

[^9]: Maxwell, Tom. "Cohere hits $7B valuation a month after its last raise, partners with AMD." TechCrunch, September 24, 2025. https://techcrunch.com/2025/09/24/cohere-hits-7b-valuation-a-month-after-its-last-raise-partners-with-amd/

[^10]: Franzen, Carl. "Cohere releases powerful 'Command-R' language model for enterprise use." VentureBeat, March 11, 2024. https://venturebeat.com/ai/cohere-releases-powerful-command-r-language-model-for-enterprise-use/

[^11]: Cohere Labs. "c4ai-command-r-plus-08-2024 (Model Card)." Hugging Face. https://huggingface.co/CohereLabs/c4ai-command-r-plus-08-2024

[^12]: Ruder, Sebastian. "Command R+." NLP News, April 2024. https://www.ruder.io/command-r/

[^13]: Cohere. "Cohere's Command R Model." Cohere Documentation. https://docs.cohere.com/docs/command-r

[^14]: Cohere. "Pricing." Cohere. https://cohere.com/pricing

[^15]: Gomez, Aidan. "Introducing Command R+: A Scalable LLM Built for Business." Cohere Blog, April 4, 2024. https://cohere.com/blog/command-r-plus-microsoft-azure

[^16]: Franzen, Carl. "Cohere launches Command R+, a powerful enterprise LLM that beats GPT-4 Turbo." VentureBeat, April 4, 2024. https://venturebeat.com/ai/cohere-launches-command-r-a-powerful-llm-optimized-for-enterprise-ai/

[^17]: Willison, Simon. "Command R+ now ranked 6th on the LMSYS Chatbot Arena." Simon Willison's Weblog, April 9, 2024. https://simonwillison.net/2024/Apr/9/command-r/

[^18]: TIME. "Cohere Command R+." The 200 Best Inventions of 2024. https://time.com/collections/best-inventions-2024/7094931/cohese-command-r/

[^19]: Cohere. "Cohere's Command R+ Model." Cohere Documentation. https://docs.cohere.com/docs/command-r-plus

[^20]: Cohere. "Updates to the Command R Series." Cohere Blog, August 30, 2024. https://cohere.com/blog/command-series-0824

[^21]: Cohere. "Command models get an August refresh." Cohere Changelog, August 30, 2024. https://docs.cohere.com/changelog/command-gets-refreshed

[^22]: Cohere. "Introducing Command R7B: Fast and efficient generative AI." Cohere Blog, December 13, 2024. https://cohere.com/blog/command-r7b

[^23]: Cohere. "Announcing Command R7B." Cohere Changelog, December 13, 2024. https://docs.cohere.com/v2/changelog/command-r-7b

[^24]: Cohere Labs. "c4ai-command-r7b-12-2024 (Model Card)." Hugging Face. https://huggingface.co/CohereLabs/c4ai-command-r7b-12-2024

[^25]: Cohere. "Introducing Command R7B Arabic." Cohere Blog, February 27, 2025. https://cohere.com/blog/command-r7b-arabic

[^26]: Cohere Labs. "c4ai-command-r7b-arabic-02-2025 (Model Card)." Hugging Face. https://huggingface.co/CohereLabs/c4ai-command-r7b-arabic-02-2025

[^27]: Cohere et al. "Command A: An Enterprise-Ready Large Language Model." arXiv:2504.00698, April 2025. https://arxiv.org/abs/2504.00698

[^28]: Franzen, Carl. "Cohere targets global enterprises with new highly multilingual Command A model requiring only 2 GPUs." VentureBeat, March 13, 2025. https://venturebeat.com/ai/cohere-targets-global-enterprises-with-new-highly-multilingual-command-a-model-requiring-only-2-gpus/

[^29]: Cohere Labs. "c4ai-command-a-03-2025 (Model Card)." Hugging Face. https://huggingface.co/CohereLabs/c4ai-command-a-03-2025

[^30]: Cohere Labs. "Introducing Command A Vision: Multimodal AI built for Business." Hugging Face Blog, July 31, 2025. https://huggingface.co/blog/CohereLabs/introducing-command-a-vision-07-2025

[^31]: Cohere. "Command A Reasoning: Enterprise-grade control for AI agents." Cohere Blog, August 21, 2025. https://cohere.com/blog/command-a-reasoning

[^32]: Cohere. "Command A Translate: Secure translation for global enterprises." Cohere Blog, August 28, 2025. https://cohere.com/blog/command-a-translate

[^33]: Cohere. "Announcing Major Command Deprecations." Cohere Changelog, September 15, 2025. https://docs.cohere.com/changelog/2025-09-15-major-command-deprecations

[^34]: Cohere. "Deprecations." Cohere Documentation. https://docs.cohere.com/docs/deprecations