Command R is a family of large language models developed by Cohere, a Canadian artificial intelligence company focused on enterprise applications. The Command R series was designed from the ground up for retrieval-augmented generation (RAG), multi-step tool use, and grounded generation with inline citations. The lineup includes Command R (35 billion parameters), Command R+ (104 billion parameters), Command R7B (7 billion parameters), and the successor model Command A (111 billion parameters). All models in the series have been released as open-weight research releases under the CC-BY-NC 4.0 license, while commercial access is available through Cohere's API and cloud deployment partners.
Cohere was founded in 2019 by Aidan Gomez, Ivan Zhang, and Nick Frosst. All three co-founders attended the University of Toronto. Gomez was notably one of the eight co-authors of the seminal 2017 paper "Attention Is All You Need," which introduced the Transformer architecture while he was a 20-year-old intern at Google Brain. Frosst also worked as a researcher at Google Brain before co-founding Cohere. Ivan Zhang had previously collaborated with Gomez on research at FOR.ai, a research collective. The company is headquartered in Toronto and San Francisco, with additional offices in Montreal, London, New York City, Paris, and Seoul.
Unlike many AI labs that target consumer-facing chatbots, Cohere has pursued an enterprise-first strategy from its earliest days. The company builds models and platform tools that help businesses integrate AI into search engines, chatbots, document processing, and workflow automation. Cohere's models are available through its own API, as well as through major cloud platforms including Amazon Web Services (via Bedrock and SageMaker), Microsoft Azure (via Azure AI Foundry), Google Cloud (via Vertex AI), and Oracle Cloud Infrastructure (via OCI Generative AI). In November 2021, Google Cloud announced it would help power Cohere's platform using its infrastructure, including Cloud TPUs for model development and deployment. This cloud-agnostic deployment philosophy, combined with support for virtual private cloud (VPC) and on-premises deployments, distinguishes Cohere in the enterprise AI market.
As of September 2025, Cohere had raised approximately $1.6 billion in total funding. A $500 million Series D round in June 2024 valued the company at $5.5 billion. A subsequent $500 million raise in August 2025 pushed the valuation to $6.8 billion, and a $100 million extension the following month brought it to $7 billion. Investors include Radical Ventures, Inovia Capital, AMD Ventures, NVIDIA, PSP Investments, and Salesforce Ventures.
In January 2025, Cohere launched North, a turnkey AI workspace platform for enterprise productivity. North allows workers to build automations, query their company data, and collaborate with AI from a secure environment. It connects to existing workplace tools like Gmail, Slack, Salesforce, and Outlook, and can be deployed on private infrastructure so that Cohere never accesses customer data. Early access customers include RBC, Dell, LG, Ensemble Health Partners, and Palantir.
Before the Command R series, Cohere offered earlier generation Command models (simply called "Command" and "Command Light") that served as general-purpose text generation models. These earlier models lacked the specialized RAG, tool use, and citation capabilities that define the R series. The Command R family represented a deliberate pivot toward optimizing models for specific enterprise workflows rather than competing on general chat benchmarks alone.
Command R was announced on March 11, 2024, as Cohere's first model purpose-built for enterprise RAG workflows and tool use at scale. With 35 billion parameters and a 128,000-token context window, Command R represented a significant shift in Cohere's model strategy: rather than chasing general-purpose chat benchmarks, the model was optimized for high-precision retrieval, grounded generation, and low-latency production deployments.
Command R uses an optimized autoregressive transformer architecture. After pretraining on a large multilingual corpus, the model was refined through supervised fine-tuning (SFT) and preference training to align its behavior with human expectations for helpfulness and safety. The model accepts text input and generates text output. It uses a proprietary chat format with special tokens for delineating turns, system prompts, and tool interactions.
Command R was optimized for strong performance across 10 key languages: English, French, Spanish, Italian, German, Brazilian Portuguese, Japanese, Korean, Simplified Chinese, and Arabic. An additional 13 languages were included in the pretraining data with lower optimization priority: Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian.
Cohere's tokenizer plays an important role in multilingual performance. Unlike many tokenizers that are heavily English-centric, Cohere designed its tokenizer for cross-lingual efficiency, and it produces significantly fewer tokens than OpenAI's tokenizer for equivalent text in non-English languages. For Japanese, OpenAI's tokenizer produces roughly 1.67 times as many tokens as Cohere's for the same text. This efficiency directly reduces costs (since API pricing is per token) and allows more content to fit within the context window for non-English users.
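As a rough illustration of how the token savings become cost savings: the 1.67x ratio is the figure cited above, while the per-million-token price here is purely illustrative.

```python
# Back-of-the-envelope sketch: the same Japanese text, tokenized by an
# English-centric tokenizer vs. a cross-lingual one that emits 1.67x fewer tokens.

def api_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of processing `tokens` at a per-million-token rate."""
    return tokens / 1_000_000 * price_per_million

english_centric = api_cost(1_670_000, 2.50)  # English-centric tokenizer
cross_lingual = api_cost(1_000_000, 2.50)    # same text, 1.67x fewer tokens

savings = 1 - cross_lingual / english_centric  # roughly 40% cheaper per request
```

The same ratio also means roughly 1.67x more non-English content fits in a fixed 128K-token context window.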
Command R introduced several capabilities that became hallmarks of the entire model family, including grounded generation with inline citations, multi-step tool use, and a `directly_answer` tool that allows the model to abstain from calling external tools when the query can be answered from its own knowledge.

To support deployment on consumer-grade hardware, Cohere released quantized versions of Command R on HuggingFace. An 8-bit quantized version using bitsandbytes is available in the main repository, and a separate 4-bit quantized version is available at CohereLabs/c4ai-command-r-v01-4bit. These quantized versions make it practical to run the 35-billion-parameter model on hardware with limited GPU memory, at the cost of some precision.
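The motivation for quantization can be sketched with simple memory arithmetic. The figures below count weight storage only and ignore activation and KV-cache overhead, so real deployments need somewhat more headroom.

```python
# Rough weight-memory math behind the quantized releases of a 35B-parameter model.

def weight_memory_gb(n_params: float, bits_per_param: int) -> float:
    """Approximate GPU memory needed just to hold the model weights."""
    return n_params * bits_per_param / 8 / 1e9

fp16 = weight_memory_gb(35e9, 16)  # 70.0 GB: beyond any single consumer GPU
int8 = weight_memory_gb(35e9, 8)   # 35.0 GB: fits a 40-48 GB datacenter card
int4 = weight_memory_gb(35e9, 4)   # 17.5 GB: fits a 24 GB consumer GPU
```

This is why the 4-bit release brings the model within reach of a single high-end consumer card, while half precision would require multiple GPUs.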
Command R+ was released on April 4, 2024, as Cohere's flagship large language model. At 104 billion parameters with a 128,000-token context window, it offered stronger reasoning, improved multilingual performance, and better results on tool use benchmarks compared to the smaller Command R.
On the HuggingFace Open LLM Leaderboard, Command R+ reported the following scores:
| Benchmark | Command R+ | DBRX Instruct | Mixtral 8x7B |
|---|---|---|---|
| ARC-Challenge | 70.99 | 68.9 | 70.1 |
| HellaSwag | 88.6 | 89.0 | 87.6 |
| MMLU | 75.7 | 73.7 | 71.4 |
| TruthfulQA | 56.3 | 66.9 | 65.0 |
| Winogrande | 85.4 | 81.8 | 81.1 |
| GSM8K | 70.7 | 66.9 | 61.1 |
| Average | 74.6 | 74.5 | 72.7 |
Beyond standard academic benchmarks, Cohere highlighted Command R+'s performance on enterprise-relevant tasks. According to Cohere's internal evaluations, Command R+ outperformed GPT-4 Turbo on the ToolTalk (Hard) benchmark for conversational tool use and on the Berkeley Function Calling Leaderboard (BFCL) for single-turn function calling. In RAG citation fidelity, Command R+ surpassed GPT-4 Turbo in human evaluation. On multi-hop question answering benchmarks like HotpotQA, Bamboogle, and StrategyQA, it outperformed Claude 3 Sonnet and Mistral Large.
On the Chatbot Arena leaderboard, Command R+ ranked among the top open-weight models as of April 2024, outperforming some versions of GPT-4 in that period. For translation tasks evaluated on FLoRES and WMT23, Command R+ was competitive with GPT-4 Turbo across French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, and Chinese. In the Chinese Chatbot Arena, Command R+ ranked third behind GPT-4 and Claude 3 Opus (both of which cost two to three times more).
Command R+ was named one of TIME Magazine's Best Inventions of 2024.
Like Command R, Command R+ uses an optimized autoregressive transformer architecture trained with supervised fine-tuning and preference training. It supports the same 10 primary languages and 13 additional pretraining languages as Command R. A 4-bit quantized version was also released at CohereLabs/c4ai-command-r-plus-4bit for deployment on more constrained hardware.
In August 2024, Cohere released updated versions of both models: command-r-08-2024 and command-r-plus-08-2024. The refreshed Command R+ delivered roughly 50% higher throughput and 25% lower latencies compared to the April version, while maintaining the same hardware footprint.
The pricing for Command R+ was also reduced with this update: input tokens dropped from $3.00 to $2.50 per million, and output tokens dropped from $15.00 to $10.00 per million.
Command R7B was released on December 14, 2024, as the smallest and fastest model in the Command R family. At 7 billion parameters (roughly 8 billion in the released BF16 checkpoint) with a 128,000-token context window and a maximum output of 4,000 tokens, it was designed for high-throughput, latency-sensitive applications like chatbots and code assistants, and for on-device inference scenarios where larger models are impractical. The model's knowledge cutoff date is June 1, 2024.
Command R7B introduced an architectural refinement shared with the later Command A model. The transformer includes three layers of sliding window attention (with a window size of 4,096 tokens) and one layer of global attention without positional embeddings. The model uses Rotary Position Embedding (RoPE) for positional encoding. This hybrid attention design balances the efficiency of local attention with the ability to attend to distant context when needed.
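The hybrid design can be illustrated with a toy attention mask, using a window of 4 instead of 4,096. This is a minimal sketch of the two attention patterns, not Cohere's implementation.

```python
# True entries mark the key positions a given query position may attend to.

def causal_mask(seq_len, window=None):
    """Causal attention mask; if `window` is set, each position is further
    restricted to the `window` most recent positions (sliding window attention)."""
    return [
        [j <= i and (window is None or i - j < window) for j in range(seq_len)]
        for i in range(seq_len)
    ]

local = causal_mask(8, window=4)  # sliding-window layer (window of 4 for the toy example)
global_ = causal_mask(8)          # global attention layer

# In the local layer, position 7 attends only to positions 4-7;
# in the global layer it attends to all of positions 0-7.
```

In the full model, most layers use the cheap local pattern, and the occasional global layer lets information propagate across the entire 128K context.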
Command R7B expanded language coverage to 23 languages, adding Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian as fully supported languages alongside the original 10.
On the HuggingFace Open LLM Leaderboard (v2), Command R7B ranked first among similarly sized open-weight models:
| Benchmark | Command R7B | Gemma 2 IT 9B | Llama 3.1 8B | Qwen 2.5 7B |
|---|---|---|---|---|
| IFEval | 77.9 | 74.4 | 78.6 | 75.85 |
| BBH | 36.1 | 42.1 | 29.9 | 34.89 |
| MATH (Hard) | 26.4 | 0.2 | 19.3 | 0.0 |
| GPQA | 7.7 | 14.8 | 2.4 | 5.48 |
| MUSR | 11.6 | 9.74 | 8.41 | 8.45 |
| MMLU-Pro | 28.5 | 32.0 | 30.7 | 36.52 |
| Average | 31.4 | 28.9 | 28.2 | 26.87 |
Command R7B scored notably well on MATH (Hard), achieving 26.4 compared to near-zero scores from Gemma 2 IT 9B and Qwen 2.5 7B. It also led on MUSR with 11.6. Cohere described Command R7B as the final model in the R series.
Command A was announced on March 11, 2025, as the successor to the Command R series and Cohere's most capable model at the time of release. With 111 billion parameters and a 256,000-token context window (double the context length of Command R+), it represented a generational leap in both performance and efficiency. The accompanying technical report was published on arXiv (2504.00698) in April 2025, authored by 228 contributors from the Cohere team.
Command A uses a hybrid attention architecture that combines sliding window attention (window size of 4,096 tokens) with global attention layers, building on the design introduced in Command R7B but scaled up significantly. The model employs a decentralized training approach that incorporates self-refinement algorithms and model merging techniques.
A defining feature of Command A is its hardware efficiency: despite having 111 billion parameters, the model requires only two GPUs (A100 or H100) to run. Cohere reported 150% higher throughput compared to Command R+ 08-2024. Token streaming speed for 100K-context requests reached 73 tokens per second, compared to 38 tokens per second for GPT-4o and 32 tokens per second for DeepSeek-V3, according to Cohere's benchmarks. This puts Command A's throughput at roughly 1.9 times that of GPT-4o and 2.3 times that of DeepSeek-V3.
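The ratios follow directly from the reported streaming speeds:

```python
# Tokens per second on 100K-context requests, as reported by Cohere.
command_a_tps, gpt_4o_tps, deepseek_v3_tps = 73, 38, 32

vs_gpt_4o = command_a_tps / gpt_4o_tps        # ~1.9x
vs_deepseek = command_a_tps / deepseek_v3_tps  # ~2.3x
```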
Command A supports 23 languages, matching Command R7B's expanded language set, with improved handling of Arabic dialects.
Cohere's technical report included extensive benchmarking across enterprise-relevant tasks and public benchmarks. According to the report, Command A performs on par with GPT-4o on MMLU, MBPPPlus, and SQL benchmarks. It leads on BFCL-v3 and Taubench for tool-using agents, and on RepoQA for long-context code understanding. In human evaluations for business, coding, and agentic tasks, Command A outperformed or matched GPT-4o and DeepSeek-V3. The 256K context window (twice the size of GPT-4o's 128K window) provides additional headroom for processing lengthy documents, extensive conversation histories, and large retrieval sets.
Command A's weights were released under the CC-BY-NC 4.0 license on HuggingFace as CohereLabs/c4ai-command-a-03-2025, consistent with Cohere's open-weight research release strategy.
Following Command A, Cohere continued expanding the Command model family with specialized variants:
| Model | Release | Context | Description |
|---|---|---|---|
| Command A Vision | July 2025 | 128K | Cohere's first multimodal model, capable of processing images alongside text |
| Command A Reasoning | August 2025 | 256K | Cohere's first reasoning model, designed to "think" before generating; built for customer service and complex tasks |
| Command A Translate | August 2025 | 8K | Specialized translation model covering 23 languages |
These specialized variants reflect a broader industry trend toward model families where a base architecture is adapted for specific modalities or reasoning styles.
One of the most distinctive features of the Command R family is its built-in support for grounded generation with citations. Unlike many competing models that generate text without indicating where specific claims originate, Command R models are trained to produce fine-grained citations alongside their output. This capability is not a post-processing step or a plugin; it is trained directly into the model weights.
The grounded generation pipeline operates in conjunction with RAG: when a user submits a query, relevant documents are retrieved and passed to the model alongside the query, and the model generates its response with citations that point back to specific passages in those documents.
Cohere offers two citation modes:
| Mode | Description | Use Case |
|---|---|---|
| Accurate | The model generates the complete response first, then produces citations that map to specific segments of the text | Applications where citation precision matters most |
| Fast | Citations are generated inline as the response is produced, injected at the exact moment the model references a source | Streaming applications where low latency is important |
The citation system enables users and automated systems to verify claims, trace information back to source documents, and identify when the model may be generating content not grounded in the provided sources. This is especially valuable in regulated industries like finance, healthcare, and legal services where factual accuracy is non-negotiable. For enterprises, inline citations reduce the risk of hallucination by making it straightforward to audit model outputs.
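As an illustration, citations of this kind can be rendered as inline markers. The citation shape below (character offsets into the response plus source-document ids) is an assumption that approximates what a grounded-generation API returns; the `annotate_with_citations` helper is hypothetical.

```python
def annotate_with_citations(text, citations):
    """Insert bracketed source markers after each cited span of `text`.
    Each citation carries character offsets plus the ids of the source
    documents that ground that span (field names assumed for illustration)."""
    pieces, cursor = [], 0
    for cite in sorted(citations, key=lambda c: c["start"]):
        pieces.append(text[cursor:cite["end"]])
        pieces.append("[" + ",".join(cite["document_ids"]) + "]")
        cursor = cite["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)

reply = "The capital of Brazil is Brasilia, and it is currently 28 C there."
cites = [
    {"start": 0, "end": 33, "document_ids": ["doc_0"]},
    {"start": 39, "end": 65, "document_ids": ["doc_1"]},
]
annotated = annotate_with_citations(reply, cites)
# "The capital of Brazil is Brasilia[doc_0], and it is currently 28 C there[doc_1]."
```

Because each claim is tied to a character span and a document id, an auditing system can mechanically check that every cited span actually appears in, or follows from, the referenced source.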
Command R models support multi-step tool use, which allows them to function as autonomous agents that can plan and execute sequences of actions using multiple external tools. These capabilities were trained into the models through a mixture of supervised fine-tuning and preference fine-tuning using a specific prompt template.
In single-step tool use, the model receives a user query along with a list of available tools (defined by their names, descriptions, and parameter schemas). The model selects the appropriate tool, generates the required parameters in JSON format, and returns the result. This covers straightforward function-calling scenarios such as looking up information from a database or calling a calculator.
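A single-step interaction can be sketched as follows. The `get_weather` tool, its schema fields, and the `missing_parameters` helper are all hypothetical, chosen only to mirror the name/description/parameter-schema structure described above.

```python
# Hypothetical tool definition in a name/description/parameter-schema layout.
weather_tool = {
    "name": "get_weather",
    "description": "Return the current temperature for a city.",
    "parameter_definitions": {
        "city": {"description": "Name of the city", "type": "str", "required": True},
    },
}

def missing_parameters(tool, call):
    """Return any required parameters absent from a model-emitted tool call."""
    return [
        name
        for name, spec in tool["parameter_definitions"].items()
        if spec.get("required") and name not in call["parameters"]
    ]

# The model's side of the exchange: a JSON-style tool name plus parameters.
model_call = {"name": "get_weather", "parameters": {"city": "Brasilia"}}
```

The calling application executes the named function with the supplied parameters and returns the result to the model, which then composes its answer.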
Multi-step tool use extends this by allowing the model to perform several inference cycles in a loop: the model plans a tool call, executes it, observes the result, and then decides whether further calls are needed or whether it can respond.
This cycle repeats until the model determines it has enough information to answer the user's question. The model can call multiple tools in parallel when the calls are independent, and it can self-correct when a tool call fails, making multiple attempts to accomplish the task. The ability to recover from errors and retry with different parameters increases the overall success rate of agentic workflows.
For example, if a user asks "What is the current temperature in the capital of Brazil?", the model first calls a geographic lookup tool to determine that the capital of Brazil is Brasilia, then calls a weather API to retrieve the temperature in Brasilia, and finally combines both results into a coherent answer.
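This example can be mimicked with a toy script in which stub functions stand in for the real tools and a fixed two-step plan stands in for model inference.

```python
# Toy version of the Brasilia example: stub tools instead of real APIs,
# and a hard-coded two-step plan instead of model-driven planning.

def lookup_capital(country: str) -> str:
    return {"Brazil": "Brasilia"}[country]

def get_temperature(city: str) -> float:
    return {"Brasilia": 28.0}[city]

def answer_capital_weather(country: str) -> str:
    """Step 1: resolve the capital; step 2: fetch its temperature;
    finally combine both tool results into a single grounded answer."""
    capital = lookup_capital(country)        # first inference cycle
    temperature = get_temperature(capital)   # second cycle, fed by the first
    return f"The current temperature in {capital} is {temperature} degrees."

result = answer_capital_weather("Brazil")
# "The current temperature in Brasilia is 28.0 degrees."
```

The essential property is that the second call depends on the output of the first, which is exactly what distinguishes multi-step from single-step tool use.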
The following table summarizes the key specifications and pricing across the Command R family and selected competing models:
| Model | Release Date | Parameters | Context Window | Input Price (per 1M tokens) | Output Price (per 1M tokens) | Open Weights |
|---|---|---|---|---|---|---|
| Command R | March 2024 | 35B | 128K | $0.15 | $0.60 | Yes (CC-BY-NC) |
| Command R+ (08-2024) | August 2024 | 104B | 128K | $2.50 | $10.00 | Yes (CC-BY-NC) |
| Command R7B | December 2024 | 7B | 128K | $0.0375 | $0.15 | Yes (CC-BY-NC) |
| Command A | March 2025 | 111B | 256K | $2.50 | $10.00 | Yes (CC-BY-NC) |
| GPT-4o | May 2024 | Undisclosed | 128K | $2.50 | $10.00 | No |
| Claude 3.5 Sonnet | June 2024 | Undisclosed | 200K | $3.00 | $15.00 | No |
| Gemini 1.5 Pro | February 2024 | Undisclosed | 1M | $1.25 | $5.00 | No |
| Llama 3.1 405B | July 2024 | 405B | 128K | Varies by provider | Varies by provider | Yes (Llama License) |
Pricing reflects API rates at the time of each model's latest version and may vary by provider or volume tier.
A notable pattern in this comparison is that Command A offers the same pricing as GPT-4o ($2.50 input / $10.00 output per million tokens) while providing a 256K context window, open weights, and the ability to self-host. This positions it as a competitive alternative for enterprises that want GPT-4o-level performance with more flexible deployment options.
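At these rates, per-request costs are easy to compare. The `request_cost` helper and the 100K-input/2K-output token mix below are illustrative.

```python
# Comparing per-request cost at the table's rates (prices per million tokens).

def request_cost(input_tokens: int, output_tokens: int,
                 input_price: float, output_price: float) -> float:
    """Dollar cost of one API request."""
    return input_tokens / 1e6 * input_price + output_tokens / 1e6 * output_price

# A long-context request: 100K input tokens, 2K output tokens.
command_a = request_cost(100_000, 2_000, 2.50, 10.00)  # $0.27
gpt_4o = request_cost(100_000, 2_000, 2.50, 10.00)     # $0.27 (identical rates)
sonnet = request_cost(100_000, 2_000, 3.00, 15.00)     # $0.33
```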
All Command R family models are released as open-weight research releases under the CC-BY-NC 4.0 (Creative Commons Attribution-NonCommercial) license, along with Cohere's Acceptable Use Policy. In practice, anyone can download and use the weights for research and other non-commercial purposes with attribution, but commercial use requires a separate license from Cohere.
This licensing strategy allows Cohere to benefit from community feedback and academic research while maintaining control over commercial revenue. It contrasts with fully permissive open-source releases (like Meta's Llama models under the Llama License) and fully proprietary models (like OpenAI's GPT-4 and Anthropic's Claude). The distinction between "open weights" and "open source" is important: while the model weights are publicly available for download, the training data, training code, and full reproduction recipe are not released.
Cohere offers several deployment tiers for the Command model family:
| Deployment Option | Description |
|---|---|
| Cohere API (SaaS) | Managed API with per-token pricing; simplest integration path |
| Cloud AI Platforms | Access through AWS Bedrock/SageMaker, Azure AI Foundry, Google Vertex AI, Oracle OCI |
| Virtual Private Cloud (VPC) | Models deployed within the customer's own cloud environment for data isolation |
| On-Premises | Full deployment on customer-owned hardware for maximum control |
This flexibility is a core part of Cohere's enterprise pitch. Organizations in highly regulated industries (banking, healthcare, government) often require that data never leaves their controlled environment, and Cohere's deployment model accommodates this requirement. In July 2025, Cohere announced a partnership with Bell Canada to provide AI services to government and enterprise customers, with Bell Canada deploying Cohere's technology on its own data center infrastructure.
The Command R models work in conjunction with other Cohere products, such as the Embed and Rerank models used for retrieval, to form a complete enterprise AI stack.
The Command R family is notable in the broader landscape of large language models for its citation-grounded generation, its open-weight research releases paired with commercial API access, its multilingual tokenizer efficiency, and its consistent focus on enterprise RAG and tool-use workloads rather than general-purpose chat benchmarks.