Command A Reasoning

AI Companies Large Language Models Reasoning Models

10 min read

Updated May 31, 2026

Suggest edit History Talk

RawGraph

Last edited

May 31, 2026

Fact-checked

In review queue

Sources

10 citations

Revision

v1 · 2,039 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Command A Reasoning is an enterprise reasoning model released by Cohere on August 21, 2025, as part of the Command A family of large language models. It pairs an explicit chain of thought process with the deployment efficiency and security focus that Cohere builds its products around, and it ships with a controllable thinking budget so a team can decide how much the model deliberates before it answers. The weights carry the model identifier command-a-reasoning-08-2025, and Cohere positions the system for agentic and tool using work, multilingual enterprise tasks, and on premises deployment where data cannot leave a customer's own infrastructure. ^[1]^[2]

The model is a fine tune of the earlier Command A instruct model rather than a replacement for it. The model card lists it as fine tuned from c4ai-command-a-03-2025, so it shares the same 111 billion parameter footprint and 256K token context window, and it adds a reasoning mode that Cohere reports lifts performance on agentic benchmarks above its non reasoning sibling and above several competing enterprise reasoning systems. ^[1]^[3]

What Command A Reasoning is

Command A Reasoning is an auto regressive transformer that generates intermediate reasoning tokens before it produces a final response. This is the pattern shared by the broader class of reasoning models, where the model works through a problem step by step in a scratchpad style trace and then commits to an answer. Cohere's contribution here is less about the raw idea and more about packaging it for buyers who care about cost, latency, and control, the practical constraints that govern whether a reasoning model can actually run inside a bank, a hospital, or a government department. ^[1]^[2]

The company describes the target as enterprise tasks done with full control. In practice that means three things. The model can act as the planner and tool caller inside AI agents that retrieve documents, call APIs, and chain multiple steps. It can run in private deployments rather than only through a hosted API. And it gives administrators a dial over how much compute each query is allowed to spend on thinking, which matters a great deal once a reasoning model is answering thousands of queries a day. Cohere frames customer service, research assistance, and internal knowledge work as primary use cases, the kinds of jobs where a wrong answer carries real cost and a record of the model's reasoning is useful. ^[1]^[7]

Command A Reasoning was built by Cohere together with Cohere Labs, the company's research division. Cohere was founded in 2019 by Aidan Gomez and colleagues, and it has concentrated on selling language models to businesses rather than chasing a consumer chatbot audience. The reasoning model fits that strategy. It is meant to be one of the engines behind Cohere's North platform, an agentic workspace that connects models to enterprise data and tools. ^[1]^[2]

The adjustable thinking budget

The headline feature is a token budget for reasoning. A developer can set how many tokens the model is permitted to spend on its internal thinking before it has to answer. A larger budget lets the model reason longer on hard problems, which tends to raise accuracy on math, planning, and multi step tool use. A smaller budget caps the spend, which lowers cost and latency for routine work where deep deliberation buys little. ^[1]^[4]

This turns the usual tradeoff into something a team can manage per workload instead of accepting a single fixed behavior. A nightly batch job that analyzes contracts can be given a generous budget, while an interactive support assistant can be kept lean so replies stay fast. Cohere frames this as a way to balance quality against compute, and it is the main reason the company describes the model as offering full control. ^[1]^[4]

Reasoning can also be switched off entirely. In Cohere's chat template the behavior is governed by a reasoning flag that defaults to on, and setting it off makes the model skip the thinking stage. When thinking runs, the model emits its trace between <START_THINKING> and <END_THINKING> markers before the final answer. With thinking disabled, Command A Reasoning behaves like a standard instruct model and responds directly, which lets one deployment serve both reasoning heavy and latency sensitive traffic without swapping models. ^[1]^[3]

Parameters, context, and deployment

Command A Reasoning has 111 billion parameters and uses an optimized transformer architecture, with weights released in BF16 precision. Its maximum context length is 256K tokens, long enough to hold large document sets, codebases, or extended agent transcripts in a single prompt, and it can generate up to 32K output tokens. ^[3]

Deployment efficiency is a deliberate design goal. The model can run on a single H100 or A100 GPU at a reduced context length of 128K tokens, and it reaches the full 256K context when spread across multiple GPUs. Cohere recommends 4 H100 GPUs for production serving and notes that 4 A100 GPUs work for evaluation and testing. The single GPU option lowers the hardware bar for private deployment, which is the setting Cohere cares most about, since many of its customers in regulated industries want the model running inside their own environment. ^[1]^[3]

Specification	Detail
Developer	Cohere and Cohere Labs
Model identifier	command-a-reasoning-08-2025
Release date	August 21, 2025
Parameters	111 billion
Architecture	Optimized auto regressive transformer, BF16 weights
Base model	c4ai-command-a-03-2025
Maximum context	256K tokens
Maximum output	32K tokens
Single GPU context	128K tokens on one H100 or A100
Full context deployment	256K tokens across multiple GPUs
Production hardware	4 H100 GPUs recommended, 4 A100 for testing
Modality	Text in, text out
Reasoning control	Adjustable thinking token budget, can be disabled
Languages	23
Weights license	CC-BY-NC 4.0, research use
Availability	Cohere platform, North, Hugging Face

Agentic, tool use, and multilingual strengths

Cohere built Command A Reasoning for agentic workloads, the kind where a model has to decide which tool to call, pass the right arguments, read the result, and then plan the next step. The thinking budget feeds directly into this, because a model that can reason before it acts tends to choose tools more reliably and recover from errors better than one that responds in a single pass. Function calling and multi step tool use are exactly what the BFCL and Tau-bench evaluations measure, and they are also the operations a deployed agent performs all day, so progress on those benchmarks maps onto the work the model is meant to do. ^[1]^[5]

The model supports 23 languages, including English, French, Spanish, Italian, German, Portuguese, Japanese, Korean, Arabic, Chinese, Russian, Polish, Turkish, Vietnamese, Dutch, Czech, Indonesian, Ukrainian, Romanian, Greek, Hindi, Hebrew, and Persian. Multilingual coverage has been a consistent priority across the Command line, and it matters for the global enterprises that are Cohere's main customers. ^[3]

Like other Cohere models, Command A Reasoning includes configurable safety behavior so that organizations can tune how the model handles sensitive content for their own context rather than accepting one fixed policy. Cohere also reports that the model reaches a strong balance between refusing genuinely harmful requests and staying useful, which it frames as the best safety and usefulness tradeoff among the systems it compared. ^[1]^[5]

Benchmark results

Cohere reports that Command A Reasoning leads gpt-oss-120b, DeepSeek-R1 0528, and Mistral Magistral Medium on a set of agentic evaluations, and that it improves on the earlier Command A instruct model. The evaluations Cohere highlights cover function calling on BFCL-v3, multi turn tool use in simulated environments on Tau-bench, and open ended research style tasks on DeepResearch Bench. The published comparisons appear as charts in Cohere's announcement rather than as a numeric table, so the summary below records each benchmark, what it measures, and the verified directional result rather than inventing exact figures that were presented only in graphical form. ^[1]^[5]

Benchmark	What it measures	Reported result
BFCL-v3 (Berkeley Function Calling Leaderboard)	Accuracy of selecting and calling functions or tools	Leads gpt-oss-120b, DeepSeek-R1 0528, and Mistral Magistral Medium
Tau-bench	Multi turn tool use in simulated task environments	Leads the same comparison set
DeepResearch Bench	Open ended, multi step research style tasks	Leads the same comparison set
Safety versus usefulness	Tradeoff between refusing harmful requests and staying helpful	Cohere reports the best tradeoff among the models compared

Readers who need exact numeric scores should consult Cohere's announcement post and the model card directly, since the figures there are the authoritative source and are subject to change as evaluation suites are updated. ^[1]^[3]

Licensing and availability

The model weights are published on Hugging Face under a CC-BY-NC 4.0 license, which permits non commercial and research use and requires attribution. Commercial use runs through Cohere, and the company also asks users to follow the Cohere Labs Acceptable Use Policy. This split, open weights for research with a commercial path through the vendor, matches how Cohere has released earlier Command models. ^[3]^[10]

For production use, Command A Reasoning is served on the Cohere platform and through the North agentic workspace, and it can be deployed privately on a customer's own infrastructure for organizations that cannot send data to a third party. Cohere lists the model in its API catalog under the command-a-reasoning-08-2025 identifier alongside the rest of the Command line. ^[1]^[2]^[8]

Relation to Command A and Command R

Command A Reasoning is the reasoning oriented member of the Command A generation. The base Command A model, released in March 2025, introduced the 111 billion parameter size and the 256K context window, and it was notable for running on as few as two GPUs while matching larger rivals on enterprise tasks. Command A Reasoning keeps that profile and layers an explicit thinking stage on top, so it can be read as Command A taught to deliberate, with the option to turn the deliberation off and behave like the original. ^[3]^[9]

Both models descend from the earlier Command R and Command R Plus models, which established Cohere's focus on retrieval augmented generation and tool use for business users. Command A and its reasoning variant succeed that R series and push further on efficiency, context length, and now controllable reasoning. ^[6]

Limitations

The non commercial weights license means the open release is for research only, so teams that want to run the weights in a commercial product cannot do so from Hugging Face alone and must go through Cohere. The single GPU configuration trades context for hardware savings, dropping to 128K tokens, so workloads that genuinely need the full 256K window require more than one GPU. As with reasoning models in general, a generous thinking budget raises both cost and latency, which is exactly why the budget control exists, and getting good results means tuning that budget per task rather than assuming more thinking is always better. Finally, the headline benchmark comparisons come from Cohere's own evaluation and are best read alongside independent testing on a buyer's specific use case. ^[1]^[3]

References

Cohere. "Introducing Command A Reasoning: Excelling at enterprise tasks with full control." Cohere Blog, August 21, 2025. https://cohere.com/blog/command-a-reasoning ↩
Cohere. "Command A Reasoning." Cohere Documentation. https://docs.cohere.com/docs/command-a-reasoning ↩
Cohere Labs. "command-a-reasoning-08-2025 model card." Hugging Face. https://huggingface.co/CohereLabs/command-a-reasoning-08-2025 ↩
Cohere. "Reasoning." Cohere Documentation. https://docs.cohere.com/docs/reasoning ↩
The Decoder. "Cohere unveils Command A Reasoning, a model for enterprise research and workflows." August 2025. https://the-decoder.com/cohere-unveils-command-a-reasoning-a-model-for-enterprise-research-and-workflows/ ↩
Cohere. "Command A: Cohere's Most Performant Model to Date." Cohere Blog, March 2025. https://cohere.com/blog/command-a ↩
VentureBeat. "Don't sleep on Cohere: Command A Reasoning, its first reasoning model, is built for enterprise customer service and more." August 22, 2025. https://venturebeat.com/ai/dont-sleep-on-cohere-command-a-reasoning-its-first-reasoning-model-is-built-for-enterprise-customer-service-and-more ↩
Cohere. "An Overview of Cohere's Models." Cohere Documentation. https://docs.cohere.com/docs/models ↩
MarkTechPost. "Cohere Released Command A: A 111B Parameter AI Model with 256K Context Length, 23-Language Support, and 50% Cost Reduction for Enterprises." March 16, 2025. https://www.marktechpost.com/2025/03/16/cohere-released-command-a-a-111b-parameter-ai-model-with-256k-context-length-23-language-support-and-50-cost-reduction-for-enterprises/ ↩
Cohere Labs. "Cohere Labs Acceptable Use Policy." Cohere Documentation. https://docs.cohere.com/docs/cohere-labs-acceptable-use-policy ↩

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

Suggest edit

What links here

Cohere Command A