Kimi K2.6

Chinese AI Large Language Models Open Source AI

10 min read

Updated Jun 28, 2026

Suggest edit History Talk

RawGraph

Last edited

Jun 28, 2026

Fact-checked

In review queue

Sources

12 citations

Revision

v2 · 1,982 words

Fact-checks are independent of edits: a reviewer re-verifies the article against its sources and stamps the date. How we verify

Kimi K2.6 is an open-weight, trillion-parameter mixture of experts (MoE) large language model released by Moonshot AI on 20 April 2026 for agentic coding and long-horizon autonomous execution. At launch it was the strongest publicly available open-weights model, ranking #1 among open models and #4 overall on the Artificial Analysis Intelligence Index with a score of 54, trailing only Anthropic, Google, and OpenAI (all 57).^[1]^[2]^[3] Its headline feature is Agent Swarm, which lets a single request fan out across as many as 300 parallel sub-agents executing up to 4,000 coordinated steps.^[1]^[5]^[12]

Kimi K2.6 succeeds Kimi K2, the July 2025 base release, the reasoning-focused Kimi K2 Thinking, and the Kimi K2.5 update. Unlike the earlier text-only K2 models, K2.6 is natively multimodal, accepting image and video input alongside text, and it ships with a configurable reasoning mode inherited from the Thinking line.^[1]^[4]

What is Kimi K2.6?

Kimi K2.6 combines a very large MoE backbone (roughly 1 trillion total parameters with about 32 billion active per token) with native vision support and an agent-orchestration layer aimed at production software work. Moonshot describes the model as built for "state-of-the-art coding, long-horizon execution, and agent swarm capabilities," emphasizing end-to-end coding, code-driven interface design, and the ability to run autonomously for extended sessions.^[1]^[12] The headline feature is its Agent Swarm capability, which lets a single request fan out across as many as 300 parallel sub-agents executing up to 4,000 coordinated steps.^[1]^[5]^[12]

The model is distributed under a permissive license with weights published on Hugging Face, and it is served through Moonshot's own API as well as third-party platforms including Cloudflare Workers AI and Microsoft Foundry. As with prior releases in the series, Moonshot priced inference well below comparable proprietary frontier models, a positioning that featured prominently in launch coverage.^[2]^[3]^[6]^[7]

Who makes Kimi K2.6 and how does it fit the Kimi line?

Moonshot AI is a Beijing-based artificial intelligence company that develops the Kimi family of models and the Kimi assistant. Its K2 series established Moonshot as a leading developer of open-weight frontier models. The original Kimi K2 shipped in mid-2025 as a 1-trillion-parameter MoE model with 32 billion active parameters, released in base and instruction-tuned variants. Kimi K2 Thinking extended that backbone with an explicit chain-of-thought reasoning mode and interleaved tool calling, and it introduced native INT4 quantization for the series. The intermediate Kimi K2.5 release advanced agentic and coding performance and brought an earlier version of the agent-swarm system.^[1]^[4]

Kimi K2.6 keeps the same overall MoE configuration of roughly 1T total and 32B active parameters as its predecessors, while adding native multimodality, a substantially larger context window than the original K2, and a scaled-up swarm system.^[1]^[2]^[4]

When was Kimi K2.6 released?

Moonshot released Kimi K2.6 on 20 April 2026, publishing the open weights on Hugging Face and making the model available through its API at the same time. The release date is corroborated by Cloudflare, which listed the model on Workers AI on the same day, by Moonshot's own announcement coverage, and by independent launch reporting. Artificial Analysis dated its own analysis to 21 April 2026.^[2]^[3]^[8]^[12]

The launch was accompanied by integrations across multiple inference providers and developer tools, and the model quickly accumulated millions of downloads on Hugging Face in its first month.^[1]^[6]

Architecture and sizes

Kimi K2.6 is a sparse mixture-of-experts transformer. Moonshot's published model card discloses the following configuration.^[1]

Component	Value
Total parameters	~1 trillion
Active parameters per token	~32 billion
Number of layers	61 (including 1 dense layer)
Number of experts	384
Selected experts per token	8
Shared experts	1
Attention heads	64
Attention mechanism	Multi-head latent attention (MLA)
Attention hidden dimension	7,168
MoE hidden dimension (per expert)	2,048
Activation function	SwiGLU
Vision encoder	MoonViT (~400M parameters)
Vocabulary size	~160,000
Context length	256K (262,144 tokens)
Native quantization	INT4

The model accepts text, image, and video inputs in a single unified architecture rather than bolting a vision module onto a text core, with the MoonViT encoder handling visual tokens. Native INT4 quantization, carried over from Kimi K2 Thinking, is intended to reduce memory footprint and improve inference throughput without a separate post-training compression step. Moonshot recommends serving the model with inference engines such as vLLM, SGLang, and KTransformers.^[1]

Like K2 Thinking, K2.6 supports an interleaved "thinking" mode and can preserve its reasoning content across multi-turn interactions, alongside an "instant" mode for lower-latency responses. Moonshot's recommended sampling settings differ by mode, with a higher temperature suggested for thinking mode and a lower one for instant mode.^[1]

What can Kimi K2.6 do?

Moonshot positions Kimi K2.6 primarily as an agentic and coding model. Its documented strengths include:

Long-horizon coding. End-to-end software tasks across multiple programming languages, including Rust, Go, and Python, spanning front-end, DevOps, and performance work.^[1]^[5]^[12]
Coding-driven design. Turning text prompts and visual references into production-ready interfaces and lightweight full-stack workflows, generating structured layouts, interactive elements, and animations. Moonshot says the model can "turn simple prompts into complete front-end interfaces."^[1]^[5]^[12]
Agent Swarm. Horizontal scaling to as many as 300 specialized sub-agents executing up to 4,000 coordinated steps in a single run, decomposing a task into parallel, domain-specialized subtasks that produce documents, websites, or spreadsheets autonomously. This roughly triples the sub-agent count and more than doubles the step budget relative to the swarm system in Kimi K2.5, which Moonshot describes as "scaling out, not just up."^[1]^[5]^[12]
Reasoning and tool use. A configurable thinking mode with interleaved, multi-step tool calling and structured outputs, plus native vision for document and image understanding.^[1]^[4]

Launch coverage also reported extended autonomous operation, with some sources citing autonomous runs lasting well beyond ten hours, though such figures come largely from Moonshot's own materials and early integrations rather than independent measurement.^[5]

How does Kimi K2.6 perform?

The scores below are drawn from Moonshot's published model card unless otherwise noted. As is typical for a launch, several figures are self-reported, and some agentic benchmarks are sensitive to the evaluation harness used; independent third-party replication was still in progress at release.^[1]^[9]

Benchmark	Kimi K2.6	Notes / source
Artificial Analysis Intelligence Index (v4.0)	54	#1 open-weights; #4 overall, behind models scoring 57 ^[2]
AIME 2026	96.4	Math reasoning ^[1]
GPQA Diamond	90.5	Graduate-level science ^[1]
LiveCodeBench (v6)	89.6	Competitive coding ^[1]
SWE-bench Verified	80.2	Software engineering ^[1]
SWE-Bench Pro	58.6	Real-world agentic SWE ^[1]
Terminal-Bench 2.0	66.7	Terminal/agent coding ^[1]
Humanity's Last Exam (with tools)	54.0	Leading among compared models ^[2]^[9]
GDPval-AA (Elo)	1520	Up from K2.5's 1309 ^[2]
τ²-Bench Telecom	96	Tool use ^[2]
BrowseComp	86.3 (Agent Swarm)	Web browsing/research ^[1]
DeepSearchQA (F1)	92.5	Deep research ^[1]
MMMU-Pro	79.4	Multimodal understanding ^[1]
MathVision (with Python)	93.2	Visual math ^[1]

Artificial Analysis reported that K2.6's agentic performance rose sharply over the prior release, with its GDPval-AA Elo climbing to 1520 from Kimi K2.5's 1309, while it "maintained a 96% score on τ²-Bench Telecom," placing it among frontier models on tool use. To complete the full Intelligence Index, K2.6 consumed roughly 160 million reasoning tokens.^[2]

On SWE-Bench Pro, a benchmark designed to track real agentic production work, multiple sources reported K2.6 ahead of or level with leading proprietary models at the time of launch.^[9]^[10]^[12]

Model	SWE-Bench Pro
Kimi K2.6	58.6
GPT-5.4	57.7
Claude Opus 4.6	53.4

On pure reasoning benchmarks the picture was closer, with some proprietary models still ahead. For example, on AIME 2026 GPT-5.4 was reported at 99.2 versus K2.6's 96.4, and on GPQA Diamond at 92.8 versus 90.5.^[9] Artificial Analysis's own composite placed K2.6 at index 54, the top open-weights score but a few points behind the strongest proprietary creators at 57.^[2]

Is Kimi K2.6 open source, and how much does it cost?

Kimi K2.6 is released under a Modified MIT License, the same permissive license Moonshot has used across the K2 series. The weights are openly downloadable, and the model is offered both through Moonshot's first-party API and via several third-party inference platforms.^[1]^[7]

Aspect	Detail
License	Modified MIT License ^[1]
Weights	`moonshotai/Kimi-K2.6` on Hugging Face ^[1]
Official API	platform.moonshot.ai / platform.kimi.ai (OpenAI- and Anthropic-compatible) ^[1]^[11]
Chat interface	kimi.com ^[1]
Third-party hosting	Cloudflare Workers AI, Microsoft Foundry, and others ^[3]^[7]
API price, input (cache miss)	$0.95 per 1M tokens ^[11]
API price, input (cache hit)	$0.16 per 1M tokens ^[11]
API price, output	$4.00 per 1M tokens ^[11]

The official input and output prices place K2.6 well below comparable proprietary frontier models, a point emphasized across launch coverage; reported prices on resellers and aggregators vary, and Artificial Analysis listed a blended input figure of about $0.95 per 1M tokens with output at $4.00.^[2]^[6]^[11]

Reception

Early reception focused on Kimi K2.6's standing as the leading open-weights model and on its aggressive price-to-performance ratio for coding and agentic work. Artificial Analysis stated plainly that "Moonshot's Kimi K2.6 is the new leading open weights model," noting it landed at #4 on its Intelligence Index (54) "behind only Anthropic, Google, and OpenAI (all 57)."^[2]

Commentators highlighted the Agent Swarm system and the model's coding results, particularly the SWE-Bench Pro figure, as evidence that open-weights models had reached parity with or surpassed proprietary frontier models on some agentic software tasks. Several writers framed the release as a notable moment for open models in the coding-agent space.^[5]^[6]^[10]^[12]

What are the limitations of Kimi K2.6?

Independent observers cautioned that many of the launch benchmark numbers were self-reported and that full third-party validation would take time. Agentic benchmarks such as Terminal-Bench in particular are harness-dependent, and reported scores for competing models varied substantially depending on the evaluation setup, complicating direct comparisons.^[9]

Artificial Analysis also noted that, despite a markedly lower hallucination rate of 39% (down from Kimi K2.5's 65%) on its knowledge benchmark, K2.6 still produced incorrect answers on a substantial fraction of factual questions, and its overall intelligence score remained a few points below the leading proprietary models.^[2] As with any very large MoE model, practical deployment of the full-precision or even INT4 weights demands significant hardware, which constrains fully local use to well-resourced setups.^[2]

References

Improve this article

Add missing citations, update stale details, or suggest a clearer explanation. Every suggestion is reviewed for sourcing before it goes live.

1 revision by 1 contributors · full history

Suggest edit

What links here

AI Model Release Timeline (2022-2026)AIME 2025 Best AI Models for Reasoning and Math Best Open-Source LLMs BrowseComp Deep Agents DeepSeek vs Llama vs Qwen GLM-5.2 Kimi K2.5 LLM Benchmark Comparison (Leaderboard Overview)LLM Comparisons LLM Context Window Comparison LLM Size and Parameter Comparison NVIDIA Dynamo

What is Kimi K2.6?

Who makes Kimi K2.6 and how does it fit the Kimi line?

When was Kimi K2.6 released?

Architecture and sizes

What can Kimi K2.6 do?

How does Kimi K2.6 perform?

Is Kimi K2.6 open source, and how much does it cost?

Reception

What are the limitations of Kimi K2.6?

See also

References

Improve this article

Related Articles

Qwen

DeepSeek V4

Kimi K2

DeepSeek V3

Hunyuan

GLM-4.5

What links here

Related Articles

Qwen

DeepSeek V4

Kimi K2

DeepSeek V3

Hunyuan

GLM-4.5

What links here