# Kimi K2.6

> Source: https://aiwiki.ai/wiki/kimi_k2_6
> Updated: 2026-06-28
> Categories: Chinese AI, Large Language Models, Open Source AI
> From AI Wiki (https://aiwiki.ai), a free encyclopedia of artificial intelligence. Quote with attribution.

**Kimi K2.6** is an open-weight, trillion-parameter [mixture of experts](/wiki/mixture_of_experts) (MoE) large language model released by [Moonshot AI](/wiki/moonshot_ai) on 20 April 2026 for agentic coding and long-horizon autonomous execution. At launch it was the strongest publicly available open-weights model, ranking #1 among open models and #4 overall on the Artificial Analysis Intelligence Index with a score of 54, trailing only Anthropic, Google, and OpenAI (all 57).[1][2][3] Its headline feature is Agent Swarm, which lets a single request fan out across as many as 300 parallel sub-agents executing up to 4,000 coordinated steps.[1][5][12]

Kimi K2.6 succeeds [Kimi K2](/wiki/kimi_k2), the July 2025 base release, the reasoning-focused [Kimi K2 Thinking](/wiki/kimi_k2_thinking), and the [Kimi K2.5](/wiki/kimi_k2_5) update. Unlike the earlier text-only K2 models, K2.6 is natively multimodal, accepting image and video input alongside text, and it ships with a configurable reasoning mode inherited from the Thinking line.[1][4]

## What is Kimi K2.6?

Kimi K2.6 combines a very large MoE backbone (roughly 1 trillion total parameters with about 32 billion active per token) with native vision support and an agent-orchestration layer aimed at production software work. Moonshot describes the model as built for "state-of-the-art coding, long-horizon execution, and agent swarm capabilities," emphasizing end-to-end coding, code-driven interface design, and the ability to run autonomously for extended sessions.[1][12] The headline feature is its Agent Swarm capability, which lets a single request fan out across as many as 300 parallel sub-agents executing up to 4,000 coordinated steps.[1][5][12]

The model is distributed under a permissive license with weights published on [Hugging Face](/wiki/hugging_face), and it is served through Moonshot's own API as well as third-party platforms including Cloudflare Workers AI and Microsoft Foundry. As with prior releases in the series, Moonshot priced inference well below comparable proprietary frontier models, a positioning that featured prominently in launch coverage.[2][3][6][7]

## Who makes Kimi K2.6 and how does it fit the Kimi line?

Moonshot AI is a Beijing-based artificial intelligence company that develops the Kimi family of models and the Kimi assistant. Its K2 series established Moonshot as a leading developer of open-weight frontier models. The original [Kimi K2](/wiki/kimi_k2) shipped in mid-2025 as a 1-trillion-parameter MoE model with 32 billion active parameters, released in base and instruction-tuned variants. [Kimi K2 Thinking](/wiki/kimi_k2_thinking) extended that backbone with an explicit chain-of-thought reasoning mode and interleaved tool calling, and it introduced native INT4 quantization for the series. The intermediate [Kimi K2.5](/wiki/kimi_k2_5) release advanced agentic and coding performance and brought an earlier version of the agent-swarm system.[1][4]

Kimi K2.6 keeps the same overall MoE configuration of roughly 1T total and 32B active parameters as its predecessors, while adding native multimodality, a substantially larger context window than the original K2, and a scaled-up swarm system.[1][2][4]

## When was Kimi K2.6 released?

Moonshot released Kimi K2.6 on 20 April 2026, publishing the open weights on Hugging Face and making the model available through its API at the same time. The release date is corroborated by Cloudflare, which listed the model on Workers AI on the same day, by Moonshot's own announcement coverage, and by independent launch reporting. Artificial Analysis dated its own analysis to 21 April 2026.[2][3][8][12]

The launch was accompanied by integrations across multiple inference providers and developer tools, and the model quickly accumulated millions of downloads on Hugging Face in its first month.[1][6]

## Architecture and sizes

Kimi K2.6 is a sparse mixture-of-experts transformer. Moonshot's published model card discloses the following configuration.[1]

| Component | Value |
| --- | --- |
| Total parameters | ~1 trillion |
| Active parameters per token | ~32 billion |
| Number of layers | 61 (including 1 dense layer) |
| Number of experts | 384 |
| Selected experts per token | 8 |
| Shared experts | 1 |
| Attention heads | 64 |
| Attention mechanism | Multi-head latent attention (MLA) |
| Attention hidden dimension | 7,168 |
| MoE hidden dimension (per expert) | 2,048 |
| Activation function | SwiGLU |
| Vision encoder | MoonViT (~400M parameters) |
| Vocabulary size | ~160,000 |
| Context length | 256K (262,144 tokens) |
| Native quantization | INT4 |

The model accepts text, image, and video inputs in a single unified architecture rather than bolting a vision module onto a text core, with the MoonViT encoder handling visual tokens. Native INT4 quantization, carried over from [Kimi K2 Thinking](/wiki/kimi_k2_thinking), is intended to reduce memory footprint and improve inference throughput without a separate post-training compression step. Moonshot recommends serving the model with inference engines such as [vLLM](/wiki/vllm), [SGLang](/wiki/sglang), and KTransformers.[1]

Like K2 Thinking, K2.6 supports an interleaved "thinking" mode and can preserve its reasoning content across multi-turn interactions, alongside an "instant" mode for lower-latency responses. Moonshot's recommended sampling settings differ by mode, with a higher temperature suggested for thinking mode and a lower one for instant mode.[1]

## What can Kimi K2.6 do?

Moonshot positions Kimi K2.6 primarily as an agentic and coding model. Its documented strengths include:

- **Long-horizon coding.** End-to-end software tasks across multiple programming languages, including Rust, Go, and Python, spanning front-end, DevOps, and performance work.[1][5][12]
- **Coding-driven design.** Turning text prompts and visual references into production-ready interfaces and lightweight full-stack workflows, generating structured layouts, interactive elements, and animations. Moonshot says the model can "turn simple prompts into complete front-end interfaces."[1][5][12]
- **Agent Swarm.** Horizontal scaling to as many as 300 specialized sub-agents executing up to 4,000 coordinated steps in a single run, decomposing a task into parallel, domain-specialized subtasks that produce documents, websites, or spreadsheets autonomously. This roughly triples the sub-agent count and more than doubles the step budget relative to the swarm system in [Kimi K2.5](/wiki/kimi_k2_5), which Moonshot describes as "scaling out, not just up."[1][5][12]
- **Reasoning and tool use.** A configurable thinking mode with interleaved, multi-step tool calling and structured outputs, plus native vision for document and image understanding.[1][4]

Launch coverage also reported extended autonomous operation, with some sources citing autonomous runs lasting well beyond ten hours, though such figures come largely from Moonshot's own materials and early integrations rather than independent measurement.[5]

## How does Kimi K2.6 perform?

The scores below are drawn from Moonshot's published model card unless otherwise noted. As is typical for a launch, several figures are self-reported, and some agentic benchmarks are sensitive to the evaluation harness used; independent third-party replication was still in progress at release.[1][9]

| Benchmark | Kimi K2.6 | Notes / source |
| --- | --- | --- |
| Artificial Analysis Intelligence Index (v4.0) | 54 | #1 open-weights; #4 overall, behind models scoring 57 [2] |
| AIME 2026 | 96.4 | Math reasoning [1] |
| GPQA Diamond | 90.5 | Graduate-level science [1] |
| LiveCodeBench (v6) | 89.6 | Competitive coding [1] |
| SWE-bench Verified | 80.2 | Software engineering [1] |
| SWE-Bench Pro | 58.6 | Real-world agentic SWE [1] |
| Terminal-Bench 2.0 | 66.7 | Terminal/agent coding [1] |
| Humanity's Last Exam (with tools) | 54.0 | Leading among compared models [2][9] |
| GDPval-AA (Elo) | 1520 | Up from K2.5's 1309 [2] |
| τ²-Bench Telecom | 96 | Tool use [2] |
| BrowseComp | 86.3 (Agent Swarm) | Web browsing/research [1] |
| DeepSearchQA (F1) | 92.5 | Deep research [1] |
| MMMU-Pro | 79.4 | Multimodal understanding [1] |
| MathVision (with Python) | 93.2 | Visual math [1] |

Artificial Analysis reported that K2.6's agentic performance rose sharply over the prior release, with its GDPval-AA Elo climbing to 1520 from Kimi K2.5's 1309, while it "maintained a 96% score on τ²-Bench Telecom," placing it among frontier models on tool use. To complete the full Intelligence Index, K2.6 consumed roughly 160 million reasoning tokens.[2]

On SWE-Bench Pro, a benchmark designed to track real agentic production work, multiple sources reported K2.6 ahead of or level with leading proprietary models at the time of launch.[9][10][12]

| Model | SWE-Bench Pro |
| --- | --- |
| Kimi K2.6 | 58.6 |
| GPT-5.4 | 57.7 |
| Claude Opus 4.6 | 53.4 |

On pure reasoning benchmarks the picture was closer, with some proprietary models still ahead. For example, on AIME 2026 GPT-5.4 was reported at 99.2 versus K2.6's 96.4, and on GPQA Diamond at 92.8 versus 90.5.[9] Artificial Analysis's own composite placed K2.6 at index 54, the top open-weights score but a few points behind the strongest proprietary creators at 57.[2]

## Is Kimi K2.6 open source, and how much does it cost?

Kimi K2.6 is released under a Modified MIT License, the same permissive license Moonshot has used across the K2 series. The weights are openly downloadable, and the model is offered both through Moonshot's first-party API and via several third-party inference platforms.[1][7]

| Aspect | Detail |
| --- | --- |
| License | Modified MIT License [1] |
| Weights | `moonshotai/Kimi-K2.6` on Hugging Face [1] |
| Official API | platform.moonshot.ai / platform.kimi.ai (OpenAI- and Anthropic-compatible) [1][11] |
| Chat interface | kimi.com [1] |
| Third-party hosting | Cloudflare Workers AI, Microsoft Foundry, and others [3][7] |
| API price, input (cache miss) | $0.95 per 1M tokens [11] |
| API price, input (cache hit) | $0.16 per 1M tokens [11] |
| API price, output | $4.00 per 1M tokens [11] |

The official input and output prices place K2.6 well below comparable proprietary frontier models, a point emphasized across launch coverage; reported prices on resellers and aggregators vary, and Artificial Analysis listed a blended input figure of about $0.95 per 1M tokens with output at $4.00.[2][6][11]

## Reception

Early reception focused on Kimi K2.6's standing as the leading open-weights model and on its aggressive price-to-performance ratio for coding and agentic work. Artificial Analysis stated plainly that "Moonshot's Kimi K2.6 is the new leading open weights model," noting it landed at #4 on its Intelligence Index (54) "behind only Anthropic, Google, and OpenAI (all 57)."[2]

Commentators highlighted the Agent Swarm system and the model's coding results, particularly the SWE-Bench Pro figure, as evidence that open-weights models had reached parity with or surpassed proprietary frontier models on some agentic software tasks. Several writers framed the release as a notable moment for open models in the coding-agent space.[5][6][10][12]

## What are the limitations of Kimi K2.6?

Independent observers cautioned that many of the launch benchmark numbers were self-reported and that full third-party validation would take time. Agentic benchmarks such as Terminal-Bench in particular are harness-dependent, and reported scores for competing models varied substantially depending on the evaluation setup, complicating direct comparisons.[9]

Artificial Analysis also noted that, despite a markedly lower hallucination rate of 39% (down from [Kimi K2.5](/wiki/kimi_k2_5)'s 65%) on its knowledge benchmark, K2.6 still produced incorrect answers on a substantial fraction of factual questions, and its overall intelligence score remained a few points below the leading proprietary models.[2] As with any very large MoE model, practical deployment of the full-precision or even INT4 weights demands significant hardware, which constrains fully local use to well-resourced setups.[2]

## See also

- [Kimi K2](/wiki/kimi_k2)
- [Kimi K2 Thinking](/wiki/kimi_k2_thinking)
- [Kimi K2.5](/wiki/kimi_k2_5)
- [Moonshot AI](/wiki/moonshot_ai)
- [Mixture of experts](/wiki/mixture_of_experts)
- [Open weights](/wiki/open_weights)

## References

1. [moonshotai/Kimi-K2.6 model card, Hugging Face](https://huggingface.co/moonshotai/Kimi-K2.6)
2. [Kimi K2.6: The new leading open weights model, Artificial Analysis](https://artificialanalysis.ai/articles/kimi-k2-6-the-new-leading-open-weights-model)
3. [Moonshot AI Kimi K2.6 now available on Workers AI, Cloudflare Changelog](https://developers.cloudflare.com/changelog/post/2026-04-20-kimi-k2-6-workers-ai/)
4. [Kimi K2.6 by Moonshot AI: Open-Weight Model, DataNorth](https://datanorth.ai/news/moonshot-ai-releases-kimi-k2-6)
5. [What Is Kimi K2.6? Moonshot AI's Open-Weight Agent Model Explained, Verdent Guides](https://www.verdent.ai/guides/what-is-kimi-k2-6)
6. [Kimi K2.6: 1T MoE Open Weights, Agent Swarm, Pricing (2026), Codersera](https://codersera.com/blog/kimi-k2-6-complete-guide-2026/)
7. [kimi-k2.6 (Moonshot AI), Cloudflare Workers AI docs](https://developers.cloudflare.com/workers-ai/models/kimi-k2.6/)
8. [Moonshot AI Kimi K2.6 now available on Workers AI (changelog), Cloudflare](https://developers.cloudflare.com/changelog/post/2026-04-20-kimi-k2-6-workers-ai/)
9. [Kimi K2.6 Review and Benchmarks, BuildFastWithAI](https://www.buildfastwithai.com/blogs/kimi-k2-6-review-benchmarks)
10. [Kimi K2.6 Has Arrived: An Open-Weight Powerhouse for Agentic Work, Kilo](https://blog.kilo.ai/p/kimi-k26-has-arrived-an-open-weight)
11. [Kimi K2.6 API pricing, platform.kimi.ai](https://platform.kimi.ai/docs/pricing/chat-k26)
12. [Moonshot AI Releases Kimi K2.6 with Long-Horizon Coding, Agent Swarm Scaling to 300 Sub-Agents and 4,000 Coordinated Steps, MarkTechPost](https://www.marktechpost.com/2026/04/20/moonshot-ai-releases-kimi-k2-6-with-long-horizon-coding-agent-swarm-scaling-to-300-sub-agents-and-4000-coordinated-steps/)

