Open weights
Last reviewed
May 24, 2026
Sources
No citations yet
Review status
Needs citations
Revision
v1 · 4,778 words
Open weights, also rendered as open-weight models, refers to the practice of publishing the trained parameter tensors of a neural network so that anyone can download, run, fine-tune, and redistribute the model, within the terms of the accompanying license, without obtaining a separate inference license from the developer. The term came into wide use in 2023 and 2024 to distinguish releases such as Meta's Llama 2, Mistral 7B, and DeepSeek V3 both from fully proprietary systems and from "open source AI" in the strict sense codified by the Open Source Initiative in October 2024.[1][2] An open-weights release typically includes a model card and the weight files in a format such as safetensors or GGUF, but it generally omits the full training code, the data filtering pipeline, the exact hyperparameters, and any detailed account of the training corpus, all of which the Open Source Initiative's definition requires for a system to qualify as open source.[1] Llama 2, released in July 2023, is most widely cited as the moment open-weights releases became commercially mainstream; the point at which they reached frontier capability is generally placed in late 2024 and early 2025, with DeepSeek V3 and DeepSeek R1.[3][4][5]
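As an illustration of what a weight file contains, the sketch below writes and then inspects a toy file in the safetensors layout using only the Python standard library. It assumes the publicly documented layout (an 8-byte little-endian header length, a JSON header mapping tensor names to dtype, shape, and byte offsets, then the raw tensor bytes); the tensor name `toy.weight` and its values are hypothetical, and real checkpoints are normally read with the `safetensors` library rather than by hand.

```python
import json
import struct
import tempfile
from pathlib import Path


def write_toy_safetensors(path: Path) -> None:
    """Write a tiny file in the safetensors layout (toy data, not a real model)."""
    data = struct.pack("<4f", 1.0, 2.0, 3.0, 4.0)  # four float32 values
    header = {
        "__metadata__": {"format": "pt"},
        # Hypothetical tensor name; offsets are relative to the start of the data section.
        "toy.weight": {"dtype": "F32", "shape": [2, 2], "data_offsets": [0, len(data)]},
    }
    header_bytes = json.dumps(header).encode("utf-8")
    with path.open("wb") as f:
        f.write(struct.pack("<Q", len(header_bytes)))  # 8-byte little-endian header size
        f.write(header_bytes)
        f.write(data)


def read_safetensors_header(path: Path) -> dict:
    """Return the JSON header without loading any tensor data."""
    with path.open("rb") as f:
        (header_len,) = struct.unpack("<Q", f.read(8))
        return json.loads(f.read(header_len))


with tempfile.TemporaryDirectory() as tmp:
    toy_path = Path(tmp) / "toy.safetensors"
    write_toy_safetensors(toy_path)
    header = read_safetensors_header(toy_path)

# Everything except the optional metadata entry describes a tensor.
tensors = {name: meta for name, meta in header.items() if name != "__metadata__"}
for name, meta in tensors.items():
    print(name, meta["dtype"], meta["shape"])
```

The header lists only names, shapes, and dtypes, which is why a weight file on its own says nothing about the data or code that produced it.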
Before the term "open weights" stabilized, the dominant phrasing in academic releases was simply "open access" or "publicly available," used for models like BLOOM and OPT-175B in 2022.[6] The earlier convention in deep learning research had been to publish the architecture and training code in an arXiv paper and a GitHub repository, with downloadable checkpoints treated as an implementation detail. When the parameter counts of generative language models reached the hundreds of billions, the weights themselves became valuable artifacts independent of the recipe used to produce them, since reproducing the training run could cost tens of millions of dollars. The phrase "open weights" emerged in 2023 specifically to capture this asymmetry: a developer might give the world the result of training without giving up the proprietary information needed to train it again.[2]
OpenAI's 2019 release of GPT-2 is often described as the first major case in which weights were treated as a separately governed asset. The company initially withheld the 1.5-billion-parameter version, citing concerns about synthetic text generation, and then released progressively larger checkpoints over 2019, with the full model published in November of that year.[7] The "staged release" terminology used in that decision became part of the policy vocabulary that later debates about Llama and DeepSeek inherited.
EleutherAI, a volunteer research collective, produced the first generation of replications of large autoregressive models with publicly downloadable weights. The group released GPT-Neo (125M, 1.3B, and 2.7B parameters) in March 2021, GPT-J-6B on 9 June 2021, and GPT-NeoX-20B on 10 February 2022, all under the Apache 2.0 license.[8] BigScience, a multi-institutional collaboration coordinated by Hugging Face and funded by the French government, trained BLOOM, a 176-billion-parameter multilingual model, on the Jean Zay supercomputer between March and July 2022. It released the weights in July 2022 under a Responsible AI License with use restrictions and the source code under Apache 2.0.[6][9] These early projects established the social and technical norms (model cards, Hugging Face Hub uploads, dataset documentation) that later commercial open-weights releases adopted.
Meta's first LLaMA model, released on 24 February 2023, was distributed to academic researchers under a noncommercial license; the model card and paper described a family of 7B, 13B, 33B, and 65B models trained on up to 1.4 trillion tokens.[10] Within a week the weights leaked on 4chan and proliferated across BitTorrent and Hugging Face mirrors.[10] Rather than litigate the leak indefinitely, Meta accepted the new state of affairs and on 18 July 2023 released Llama 2 (7B, 13B, and 70B parameters) under the bespoke Llama 2 Community License, which permits commercial use but requires companies with more than 700 million monthly active users to request a separate license from Meta, granted at Meta's discretion.[3][11] The Open Source Initiative declined to list the Llama 2 license among approved open-source licenses, and several commentators argued that the 700M cap, the additional acceptable-use policy, and the restriction on using outputs to train competing models made the license a "source-available" rather than open-source instrument.[11]
Llama 3, released on 18 April 2024, came in 8B and 70B sizes, expanded the training set to 15 trillion tokens and the vocabulary to 128,000 tokens, and kept the same community license framework.[12] Llama 3.1 followed on 23 July 2024, adding a 405-billion-parameter dense Transformer that was widely described as the first frontier-class open-weights model. Llama 3.1 405B used a 128,000-token context window and was trained on more than 15 trillion tokens on a cluster of more than 16,000 Nvidia H100 GPUs.[4] Llama 3.2, on 25 September 2024, added 1B and 3B text-only models for edge deployment and 11B and 90B vision-capable models, the latter withheld in the European Union pending interpretation of the EU AI Act.[13]
Llama 4, announced on 5 April 2025, was Meta's first Mixture-of-Experts release. The family included Scout (17B active parameters across 16 experts, marketed with a 10-million-token context window), Maverick (17B active parameters across 128 experts, 400B total), and a Behemoth model with roughly 288B active parameters and approximately 2 trillion total parameters that Meta described as still in training at launch.[14] Mark Zuckerberg argued in an open letter accompanying Llama 3.1 that open weights would prevent concentration of AI in "a small few" hands, an explicit policy frame that critics inside and outside the OSI process disputed.[4][11]
Mistral AI, a Paris-based startup founded in April 2023 by former Meta and Google DeepMind researchers, released Mistral 7B on 27 September 2023 under the Apache 2.0 license.[15] The 7.3-billion-parameter model used Grouped-Query Attention and Sliding-Window Attention, supported sequences up to 8,192 tokens, and on the MMLU and code benchmarks outperformed Llama 2 13B.[15] Because Apache 2.0 imposes no use restrictions, contains a patent grant, and is unambiguously OSI-approved for software, Mistral's release was widely treated as the first commercially significant model that was open-source by every common standard short of the OSI's 2024 AI-specific definition.[15]
On 11 December 2023, Mistral followed with Mixtral 8x7B, a sparse Mixture-of-Experts model with 46.7 billion total parameters and roughly 12.9 billion active parameters per token, also under Apache 2.0. Mixtral was the first MoE language model at that scale released with permissive open weights.[16] Subsequent Mistral releases adopted a tiered strategy: smaller and earlier models (Mistral 7B, Mixtral 8x7B, Mixtral 8x22B) under Apache 2.0, while later flagship models such as Mistral Large 2 (July 2024) used the more restrictive Mistral Research License that allows research but not commercial deployment, and Codestral (May 2024) used the Mistral AI Non-Production License, which permits research and testing but bars commercial use without a paid agreement.[17][18] This pattern, of releasing weights publicly under licenses that look open but contain field-of-use or competitor restrictions, became one of the recurrent flashpoints in the debate over what counts as "open."
DeepSeek, a research lab in Hangzhou backed by the quantitative-trading firm High-Flyer, published DeepSeek V3 weights and the accompanying technical report on 26 December 2024. The model is a 671-billion-parameter Mixture-of-Experts Transformer with 37 billion parameters activated per token, trained on 14.8 trillion tokens using 2.788 million Nvidia H800 GPU-hours.[5][19] The code repository is released under the MIT License, and the weights are covered by a separate model license that permits commercial use, modification, and redistribution. The technical report described an auxiliary-loss-free load-balancing strategy, FP8 mixed-precision training, and a multi-token-prediction objective.[19]
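The gap between total and active parameters in these Mixture-of-Experts releases can be made concrete with a short calculation. The sketch below uses only the parameter counts quoted in this article; the dictionary and function names are illustrative.

```python
# (total parameters, active parameters per token), in billions, as quoted in this article.
MOE_MODELS = {
    "Mixtral 8x7B": (46.7, 12.9),
    "Llama 4 Maverick": (400.0, 17.0),
    "DeepSeek V3": (671.0, 37.0),
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Share of the model's parameters that participate in computing one token."""
    return active_b / total_b


for name, (total_b, active_b) in MOE_MODELS.items():
    share = active_fraction(total_b, active_b)
    print(f"{name}: {share:.1%} of parameters active per token")
```

The full parameter set still has to be stored and loaded, so the active fraction describes per-token compute rather than the memory needed to host the model.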
DeepSeek R1, released on 20 January 2025, applied reinforcement learning to a V3-class base to produce a reasoning model whose math, code, and contest-problem performance was reported to rival OpenAI's o1.[20] The release included R1, R1-Zero (an RL-only ablation), and six distilled variants based on Llama and Qwen backbones with parameter counts from 1.5B to 70B, all under the MIT license.[20] The combination of frontier-level reasoning scores and permissive licensing prompted a measurable spike in Hugging Face downloads from China and was widely cited as the moment open-weights releases became fully competitive with closed flagship models on at least one benchmark axis.[21] In April 2026, DeepSeek published V4 weights on Hugging Face under an MIT-compatible license; V4 introduced a 1.6-trillion-parameter MoE variant and a smaller flash model with hybrid sparse-attention mechanisms.[22]
Alibaba's Qwen team has shipped open-weights models at a pace that few other developers match. Qwen 1.5 (early 2024), Qwen2 (June 2024), and Qwen2.5 (September 2024) released a wide ladder of dense models from 0.5B to 72B parameters under the Apache 2.0 license, with some larger variants using a Qwen Research License.[23] Qwen3, released on 28 April 2025, shipped six dense models (0.6B, 1.7B, 4B, 8B, 14B, and 32B) and two Mixture-of-Experts models (30B with 3B active and 235B with 22B active) under Apache 2.0, trained on 36 trillion tokens, with a hybrid "thinking" and "non-thinking" mode toggle for inference.[24] In November 2024 the Qwen team released QwQ-32B-Preview, a reasoning-focused 32B dense model under Apache 2.0 with claimed performance comparable to OpenAI's o1 on selected math and code benchmarks.[25] By 2025, Alibaba reported more than 300 million downloads of Qwen models and over 100,000 Qwen-based derivative checkpoints published on Hugging Face.[24]
Microsoft's Phi series followed a different design philosophy, emphasizing small dense models trained on heavily curated and synthetic data. Phi-4, a 14-billion-parameter dense model, was published on Azure AI Foundry in December 2024 and then released to Hugging Face under the MIT license in early 2025, alongside Phi-4-reasoning and Phi-4-reasoning-plus, also MIT-licensed.[26] Google's Gemma family, announced in February 2024, started with 2B and 7B models under a custom Gemma license with use restrictions; Gemma 2 (2B, 9B, 27B) shipped later in 2024 and Gemma 3 followed in 2025, with Google moving some later releases toward more permissive terms.[27]
The Technology Innovation Institute in Abu Dhabi released Falcon-40B for research and commercial use on 25 May 2023 and Falcon-180B on 6 September 2023; both were among the earliest large open-weights models from outside the United States and China, and Falcon-180B was distributed under a bespoke license modeled on Apache 2.0 with additional terms governing hosted services.[28] 01.AI, founded by Kai-Fu Lee, released Yi-6B and Yi-34B base models on 2 November 2023, followed by 200K-context variants and chat models later that month; the project later moved toward Apache 2.0 terms.[29]
Zhipu AI in Beijing released GLM-4.5 and GLM-4.5-Air on 28 July 2025 under the MIT license, with 355 billion and 106 billion total parameters respectively and Mixture-of-Experts architectures aimed at agent workloads.[30] Moonshot AI, another Beijing lab, released Kimi K2, a one-trillion-parameter MoE model, on 11 July 2025 under a modified MIT license; subsequent K2-Thinking, K2.5, and K2.6 releases continued in late 2025 and early 2026.[31]
The licenses used for open-weights releases fall into a small number of recognizable families.
| License family | Typical terms | Representative models |
|---|---|---|
| Apache 2.0 / MIT | Permissive, no field-of-use limits, patent grant (Apache) | Mistral 7B, Mixtral, DeepSeek V3, DeepSeek R1, Phi-4, Qwen3, GLM-4.5, Falcon-40B (with addendum), EleutherAI models |
| Llama Community License | Commercial use permitted; companies above 700M monthly active users must request a separate license from Meta; restrictions on training competing models | Llama 2, Llama 3, Llama 3.1, Llama 3.2, Llama 4 |
| Research-only / non-production | Use limited to research and evaluation; commercial deployment requires a paid agreement | Mistral Large 2 (MRL), Codestral (MNPL), some Qwen flagship variants |
| Open RAIL family | Permissive distribution with use-based restrictions on harmful applications | BLOOM, OpenRAIL-M derivatives |
| Custom commercial licenses | Bespoke terms, often with addendum for hosted services or geography | Falcon-180B, Gemma 1 and 2 |
The split between truly permissive licenses (Apache 2.0 and MIT) and the Llama Community License is the most consequential dividing line for downstream use, because the OSI does not recognize the Llama license as open source and several large hosting providers built commercial offerings on Llama-derived models without the 700M-MAU exposure being fully resolved.[11] The Mistral Research License and Codestral's non-production license further illustrate how "open weights" can be combined with terms that block specific commercial scenarios, a practice that critics sometimes call "openwashing."[17]
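The license families in the table above can be encoded as a small lookup, which is roughly how a deployment team might screen candidate models before reading the full license text. The sketch below is a simplified reading of the table rather than of the licenses themselves; the class, keys, and function are hypothetical, and the bespoke "custom commercial" family is left out because its terms vary by model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LicenseFamily:
    name: str
    commercial_use: bool                 # is commercial deployment allowed at all?
    mau_threshold: Optional[int] = None  # above this, a separate license is needed
    use_restrictions: bool = False       # behavioural or field-of-use limits


# Simplified encoding of the table rows (assumption: one flag set per family).
FAMILIES = {
    "permissive": LicenseFamily("Apache 2.0 / MIT", commercial_use=True),
    "llama": LicenseFamily("Llama Community License", True,
                           mau_threshold=700_000_000, use_restrictions=True),
    "research_only": LicenseFamily("Research-only / non-production", False,
                                   use_restrictions=True),
    "open_rail": LicenseFamily("Open RAIL family", True, use_restrictions=True),
}


def needs_separate_agreement(family_key: str, commercial: bool,
                             monthly_active_users: int = 0) -> bool:
    """True if the planned use falls outside what the license family grants by default."""
    family = FAMILIES[family_key]
    if not commercial:
        return False
    if not family.commercial_use:
        return True
    return (family.mau_threshold is not None
            and monthly_active_users > family.mau_threshold)


print(needs_separate_agreement("llama", commercial=True, monthly_active_users=800_000_000))
print(needs_separate_agreement("permissive", commercial=True, monthly_active_users=800_000_000))
```

A screen like this only flags the headline terms; use-based restrictions such as those in the RAIL family still have to be checked against the intended application.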
The Open Source Initiative, the steward of the Open Source Definition for software since 1998, ran a two-year multi-stakeholder process to extend that definition to AI systems. On 28 October 2024, the OSI published the Open Source AI Definition (OSAID) version 1.0 at the All Things Open conference in Raleigh, North Carolina.[1][2] OSAID 1.0 requires that an open-source AI system grant four freedoms (use, study, modify, and share) for any purpose, and it specifies that the "preferred form" for modification must include the complete source code used to train and run the system, the model parameters, and "sufficiently detailed information about the data used to train the system so that a skilled person can build a substantially equivalent system."[1]
The training-data requirement, rather than mandating release of the corpus itself, mandates release of enough description (provenance, scope, characteristics, labeling procedures, publicly available sources, and third-party access pathways) to allow a reasonably skilled team to assemble a substitute corpus.[1] Critics, including parts of the Free Software Foundation community, argued that this concession on training data fell short of true openness; defenders argued that strict corpus-release requirements would prohibit any practical AI system because of copyright, privacy, and contractual constraints on web-scraped text.[1] By the OSAID 1.0 standard, almost all widely cited "open" models, including the entire Llama family and most Chinese open-weights releases, fail to qualify as open source because they do not document the training corpus to that level of detail.[11]
The practical distinction between "open weights" and "open source AI" can be summarized along several dimensions.
| Dimension | Open weights (typical) | Open source AI (OSAID 1.0) |
|---|---|---|
| Weight files | Downloadable from Hugging Face or developer site | Required |
| Inference code | Usually published under permissive license | Required |
| Training code | Often partial or omitted | Required in full |
| Training data | Usually not released, often not described in detail | Detailed description required, including provenance |
| Hyperparameters | Sometimes omitted | Required |
| License | Variable: Apache 2.0, MIT, Llama community, RAIL, custom | OSI-approved AI license |
| Field-of-use restrictions | Sometimes present (Llama 700M cap, Codestral non-production, RAIL behavioral restrictions) | Prohibited |
Because the OSI's definition is the only standardized one for AI, "open weights" has become the conventional umbrella term that covers all of these variants without committing to any single legal characterization.[2] Several developers, notably Meta and Mistral, market their releases as "open source" in public communications, a positioning that the OSI and several civil-society groups have criticized as obscuring the difference.[11]
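The comparison table can also be read as a checklist. The following sketch, with hypothetical field names that mirror the table's rows, reports which OSAID 1.0 components a release lacks; the example release is a made-up "typical" open-weights model, not any specific one.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass(frozen=True)
class Release:
    weights: bool
    inference_code: bool
    training_code: bool
    data_description: bool       # detailed description, not the corpus itself
    hyperparameters: bool
    osi_approved_license: bool
    no_field_of_use_limits: bool


def missing_for_osaid(release: Release) -> List[str]:
    """List the checklist items this release does not satisfy."""
    return [f.name for f in fields(release) if not getattr(release, f.name)]


# Hypothetical permissively licensed open-weights release.
typical_open_weights = Release(
    weights=True,
    inference_code=True,
    training_code=False,
    data_description=False,
    hyperparameters=False,
    osi_approved_license=True,
    no_field_of_use_limits=True,
)

print(missing_for_osaid(typical_open_weights))
```

Even with a permissive license, the hypothetical release falls short on three rows, which is the gap the term "open weights" is meant to name.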
The Hugging Face Hub functions as the de facto distribution platform for open-weights models. By 2025, the Hub hosted more than 2 million public models, more than 500,000 public datasets, and roughly 13 million registered users, with the State of Open Source on Hugging Face report describing average downloads on the order of 15 million model files per day.[32] Concentration is high: roughly half of all downloads target the top 200 model entries, and roughly 92 percent of downloads target models with fewer than one billion parameters, reflecting both the popularity of small fine-tuned classifiers and the prevalence of quantized variants of larger systems.[32] Chinese-developed models accounted for roughly 41 percent of Hub downloads by 2025, a shift Hugging Face attributed largely to the viral spread of DeepSeek R1 and subsequent Qwen, GLM, and Kimi releases.[32]
Beyond the Hub, open-weights models have driven the emergence of a consumer-facing local inference ecosystem. Quantization formats such as GGUF and runtimes such as llama.cpp, Ollama, and LM Studio rely on open-weights releases as their input. Cloud providers also offer hosted inference for open-weights families: Meta partnered with Microsoft Azure, Amazon Web Services, and several smaller serving platforms for Llama distribution.[3][4] Fine-tuning libraries (PEFT, Unsloth, Axolotl) and parameter-efficient adapter formats (LoRA, QLoRA) have built large user communities specifically around open-weights base models.
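Two back-of-the-envelope calculations help explain why quantization and low-rank adapters matter for local use. The sketch below assumes a naive storage cost of `bits_per_weight` bits for every parameter (real GGUF quantization schemes add some overhead for scales) and uses the standard LoRA parameter count for a rank-r adapter on a single weight matrix; the 4096 by 4096 matrix size is a hypothetical example.

```python
def weight_footprint_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate storage for the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


def lora_trainable_params(d_out: int, d_in: int, rank: int) -> int:
    """A rank-r LoRA adapter adds two factors, (d_out x r) and (r x d_in)."""
    return rank * (d_in + d_out)


# A 70B-parameter model (the size of the largest Llama 2) at three precisions.
for bits in (16, 8, 4):
    print(f"70B at {bits}-bit: {weight_footprint_gb(70e9, bits):.0f} GB")

# Hypothetical 4096 x 4096 projection matrix with a rank-16 adapter.
full = 4096 * 4096
adapter = lora_trainable_params(4096, 4096, rank=16)
print(f"LoRA trains {adapter:,} of {full:,} weights ({adapter / full:.2%})")
```

Both calculations ignore activations, the key-value cache, and optimizer state, so they are lower bounds on the memory actually needed.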
The release of Llama 2 in July 2023 reopened a question that GPT-2's staged release had first surfaced four years earlier: whether the distribution of strong generative model weights is itself a public risk. A group of researchers and policy advocates argued that open weights enabled removal of safety fine-tuning, dual-use chemical and biological information retrieval, and the production of non-consensual imagery, and that frontier weights should be treated more like nuclear material than like software.[33] A counterposition, prominent at the National Telecommunications and Information Administration (NTIA), Stanford's Center for Research on Foundation Models, and various civil-society groups, argued that the marginal uplift from open weights over closed APIs was modest, that open weights diversified the field of researchers studying safety, and that pre-emptive restrictions would entrench the position of a small number of US-based labs.[33][34]
The Biden administration's October 2023 executive order on AI used the phrase "dual-use foundation models with widely available weights" to refer to open-weights systems and directed the NTIA to study the question and report back. The NTIA released its Report on Dual-Use Foundation Models with Widely Available Model Weights on 30 July 2024, after a public consultation period that drew 332 comments. The report recommended that the federal government build capacity to monitor risks associated with open-weights releases but declined to recommend immediate restrictions, citing the benefits of decentralization, data confidentiality, and broader research participation.[33] The Biden executive order was rescinded by the incoming administration in January 2025, and a successor framework drafted in 2025 left the open-weights question to agency-level monitoring without imposing new pre-release controls.[33]
The Center for AI Safety's May 2023 single-sentence statement on extinction risk, signed by figures including Geoffrey Hinton, Yoshua Bengio, Sam Altman, Dario Amodei, and Demis Hassabis, framed catastrophic risk from advanced AI as comparable to pandemics and nuclear war.[35] The statement did not directly address open-weights policy, but it shaped a parallel debate about whether the most capable systems should be released to the public at all. Anthropic's 2024 paper "Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training" demonstrated that backdoor behavior introduced during fine-tuning could survive supervised fine-tuning, reinforcement learning from human feedback, and adversarial training, particularly in larger models. Critics of open releases cited this work to argue that downstream consumers of open-weights models cannot verify that a checkpoint has not been tampered with by the upstream provider, while defenders argued that the same paper showed that backdoors could be inserted into closed models too.[36]
The Stanford Foundation Model Transparency Index, launched in October 2023 by researchers at the Center for Research on Foundation Models and the Institute for Human-Centered AI together with collaborators at MIT Media Lab and Princeton's Center for Information Technology Policy, measured 100 indicators of transparency across upstream resources, model properties, and downstream impact. In the inaugural assessment the average score across 10 leading developers was 37 out of 100, and Hugging Face's BLOOMZ and Meta's Llama 2 ranked among the most transparent of the cohort, reflecting the contribution that open-weights releases make to transparency even when they fall short of full open-source standards. The average rose to 58 in the 2024 edition, but the 2025 edition found that it had fallen back to 40, with training data and training compute remaining the most opaque dimensions.[34]
The European Union's AI Act, adopted in 2024, includes specific carve-outs for "general purpose AI models released under free and open source licenses," reducing the documentation burden for compliance with the Act provided that the model parameters and architecture are publicly accessible and the license imposes no use restrictions; models classified as posing "systemic risk" by virtue of their training compute do not benefit from this exception.[37] The interaction between the AI Act's open-source carve-out and the OSI's OSAID 1.0 definition is unresolved; the Act predates OSAID and uses its own informal criteria. The EU AI Act's implementation timeline has driven several developers, including Meta with Llama 3.2 multimodal variants, to delay or withhold releases in EU territory.[13][37]
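To see how a compute-based classification separates models, the sketch below estimates training compute with the common rule of thumb of about 6 floating-point operations per parameter per training token. Two inputs are assumptions not stated in this article: the 10^25 FLOP figure used as the systemic-risk presumption threshold, and the use of active rather than total parameters for the Mixture-of-Experts model. Parameter and token counts are those quoted earlier in the article.

```python
# Assumption: cumulative training compute above which systemic risk is presumed.
SYSTEMIC_RISK_FLOPS = 1e25


def approx_training_flops(params_per_token: float, tokens: float) -> float:
    """Rule-of-thumb estimate: ~6 FLOPs per parameter used per training token."""
    return 6 * params_per_token * tokens


# (parameters used per token, training tokens) as quoted in this article.
models = {
    "Llama 3.1 405B (dense)": (405e9, 15e12),
    "DeepSeek V3 (37B active)": (37e9, 14.8e12),
}

for name, (n_params, n_tokens) in models.items():
    flops = approx_training_flops(n_params, n_tokens)
    print(f"{name}: ~{flops:.1e} FLOPs, above threshold: {flops > SYSTEMIC_RISK_FLOPS}")
```

Under these assumptions the dense 405B model lands above the threshold and the sparse model below it, although a regulator's own compute accounting could differ from this estimate.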
Through 2023 and most of 2024, "open weights" referred almost exclusively to base and instruction-tuned models, while the new category of reasoning models (OpenAI's o1 and o3, Anthropic's extended thinking) remained closed. That changed decisively with DeepSeek R1 in January 2025, which combined a 671B MoE base, large-scale reinforcement learning with verifiable rewards, and distillation into smaller Llama and Qwen backbones, all released under the MIT license.[20] Qwen's QwQ-32B-Preview, released under Apache 2.0 in November 2024, had already opened the field; Microsoft's Phi-4-reasoning (MIT, 2025) and Mistral's Magistral reasoning model, which included an open-weights component, followed in 2025.[25][26]
The distillation pipeline DeepSeek documented in the R1 paper, training a large reasoning model with reinforcement learning and then distilling its chains-of-thought into smaller base models, became a replicable blueprint. The R1-Distill checkpoints based on Llama 3.1 8B and Qwen 2.5 1.5B in particular received heavy downloads and produced a cottage industry of further fine-tunes published on the Hub.[20][21] The arrival of frontier-class reasoning models with permissive licenses prompted a reassessment within frontier labs that had previously treated reasoning training recipes as their primary moat.[21]
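The data-construction half of such a pipeline can be sketched in a few lines. The example below is a toy illustration and not DeepSeek's implementation: a hypothetical `toy_teacher` function stands in for the large reasoning model, an exact-match check on the final answer stands in for a verifiable reward, and only traces that pass the check are kept as supervised fine-tuning examples for a smaller student. Training the student itself is omitted.

```python
import re
from typing import Callable, Dict, List, Optional, Tuple


def extract_answer(trace: str) -> Optional[str]:
    """Pull the final answer out of a reasoning trace of the form '... Answer: X'."""
    match = re.search(r"Answer:\s*(\S+)", trace)
    return match.group(1) if match else None


def verifiable_reward(trace: str, reference: str) -> float:
    """1.0 if the trace's final answer matches the known reference, else 0.0."""
    return 1.0 if extract_answer(trace) == reference else 0.0


def build_distillation_set(
    problems: List[Tuple[str, str]],
    teacher: Callable[[str, int], str],
    samples_per_problem: int = 2,
) -> List[Dict[str, str]]:
    """Sample teacher traces and keep the first verified one per problem."""
    dataset = []
    for prompt, reference in problems:
        for attempt in range(samples_per_problem):
            trace = teacher(prompt, attempt)
            if verifiable_reward(trace, reference) == 1.0:
                dataset.append({"prompt": prompt, "completion": trace})
                break
    return dataset


def toy_teacher(prompt: str, attempt: int) -> str:
    """Hypothetical teacher: deliberately wrong on the first attempt, right on the second."""
    a, b = map(int, re.findall(r"\d+", prompt))
    guess = a + b + (1 if attempt == 0 else 0)
    return f"Add {a} and {b} step by step. Answer: {guess}"


problems = [("What is 2 + 3?", "5"), ("What is 10 + 7?", "17")]
sft_data = build_distillation_set(problems, toy_teacher)
for row in sft_data:
    print(row)
```

Filtering on a checkable answer is what lets the student learn from the teacher's reasoning without inheriting its unverified mistakes.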
The major American frontier labs (OpenAI, Anthropic, and, for most of its flagship models, Google DeepMind) have continued to release their most capable systems as closed-weights API products. Anthropic has been explicit that its non-proliferation rationale drives this choice, with senior leadership describing the risk of frontier weights being acquired by hostile state actors or organized criminal groups as the dominant consideration.[38] OpenAI published no open-weights language models between the GPT-2 family in 2019 and the Apache 2.0-licensed gpt-oss models in August 2025, although it did release the Whisper speech models, and it had released no GPT-3, GPT-4, or GPT-5 family weights through May 2026.[7][38] Google has shipped its Gemma series as open weights but kept Gemini models closed.[27]
The closed-weights labs argue that the gap between their flagship products and the best open-weights systems remains meaningful on benchmarks that combine reasoning, tool use, and multilingual coverage, but in 2025 and 2026 that gap narrowed substantially. DeepSeek R1, Qwen3, Kimi K2 Thinking, and GLM-4.5 each came within single-digit percentage points of leading closed APIs on widely cited benchmarks such as MMLU-Pro, GPQA, and SWE-Bench Verified, and on cost per token, open-weights deployments routinely undercut closed APIs by an order of magnitude.[21][22][31] The closed-versus-open dynamic that observers expected to widen has instead oscillated, with each side leapfrogging the other across successive releases.
Several questions remain unsettled. The first is whether the OSI's OSAID 1.0 will achieve the same regulatory standing in AI that the Open Source Definition holds in software; as of mid-2026 no major jurisdiction has explicitly adopted OSAID by reference, and the most influential open-weights licenses (Apache 2.0, MIT, the Llama community license) predate it and were not designed against its criteria.[1][2] The second is the status of training-data documentation: even open-source-friendly developers like Mistral and DeepSeek have published only summary information about their training corpora, and reproducing a true substitute corpus from those descriptions remains practically impossible.[15][19]
A third question is whether open weights actually reduce concentration in the AI industry. Critics have observed that the cost of producing competitive frontier weights has continued to rise: DeepSeek V3 reportedly cost $5.6 million in GPU time to train but required infrastructure investments that few groups outside very large companies can muster, so the resulting open releases, while widely usable, originate from a small set of well-resourced labs.[19][21] A fourth concerns the durability of permissive licensing: some developers, including Mistral and Stability AI, have shifted later releases away from Apache 2.0 toward more restrictive research-only and non-production licenses, suggesting that permissive licensing may have been a market-entry tactic rather than a stable policy.[17][18] A fifth is safety: the technical literature on red-teaming open-weights models (jailbreaks, fine-tuning attacks, weight tampering for backdoor insertion) has continued to mature, but the policy literature has not converged on a stable threshold above which a model should not be released openly.[33][36]
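The $5.6 million figure follows from the GPU-hour total quoted earlier in this article once a rental rate is chosen. The sketch below assumes a rate of $2 per H800 GPU-hour, which is not stated in this article and is used here because it reproduces the reported number; the result covers GPU time for the final training run only, not research, failed runs, staff, or hardware purchases.

```python
GPU_HOURS = 2.788e6            # H800 GPU-hours quoted in this article for DeepSeek V3
ASSUMED_RATE_USD = 2.00        # assumption: rental price per GPU-hour

cost_usd = GPU_HOURS * ASSUMED_RATE_USD
print(f"Estimated GPU-time cost: ${cost_usd / 1e6:.2f} million")
```

Because the estimate scales linearly with the assumed rate, a different rental price would shift the headline figure in proportion.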
The term "open weights" itself is contested. Some open-source advocates argue that it legitimizes "openwashing" by giving a respectable label to releases that fail to meet the substantive transparency standards of open-source software. Other observers argue that the term is descriptively useful precisely because it does not commit to any single license or definition and lets users see the actual terms each developer offers. By 2026 the phrase had become the standard industry term, used routinely in NTIA documents, EU AI Act compliance guidance, Hugging Face product copy, and the press releases of every major developer that publishes downloadable model parameters.[2][33][32]