LLaMA/Model Card

Updated Jul 16, 2026

See also: LLaMA, Llama 2, Llama 3, Llama 4, Model card

The Llama model card is the official documentation that Meta AI ships with each release of the Llama family of large language models. The first card appeared in February 2023 inside facebookresearch/llama as MODEL_CARD.md, and every subsequent generation (Llama 2 in July 2023, Llama 3 in April 2024, Llama 3.1, 3.2, and 3.3 between July and December 2024, Llama 4 in April 2025) has shipped its own.^[2] The cards follow the structure from Mitchell et al.'s 2019 Model Cards for Model Reporting paper,^[9] and they are a primary reference for what each Llama checkpoint was trained on, what it can be used for under the Llama Community License, and what risks Meta has flagged.

The cards are also a focal point for criticism. Watchers of open source AI note that while the cards list training-data buckets and aggregate statistics, they do not enumerate the specific datasets, web crawls, books, or licensed corpora used. This has led the Open Source Initiative to argue that Llama is not open source in the traditional sense.^[14]^[15] The cards sit at the intersection of two debates: how much information vendors should publish about foundation models, and what counts as "open" in the era of trillion-token training runs.

Background and framework

The model card concept comes from Mitchell et al.'s paper Model Cards for Model Reporting, presented at FAT* 2019 (the conference since renamed ACM FAccT), which proposed a structured document with roughly 30 disclosures across nine sections: Model Details, Intended Use, Factors, Metrics, Evaluation Data, Training Data, Quantitative Analyses, Ethical Considerations, and Caveats and Recommendations.^[9] Meta's first Llama card followed this template closely, with eight sections that fold Caveats and Recommendations into Ethical Considerations.^[2] Hugging Face later codified a similar template inside its huggingface_hub library, so any model uploaded to the Hub gets a default stub with the same headings.^[17]
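The relationship between the nine-section template and the eight-section Llama card can be sketched as a simple fold. This is only an illustration: the headings are the ones listed above from Mitchell et al., the `FOLDED_INTO` mapping and `fold_sections` helper are hypothetical names, and the exact heading wording in Meta's card may differ.

```python
# The nine section headings proposed by Mitchell et al., as listed above.
MITCHELL_SECTIONS = [
    "Model Details",
    "Intended Use",
    "Factors",
    "Metrics",
    "Evaluation Data",
    "Training Data",
    "Quantitative Analyses",
    "Ethical Considerations",
    "Caveats and Recommendations",
]

# Hypothetical mapping: a section on the left is absorbed by the one on the right.
FOLDED_INTO = {"Caveats and Recommendations": "Ethical Considerations"}


def fold_sections(sections, folded_into):
    """Return the headings that remain after folding, preserving order."""
    return [name for name in sections if name not in folded_into]


llama1_sections = fold_sections(MITCHELL_SECTIONS, FOLDED_INTO)
print(len(MITCHELL_SECTIONS), "->", len(llama1_sections))
```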

Original Llama card (February 2023)

The first Llama card describes the original LLaMA release: an auto-regressive transformer language model in four sizes (7B, 13B, 33B, 65B), trained between December 2022 and February 2023 by Meta's FAIR team under a bespoke non-commercial license limited to approved researchers.^[2] The card frames the model as a research tool, points to the paper LLaMA: Open and Efficient Foundation Language Models by Touvron et al. for technical detail,^[1] and routes questions through the GitHub repo.

Intended use and factors

The primary intended use is research on large language models, including question answering, natural language understanding, reading comprehension, capability studies, and bias and toxicity evaluation. Primary users are researchers in natural language processing, machine learning, and artificial intelligence. The card states LLaMA is a base model that should not be deployed downstream without risk evaluation, since it has not been trained with human feedback and can generate toxic, offensive, incorrect, or unhelpful content. Language is the main performance factor: 20 languages appear in the training data but English dominates. Bias was evaluated on Responsible AI (RAI) datasets covering gender, religion, race, sexual orientation, age, nationality, disability, physical appearance, and socioeconomic status.^[2]

Metrics, datasets, and training data

Metrics include accuracy on common-sense reasoning, reading comprehension, and MMLU; exact match on question answering; and toxicity scores from the Perspective API. Only one model of each size was trained, so pretraining variability is not quantified. Benchmarks include BoolQ, PIQA, SIQA, HellaSwag, WinoGrande, ARC, OpenBookQA, NaturalQuestions, TriviaQA, RACE, MMLU, BIG-bench hard, GSM8K, RealToxicityPrompts, WinoGender, and CrowS-Pairs. The training mix is CCNet (67%), C4 (15%), GitHub (4.5%), Wikipedia (4.5%), Books (4.5%), ArXiv (2.5%), Stack Exchange (2%), with Wikipedia and Books covering 20 languages.^[1]^[2]
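As a rough illustration of what the mix implies, the sketch below checks that the listed proportions sum to 100% and splits a token budget across the sources. It assumes the percentages are sampling proportions applied to the 1.4T-token budget of the two larger models, so the results approximate tokens seen during training, not the size of each underlying dataset. The function name is hypothetical.

```python
# Sampling proportions (percent) as listed in the paragraph above.
TRAINING_MIX = {
    "CCNet": 67.0,
    "C4": 15.0,
    "GitHub": 4.5,
    "Wikipedia": 4.5,
    "Books": 4.5,
    "ArXiv": 2.5,
    "Stack Exchange": 2.0,
}


def tokens_per_source(mix_percent, total_tokens):
    """Split a token budget across sources in proportion to the mix."""
    total_percent = sum(mix_percent.values())
    if abs(total_percent - 100.0) > 1e-9:
        raise ValueError(f"mix sums to {total_percent}%, expected 100%")
    return {name: total_tokens * pct / 100.0 for name, pct in mix_percent.items()}


# Assumption: the 1.4T-token budget reported for the 33B and 65B models.
shares = tokens_per_source(TRAINING_MIX, 1.4e12)
for name, tokens in shares.items():
    print(f"{name:15s} {tokens / 1e9:7.1f}B tokens")
```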

Quantitative analysis

Hyperparameters for the model architecture:

Parameters	Dimension	n heads	n layers	Learning rate	Batch size	n tokens
7B	4,096	32	32	3.0E-04	4M	1T^[1]
13B	5,120	40	40	3.0E-04	4M	1T
33B	6,656	52	60	1.5E-04	4M	1.4T
65B	8,192	64	80	1.5E-04	4M	1.4T
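The nominal sizes can be roughly recovered from the dimension and layer count in the table. The sketch below uses a back-of-the-envelope estimate, not Meta's own accounting. It assumes four attention projection matrices per layer, a gated feed-forward block with hidden size of about 8/3 of the model dimension (as described in the LLaMA paper), a 32,000-token vocabulary, and untied input and output embeddings. None of these appear in the table itself, and normalization weights are ignored.

```python
VOCAB_SIZE = 32_000  # assumption: not stated in the table above

# (dimension, n_layers) per nominal size, taken from the table above.
CONFIGS = {
    "7B": (4096, 32),
    "13B": (5120, 40),
    "33B": (6656, 60),
    "65B": (8192, 80),
}


def estimate_params(dim, n_layers, vocab_size=VOCAB_SIZE):
    """Approximate parameter count; ignores norm weights and hidden-size rounding."""
    attention = 4 * dim * dim          # query, key, value, and output projections
    feed_forward = 8 * dim * dim       # three matrices with hidden size ~ 8/3 * dim
    embeddings = 2 * vocab_size * dim  # input embedding plus untied output head
    return n_layers * (attention + feed_forward) + embeddings


for name, (dim, n_layers) in CONFIGS.items():
    print(f"{name:>4s}: ~{estimate_params(dim, n_layers) / 1e9:.2f}B parameters")
```

The estimates land slightly under the nominal sizes, which is expected because real implementations round the feed-forward hidden size up.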

Results on eight standard common-sense reasoning benchmarks (ARC is reported as separate easy and challenge columns, giving nine columns):

Parameters	BoolQ	PIQA	SIQA	HellaSwag	WinoGrande	ARC-e	ARC-c	OBQA	COPA
7B	76.5	79.8	48.9	76.1	70.1	76.7	47.6	57.2	93^[1]
13B	78.1	80.1	50.4	79.2	73.0	78.1	52.7	56.4	94
33B	83.1	82.3	50.4	82.8	76.0	81.4	57.8	58.6	92
65B	85.3	82.8	52.3	84.2	77.0	81.5	56.0	60.2	94

Results on bias evaluation (lower is better):

No	Category	FAIR LLM
1	Gender	70.6^[2]
2	Religion	79.0
3	Race/Color	57.0
4	Sexual orientation	81.0
5	Age	70.1
6	Nationality	64.2
7	Disability	66.7
8	Physical appearance	77.8
9	Socioeconomic status	71.5
	LLaMA average	66.6

Ethical considerations

The card flags training data as a source of bias and states the model is not intended to inform decisions about matters central to human life. Mitigations include filtering web data based on proximity to Wikipedia text and references, using a Kneser-Ney language model and a fastText linear classifier. The card warns about hallucinations and harmful generations, and lists misinformation generation as a fraught use case.^[2]

Llama 2 model card (July 2023)

Published on 18 July 2023, this card marks the first major scope change: Llama 2 shipped under a custom commercial license, so the card had to address commercial deployment.^[3] It documents three pretrained sizes (7B, 13B, 70B) and three corresponding Llama 2 Chat variants fine-tuned for dialogue with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).^[3]

Key numbers:

Field	Value
Sizes	7B, 13B, 70B
Architecture	Auto-regressive transformer with grouped-query attention on 70B
Pretraining tokens	2.0 trillion^[3]
Context length	4,096 tokens
Pretraining cutoff	September 2022
Fine-tuning data cutoff	up to July 2023
Training period	January 2023 through July 2023
Fine-tuning examples	over 1 million human-annotated examples
License	Llama 2 Community License (commercial use, with 700M MAU clause)

The card describes the pretraining mix only as "a new mix of publicly available online data" without naming datasets, and emphasizes that "neither the pretraining nor the fine-tuning datasets include Meta user data."^[3] The shift from the named buckets in the Llama 1 card to this high-level description has been cited repeatedly in press coverage and lawsuits. Llama 2 is also the first card with a Hardware and Software section: pretraining used Meta's Research Super Cluster and production clusters with NVIDIA A100-80GB GPUs at 350W to 400W TDP, totalling roughly 3.31 million GPU hours and 539 tCO2eq location-based emissions, which Meta says were 100% offset.^[3]
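The hardware figures above allow a simple sanity check on the emissions number. The sketch below assumes every GPU hour was drawn at the stated TDP and ignores all other power use (host servers, networking, cooling), so it is a rough bound, not a reconstruction of Meta's methodology. The implied grid intensity is derived here, not quoted from the card.

```python
GPU_HOURS = 3.31e6       # rounded total from the card
TDP_WATTS = (350, 400)   # per-GPU power range stated in the card
EMISSIONS_T = 539.0      # tCO2eq, location-based


def energy_mwh(gpu_hours, watts):
    """Energy in MWh if each GPU hour draws `watts` (watt-hours / 1e6)."""
    return gpu_hours * watts / 1e6


for watts in TDP_WATTS:
    mwh = energy_mwh(GPU_HOURS, watts)
    intensity = EMISSIONS_T / mwh * 1000  # kgCO2eq per MWh
    print(f"{watts} W: {mwh:,.0f} MWh, implied {intensity:.0f} kgCO2eq/MWh")
```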

Llama 3 family cards (2024)

Llama 3 (18 April 2024) covers 8B and 70B pretrained and instruction-tuned variants with grouped-query attention, an 8,192-token context, a 128,000-token vocabulary, and pretraining on "over 15 trillion tokens of data from publicly available sources." The 8B has a March 2023 cutoff and the 70B a December 2023 cutoff; training took 1.3M and 6.4M H100-80GB GPU hours respectively, with combined emissions of 2,290 tCO2eq location-based and 0 market-based.^[4]

The Llama 3.1 card (23 July 2024) adds the 405B variant. It records a 128k context, December 2023 cutoff, and 39.3M H100-80GB GPU hours total (1.46M / 7.0M / 30.84M). Emissions: 11,390 tCO2eq location-based, 0 market-based. Eight supported languages: English, German, French, Italian, Portuguese, Hindi, Spanish, Thai.^[5]
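The per-size GPU hours quoted above can be checked against the stated total and turned into shares. This is a small arithmetic sketch; mapping the three figures to the 8B, 70B, and 405B models in that order is an assumption based on the order in which the sizes are listed.

```python
# Millions of H100-80GB GPU hours; size labels assume the listed order.
GPU_HOURS_M = {"8B": 1.46, "70B": 7.0, "405B": 30.84}

total = sum(GPU_HOURS_M.values())
gpu_hour_shares = {size: hours / total for size, hours in GPU_HOURS_M.items()}

print(f"total: {total:.2f}M GPU hours")
for size, share in gpu_hour_shares.items():
    print(f"{size:>5s}: {share:.1%} of training compute")
```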

Llama 3.2 (released 25 September 2024, with quantized variants added on 24 October 2024) splits into a text card for the 1B (1.23B) and 3B (3.21B) on-device models and MODEL_CARD_VISION.md for the 11B and 90B vision models. The text card highlights quantized variants (SpinQuant and QLoRA) that cut memory 39% to 49% and double Android prefill speed, with up to 9 trillion training tokens.^[6]

Llama 3.3 (6 December 2024) documents a single 70B Instruct checkpoint with the same 15T mix, 128k context, and December 2023 cutoff as 3.1, but with fine-tuning gains over Llama 3.1 70B Instruct: HumanEval 80.5 to 88.4 pass@1, MATH 68.0 to 77.0, IFEval 87.5 to 92.1, MGSM 86.9 to 91.1.^[7]

A quick comparison across the family:

Card	Release	Sizes	Context	Pretraining tokens	Knowledge cutoff
Llama 1	Feb 2023	7B, 13B, 33B, 65B	2,048	1.0T to 1.4T	early 2023^[2]
Llama 2	Jul 2023	7B, 13B, 70B	4,096	2.0T	Sep 2022^[3]
Llama 3	Apr 2024	8B, 70B	8,192	15T+	Mar 2023 / Dec 2023^[4]
Llama 3.1	Jul 2024	8B, 70B, 405B	128,000	15T+	Dec 2023^[5]
Llama 3.2	Sep 2024	1B, 3B, 11B-V, 90B-V	128,000	up to 9T	Dec 2023^[6]
Llama 3.3	Dec 2024	70B Instruct	128,000	~15T	Dec 2023^[7]
Llama 4	Apr 2025	Scout (17B/16E), Maverick (17B/128E)	10M / 1M	~22T to 40T	Aug 2024^[8]

Llama 4 model card (April 2025)

Dated 5 April 2025, the Llama 4 card documents two checkpoints: Scout (17B activated parameters across 16 experts, 109B total) and Maverick (17B activated parameters across 128 experts, 400B total).^[8] Both use a mixture-of-experts (MoE) architecture with "early fusion for native multimodality," interleaving image and text tokens at the input layer.^[8]

The card lists 12 supported languages (Arabic, English, French, German, Hindi, Indonesian, Italian, Portuguese, Spanish, Tagalog, Thai, Vietnamese), an August 2024 cutoff, and context windows of 10M tokens (Scout) and 1M (Maverick). Pretraining: ~40T tokens for Scout, ~22T for Maverick. Combined compute is 7.38M H100-80GB GPU hours and an estimated 1,999 tCO2eq location-based. Outputs are restricted to multilingual text and code; image generation is not supported.^[8] The card describes a third model, Behemoth (reported at 288B activated, ~2T total, 30T+ training tokens), as a codistillation teacher; as of mid-2026 Meta has not shipped a Behemoth card with public weights.^[8]^[10]
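The compute and location-based emissions figures quoted in this article for Llama 2, 3, 3.1, and 4 can be put on a common scale. The sketch below computes tCO2eq per million GPU hours for each card. It is an illustrative comparison only: the cards use different GPU generations and Meta's estimation method may have changed between releases, so the ratios are not directly comparable measures of efficiency.

```python
# (million GPU hours, tCO2eq location-based) as quoted in this article.
CARD_FIGURES = {
    "Llama 2": (3.31, 539),
    "Llama 3": (1.3 + 6.4, 2290),
    "Llama 3.1": (39.3, 11390),
    "Llama 4": (7.38, 1999),
}

# Tonnes of CO2-equivalent per million GPU hours, per card.
intensity = {
    card: tonnes / gpu_hours_m
    for card, (gpu_hours_m, tonnes) in CARD_FIGURES.items()
}

for card, value in intensity.items():
    print(f"{card:10s} {value:6.1f} tCO2eq per million GPU hours")
```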

Standard sections across cards

Every Llama card keeps a recognizable skeleton:

Section	What it covers
Model details	Developer, version, dates, architecture, sizes, license, contact
Intended use	Primary uses, primary users, out-of-scope deployments
Hardware and software	Training infrastructure, GPU hours, energy, carbon emissions
Training data	Source mix, token count, knowledge cutoff, language coverage
Benchmarks and evaluation	Standard tasks, instruction-tuned scores, safety evaluations
Responsibility and safety	Llama Guard, Prompt Guard, Code Shield, red-teaming
Ethical considerations	Known biases, hallucinations, mitigation guidance
Citation	Suggested BibTeX entry
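A skeleton like the one above lends itself to a simple coverage check. The sketch below scans Markdown headings in a card and reports which of the listed sections have no matching heading. It is a hypothetical helper: the section labels are taken from the table, while actual heading wording varies between Llama cards, so the case-insensitive substring match is only a rough heuristic.

```python
import re

# Section labels from the table above.
STANDARD_SECTIONS = [
    "Model details",
    "Intended use",
    "Hardware and software",
    "Training data",
    "Benchmarks and evaluation",
    "Responsibility and safety",
    "Ethical considerations",
    "Citation",
]

# Matches ATX-style Markdown headings such as "## Intended Use".
HEADING_RE = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(card_markdown, expected=STANDARD_SECTIONS):
    """Return expected section labels with no matching heading in the card."""
    headings = [m.group(1).lower() for m in HEADING_RE.finditer(card_markdown)]
    return [s for s in expected if not any(s.lower() in h for h in headings)]


# Made-up card text used only to exercise the helper.
sample_card = """# Example card
## Model Details
## Intended Use
## Training Data
"""
missing = missing_sections(sample_card)
print(missing)
```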

From Llama 3 onward the cards point to Meta's Purple Llama project, which contains companion safety models, and state that Llama "is not designed to be deployed in isolation."^[13]

License terms in the cards

From Llama 2 forward, every card links to a Llama Community License Agreement, permissive for commercial and research use with two notable exceptions: a 700-million-monthly-active-user threshold above which the licensee must request a separate license from Meta, and a clause prohibiting use of Llama outputs to train other LLMs.^[11] The cards also link to an Acceptable Use Policy that prohibits use for weapons development, CSAM, election manipulation, unauthorized practice of regulated professions, and circumvention of Meta's safety measures.^[12]

Reception and criticism

The cards have been welcomed for following the Mitchell et al. structure and publishing concrete numbers (GPU hours, emissions, benchmark scores) that many vendors keep internal. They have also drawn criticism. From Llama 2 onward, the cards describe the training mix as "publicly available online data" without listing specific datasets. The Open Source Initiative's 2024 Open Source AI Definition requires training data documentation, and the OSI states that Llama is not open source by that standard.^[15] The Kadrey v. Meta lawsuit, filed in 2023, alleges Meta used a copy of the LibGen shadow library and stripped copyright headers, an accusation the cards do not address. The Free Software Foundation and several legal commentators argue that the Llama Community License is source-available, not an open source license, because of the 700M MAU and competitor-restriction clauses.^[14]

At the Llama 4 launch, The Verge and TechCrunch reported that the version submitted to LMSYS Chatbot Arena was an "experimental chat" build optimized for human preference scoring, not the released weights, which put the cards' role as verification documents under scrutiny. The cards have nonetheless been used as templates by other labs (Mistral, DeepSeek, and several Chinese open-weight projects ship cards in the Llama format) and function as a de facto baseline for frontier model disclosures.

Each card sits inside a broader stack of research paper, responsible use guide, acceptable use policy, community license, and the Purple Llama repository. Hugging Face also generates a derived card on each repository (for example, meta-llama/Llama-3.1-8B-Instruct) that copies the GitHub card and adds platform metadata; these derived cards are what most developers see in practice.^[17]

References

1. Touvron et al. *LLaMA: Open and Efficient Foundation Language Models*. Meta AI, Feb 2023. https://research.facebook.com/publications/llama-open-and-efficient-foundation-language-models/
2. Meta AI. *LLaMA Model Card (v1)*. https://github.com/meta-llama/llama/blob/llama_v1/MODEL_CARD.md
3. Meta AI. *Llama 2 Model Card*. https://github.com/meta-llama/llama-models/blob/main/models/llama2/MODEL_CARD.md
4. Meta AI. *Meta Llama 3 Model Card*. https://github.com/meta-llama/llama3/blob/main/MODEL_CARD.md
5. Meta AI. *Llama 3.1 Model Card*. https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md
6. Meta AI. *Llama 3.2 Model Card*. https://github.com/meta-llama/llama-models/blob/main/models/llama3_2/MODEL_CARD.md
7. Meta AI. *Llama 3.3 Model Card*. https://github.com/meta-llama/llama-models/blob/main/models/llama3_3/MODEL_CARD.md
8. Meta AI. *Llama 4 Model Card*. https://github.com/meta-llama/llama-models/blob/main/models/llama4/MODEL_CARD.md
9. Mitchell et al. *Model Cards for Model Reporting*. FAT* 2019. https://arxiv.org/abs/1810.03993
10. Meta AI. *The Llama 4 herd*. April 2025. https://ai.meta.com/blog/llama-4-multimodal-intelligence/
11. Meta AI. *Llama Community License Agreements (2, 3.1, 4)*. https://www.llama.com/llama4/license/
12. Meta AI. *Llama Acceptable Use Policy*. https://www.llama.com/use-policy/
13. Meta AI. *Purple Llama*. https://github.com/meta-llama/PurpleLlama
14. OpenSource Connections. *Is Llama 2 open source? No.* July 2023. https://opensourceconnections.com/blog/2023/07/19/is-llama-2-open-source-no-and-perhaps-we-need-a-new-definition-of-open/
15. Open Source Initiative. *The Open Source AI Definition (OSAID) 1.0*. Oct 2024. https://opensource.org/ai
16. Wikipedia. *Llama (language model)*. https://en.wikipedia.org/wiki/Llama_(language_model)
17. Hugging Face. *meta-llama organization*. https://huggingface.co/meta-llama
