LLM Comparisons

Compare different [[large language models]] ([[LLM]]s) below: a [[#Concise Comparison|concise comparison]], a [[#Detailed Comparison|detailed comparison]], and [[#Terms|definitions of the terms]] used in both tables.
{{see also|LLM Benchmarks Timeline|LLM Rankings}}
__TOC__
==Concise Comparison==
 
{| class="wikitable sortable"
! Model
! Creator
! Context Window
! Quality Index<br>(Normalized avg)
! Blended<br>(USD/1M Tokens)
! Median<br>(Tokens/s)
! Median<br>(First Chunk (s))
|-
| '''[[o1-preview]]''' || [[OpenAI]] || 128k || 86 || $27.56 || 143.7 || 21.33
|-
| '''[[o1-mini]]''' || [[OpenAI]] || 128k || 84 || $5.25 || 213.2 || 11.27
|-
| '''[[GPT-4o (Aug '24)]]''' || [[OpenAI]] || 128k || 78 || $4.38 || 83.5 || 0.67
|-
| '''[[GPT-4o (May '24)]]''' || [[OpenAI]] || 128k || 78 || $7.50 || 106.3 || 0.65
|-
| '''[[GPT-4o mini]]''' || [[OpenAI]] || 128k || 73 || $0.26 || 113.8 || 0.64
|-
| '''[[GPT-4o (Nov '24)]]''' || [[OpenAI]] || 128k || 73 || $4.38 || 116.4 || 0.39
|-
| '''[[GPT-4o mini Realtime (Dec '24)]]''' || [[OpenAI]] || 128k || || $0.00 || ||
|-
| '''[[GPT-4o Realtime (Dec '24)]]''' || [[OpenAI]] || 128k || || $0.00 || ||
|-
| '''[[Llama 3.3 70B]]''' || [[Meta]] || 128k || 74 || $0.69 || 71.8 || 0.49
|-
| '''[[Llama 3.1 405B]]''' || [[Meta]] || 128k || 74 || $3.50 || 30.2 || 0.71
|-
| '''[[Llama 3.1 70B]]''' || [[Meta]] || 128k || 68 || $0.72 || 72.8 || 0.44
|-
| '''[[Llama 3.2 90B (Vision)]]''' || [[Meta]] || 128k || 68 || $0.81 || 48.9 || 0.33
|-
| '''[[Llama 3.2 11B (Vision)]]''' || [[Meta]] || 128k || 54 || $0.18 || 131.2 || 0.28
|-
| '''[[Llama 3.1 8B]]''' || [[Meta]] || 128k || 54 || $0.10 || 184.9 || 0.33
|-
| '''[[Llama 3.2 3B]]''' || [[Meta]] || 128k || 49 || $0.06 || 201.4 || 0.38
|-
| '''[[Llama 3.2 1B]]''' || [[Meta]] || 128k || 26 || $0.04 || 468.6 || 0.37
|-
| '''[[Gemini 2.0 Flash (exp)]]''' || [[Google]] || 1m || 82 || $0.00 || 169.0 || 0.48
|-
| '''[[Gemini 1.5 Pro (Sep)]]''' || [[Google]] || 2m || 80 || $2.19 || 60.8 || 0.74
|-
| '''[[Gemini 1.5 Flash (Sep)]]''' || [[Google]] || 1m || 72 || $0.13 || 188.4 || 0.25
|-
| '''[[Gemma 2 27B]]''' || [[Google]] || 8k || 61 || $0.26 || 59.4 || 0.48
|-
| '''[[Gemma 2 9B]]''' || [[Google]] || 8k || 55 || $0.12 || 168.9 || 0.36
|-
| '''[[Gemini 1.5 Flash (May)]]''' || [[Google]] || 1m || || $0.13 || 310.6 || 0.29
|-
| '''[[Gemini Experimental (Nov)]]''' || [[Google]] || 2m || || $0.00 || 53.9 || 1.12
|-
| '''[[Gemini 1.5 Pro (May)]]''' || [[Google]] || 2m || || $2.19 || 66.9 || 0.49
|-
| '''[[Gemini 1.5 Flash-8B]]''' || [[Google]] || 1m || || $0.07 || 279.7 || 0.38
|-
| '''[[Claude 3.5 Sonnet (Oct)]]''' || [[Anthropic]] || 200k || 80 || $6.00 || 72.0 || 0.99
|-
| '''[[Claude 3.5 Sonnet (June)]]''' || [[Anthropic]] || 200k || 76 || $6.00 || 61.5 || 0.87
|-
| '''[[Claude 3 Opus]]''' || [[Anthropic]] || 200k || 70 || $30.00 || 25.9 || 2.00
|-
| '''[[Claude 3.5 Haiku]]''' || [[Anthropic]] || 200k || 68 || $1.60 || 65.1 || 0.71
|-
| '''[[Claude 3 Haiku]]''' || [[Anthropic]] || 200k || 55 || $0.50 || 121.6 || 0.72
|-
| '''[[Pixtral Large]]''' || [[Mistral]] || 128k || 74 || $3.00 || 36.5 || 0.40
|-
| '''[[Mistral Large 2 (Jul '24)]]''' || [[Mistral]] || 128k || 74 || $3.00 || 31.1 || 0.50
|-
| '''[[Mistral Large 2 (Nov '24)]]''' || [[Mistral]] || 128k || 74 || $3.00 || 37.4 || 0.52
|-
| '''[[Mistral Small (Sep '24)]]''' || [[Mistral]] || 33k || 61 || $0.30 || 61.5 || 0.32
|-
| '''[[Mixtral 8x22B]]''' || [[Mistral]] || 65k || 61 || $1.20 || 85.1 || 0.57
|-
| '''[[Pixtral 12B]]''' || [[Mistral]] || 128k || 56 || $0.13 || 70.3 || 0.37
|-
| '''[[Ministral 8B]]''' || [[Mistral]] || 128k || 56 || $0.10 || 136.1 || 0.30
|-
| '''[[Mistral NeMo]]''' || [[Mistral]] || 128k || 54 || $0.09 || 122.5 || 0.48
|-
| '''[[Ministral 3B]]''' || [[Mistral]] || 128k || 53 || $0.04 || 168.5 || 0.29
|-
| '''[[Mixtral 8x7B]]''' || [[Mistral]] || 33k || 41 || $0.50 || 110.6 || 0.36
|-
| '''[[Codestral-Mamba]]''' || [[Mistral]] || 256k || 33 || $0.25 || 95.8 || 0.44
|-
| '''[[Command-R+]]''' || [[Cohere]] || 128k || 55 || $5.19 || 50.7 || 0.47
|-
| '''[[Command-R+ (Apr '24)]]''' || [[Cohere]] || 128k || 45 || $6.00 || 49.3 || 0.51
|-
| '''[[Command-R (Mar '24)]]''' || [[Cohere]] || 128k || 36 || $0.75 || 108.1 || 0.36
|-
| '''[[Aya Expanse 8B]]''' || [[Cohere]] || 8k || || $0.75 || 165.5 || 0.16
|-
| '''[[Command-R]]''' || [[Cohere]] || 128k || || $0.51 || 111.8 || 0.32
|-
| '''[[Aya Expanse 32B]]''' || [[Cohere]] || 128k || || $0.75 || 120.4 || 0.18
|-
| '''[[Sonar 3.1 Small]]''' || [[Perplexity]] || 127k || || $0.20 || 203.8 || 0.34
|-
| '''[[Sonar 3.1 Large]]''' || [[Perplexity]] || 127k || || $1.00 || 57.7 || 0.31
|-
| '''[[Grok Beta]]''' || [[xAI]] || 128k || 72 || $7.50 || 66.7 || 0.42
|-
| '''[[Nova Pro]]''' || [[Amazon]] || 300k || 75 || $1.40 || 91.0 || 0.38
|-
| '''[[Nova Lite]]''' || [[Amazon]] || 300k || 70 || $0.10 || 148.0 || 0.33
|-
| '''[[Nova Micro]]''' || [[Amazon]] || 130k || 66 || $0.06 || 195.5 || 0.33
|-
| '''[[Phi-4]]''' || [[Microsoft Azure]] || 16k || 77 || $0.09 || 85.0 || 0.22
|-
| '''[[Phi-3 Mini]]''' || [[Microsoft Azure]] || 4k || || $0.00 || ||
|-
| '''[[Phi-3 Medium 14B]]''' || [[Microsoft Azure]] || 128k || || $0.30 || 50.4 || 0.43
|-
| '''[[Solar Mini]]''' || [[Upstage]] || 4k || 47 || $0.15 || 89.3 || 1.13
|-
| '''[[DBRX]]''' || [[Databricks]] || 33k || 46 || $1.16 || 78.3 || 0.42
|-
| '''[[Llama 3.1 Nemotron 70B]]''' || [[NVIDIA]] || 128k || 72 || $0.27 || 48.3 || 0.57
|-
| '''[[Reka Flash]]''' || [[Reka AI]] || 128k || 59 || $0.35 || ||
|-
| '''[[Reka Core]]''' || [[Reka AI]] || 128k || 58 || $2.00 || ||
|-
| '''[[Reka Flash (Feb '24)]]''' || [[Reka AI]] || 128k || 46 || $0.35 || ||
|-
| '''[[Reka Edge]]''' || [[Reka AI]] || 128k || 31 || $0.10 || ||
|-
| '''[[Jamba 1.5 Large]]''' || [[AI21 Labs]] || 256k || 64 || $3.50 || 51.0 || 0.71
|-
| '''[[Jamba 1.5 Mini]]''' || [[AI21 Labs]] || 256k || || $0.25 || 83.7 || 0.48
|-
| '''[[DeepSeek V3]]''' || [[DeepSeek]] || 128k || 80 || $0.90 || 20.9 || 0.94
|-
| '''[[DeepSeek-V2.5 (Dec '24)]]''' || [[DeepSeek]] || 128k || 72 || $0.17 || 61.8 || 1.15
|-
| '''[[DeepSeek-Coder-V2]]''' || [[DeepSeek]] || 128k || 71 || $0.17 || 62.0 || 1.11
|-
| '''[[DeepSeek-V2.5]]''' || [[DeepSeek]] || 128k || || $1.09 || 7.6 || 0.77
|-
| '''[[DeepSeek-V2]]''' || [[DeepSeek]] || 128k || || $0.17 || ||
|-
| '''[[Arctic]]''' || [[Snowflake]] || 4k || 51 || $0.00 || ||
|-
| '''[[Qwen2.5 72B]]''' || [[Alibaba]] || 131k || 77 || $0.40 || 67.6 || 0.53
|-
| '''[[Qwen2.5 Coder 32B]]''' || [[Alibaba]] || 131k || 72 || $0.80 || 84.0 || 0.38
|-
| '''[[Qwen2 72B]]''' || [[Alibaba]] || 131k || 72 || $0.63 || 46.5 || 0.30
|-
| '''[[QwQ 32B-Preview]]''' || [[Alibaba]] || 33k || 46 || $0.26 || 67.3 || 0.40
|-
| '''[[Yi-Large]]''' || [[01.AI]] || 32k || 61 || $3.00 || 68.1 || 0.47
|-
| '''[[GPT-4 Turbo]]''' || [[OpenAI]] || 128k || 75 || $15.00 || 43.3 || 1.20
|-
| '''[[GPT-4]]''' || [[OpenAI]] || 8k || || $37.50 || 28.4 || 0.75
|-
| '''[[Llama 3 70B]]''' || [[Meta]] || 8k || 48 || $0.89 || 48.9 || 0.38
|-
| '''[[Llama 3 8B]]''' || [[Meta]] || 8k || 45 || $0.15 || 117.3 || 0.34
|-
| '''[[Llama 2 Chat 70B]]''' || [[Meta]] || 4k || || $1.85 || ||
|-
| '''[[Llama 2 Chat 13B]]''' || [[Meta]] || 4k || || $0.00 || ||
|-
| '''[[Llama 2 Chat 7B]]''' || [[Meta]] || 4k || || $0.33 || 123.7 || 0.37
|-
| '''[[Gemini 1.0 Pro]]''' || [[Google]] || 33k || || $0.75 || 102.9 || 1.27
|-
| '''[[Claude 3 Sonnet]]''' || [[Anthropic]] || 200k || 57 || $6.00 || 68.2 || 0.76
|-
| '''[[Claude 2.1]]''' || [[Anthropic]] || 200k || || $12.00 || 14.1 || 1.24
|-
| '''[[Claude 2.0]]''' || [[Anthropic]] || 100k || || $12.00 || 29.9 || 0.81
|-
| '''[[Mistral Small (Feb '24)]]''' || [[Mistral]] || 33k || 59 || $1.50 || 53.5 || 0.37
|-
| '''[[Mistral Large (Feb '24)]]''' || [[Mistral]] || 33k || 56 || $6.00 || 38.9 || 0.44
|-
| '''[[Mistral 7B]]''' || [[Mistral]] || 8k || 28 || $0.16 || 112.5 || 0.26
|-
| '''[[Mistral Medium]]''' || [[Mistral]] || 33k || || $4.09 || 44.7 || 0.37
|-
| '''[[Codestral]]''' || [[Mistral]] || 33k || || $0.30 || 84.9 || 0.28
|-
| '''[[OpenChat 3.5]]''' || [[OpenChat]] || 8k || 44 || $0.06 || 73.5 || 0.30
|-
| '''[[Jamba Instruct]]''' || [[AI21 Labs]] || 256k || || $0.55 || 77.4 || 0.52
|}
 
==Detailed Comparison==
{| class="wikitable sortable"
! Model
! Creator
! License
! Context Window
! Quality Index
! Chatbot Arena
! MMLU
! GPQA
! MATH-500
! HumanEval
! Blended
! USD/1M Tokens
! Input Price
! Output Price
! Median Tokens/s
! P5 Tokens/s
! P25 Tokens/s
! P75 Tokens/s
! P95 Tokens/s
! Median First Chunk (s)
! P5 First Chunk (s)
! P25 First Chunk (s)
! P75 First Chunk (s)
! P95 First Chunk (s)
! Further Analysis
|-
| '''[[o1-preview]]'''
| OpenAI logo
| Proprietary
| 128k
| 86
| 1334
| 0.91
| 0.67
| 0.92
| 0.96
| $27.56
| $15.75
| $63.00
| 143.8
| 68.9
| 121.6
| 164.6
| 179.6
| 21.28
| 13.40
| 17.04
| 27.80
| 46.49
| –
| –
| Model Providers
|-
| '''[[o1-mini]]'''
| OpenAI logo
| Proprietary
| 128k
| 84
| 1308
| 0.85
| 0.58
| 0.95
| 0.97
| $5.25
| $3.00
| $12.00
| 213.6
| 84.0
| 154.8
| 238.0
| 299.4
| 11.75
| 2.44
| 9.40
| 14.43
| 24.03
| –
| –
| Model Providers
|-
| '''[[GPT-4o (Aug '24)]]'''
| OpenAI logo
| Proprietary
| 128k
| 78
| 1337
| 0.89
| 0.51
| 0.80
| 0.93
| –
| –
| $4.38
| $2.50
| $10.00
| 85.6
| 40.3
| 61.5
| 109.3
| 143.6
| 0.66
| 0.33
| 0.43
| 0.91
| 1.92
| Model Providers
|-
| '''[[GPT-4o (May '24)]]'''
| OpenAI logo
| Proprietary
| 128k
| 78
| 1285
| 0.87
| 0.51
| 0.79
| 0.93
| –
| –
| $7.50
| $5.00
| $15.00
| 106.8
| 53.2
| 82.2
| 126.8
| 142.5
| 0.65
| 0.32
| 0.43
| 0.73
| 1.22
| Model Providers
|-
| '''[[GPT-4o mini]]'''
| OpenAI logo
| Proprietary
| 128k
| 73
| 1273
| 0.82
| 0.44
| 0.79
| 0.88
| –
| –
| $0.26
| $0.15
| $0.60
| 121.8
| 50.7
| 74.1
| 179.4
| 206.5
| 0.65
| 0.30
| 0.39
| 0.77
| 0.92
| Model Providers
|-
| '''[[GPT-4o (Nov '24)]]'''
| OpenAI logo
| Proprietary
| 128k
| 73
| 1361
| 0.86
| 0.39
| 0.74
| 0.93
| –
| –
| $4.38
| $2.50
| $10.00
| 115.1
| 71.7
| 95.0
| 140.0
| 165.5
| 0.38
| 0.27
| 0.32
| 0.52
| 0.75
| Model Providers
|-
| '''[[GPT-4o mini Realtime (Dec '24)]]'''
| OpenAI logo
| Proprietary
| 128k
| –
| –
| –
| –
| –
| –
| –
| –
| $0.00
| $0.00
| $0.00
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[GPT-4o Realtime (Dec '24)]]'''
| OpenAI logo
| Proprietary
| 128k
| –
| –
| –
| –
| –
| –
| –
| –
| $0.00
| $0.00
| $0.00
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Llama 3.3 70B]]'''
| Meta logo
| Open
| 128k
| 74
| –
| 0.86
| 0.49
| 0.76
| 0.86
| –
| –
| $0.67
| $0.59
| $0.73
| 67.2
| 23.6
| 31.2
| 275.7
| 2046.5
| 0.51
| 0.23
| 0.36
| 0.72
| 1.48
| Model Providers
|-
| '''[[Llama 3.1 405B]]'''
| Meta logo
| Open
| 128k
| 74
| 1266
| 0.87
| 0.50
| 0.70
| 0.87
| –
| –
| $3.50
| $3.50
| $3.50
| 30.2
| 12.2
| 21.8
| 66.3
| 160.3
| 0.72
| 0.37
| 0.52
| 0.89
| 2.02
| Model Providers
|-
| '''[[Llama 3.1 70B]]'''
| Meta logo
| Open
| 128k
| 68
| 1249
| 0.84
| 0.43
| 0.64
| 0.80
| –
| –
| $0.70
| $0.60
| $0.80
| 72.6
| 27.9
| 42.2
| 218.5
| 1971.9
| 0.44
| 0.24
| 0.34
| 0.60
| 1.00
| Model Providers
|-
| '''[[Llama 3.2 90B (Vision)]]'''
| Meta logo
| Open
| 128k
| 68
| –
| 0.84
| 0.42
| 0.65
| 0.80
| –
| –
| $0.81
| $0.81
| $0.81
| 48.6
| 30.4
| 35.9
| 66.6
| 272.3
| 0.34
| 0.18
| 0.26
| 0.47
| 0.61
| Model Providers
|-
| '''[[Llama 3.2 11B (Vision)]]'''
| Meta logo
| Open
| 128k
| 54
| –
| 0.71
| 0.25
| 0.50
| 0.69
| –
| –
| $0.18
| $0.18
| $0.18
| 131.0
| 35.9
| 118.2
| 165.4
| 752.4
| 0.29
| 0.17
| 0.23
| 0.36
| 0.56
| Model Providers
|-
| '''[[Llama 3.1 8B]]'''
| Meta logo
| Open
| 128k
| 54
| 1172
| 0.71
| 0.26
| 0.50
| 0.68
| –
| –
| $0.10
| $0.10
| $0.10
| 182.8
| 49.5
| 118.8
| 468.1
| 2161.5
| 0.34
| 0.17
| 0.27
| 0.47
| 0.73
| Model Providers
|-
| '''[[Llama 3.2 3B]]'''
| Meta logo
| Open
| 128k
| 49
| 1103
| 0.64
| 0.21
| 0.50
| 0.60
| –
| –
| $0.06
| $0.06
| $0.06
| 202.2
| 42.4
| 144.0
| 543.6
| 1623.1
| 0.38
| 0.15
| 0.26
| 0.49
| 0.93
| Model Providers
|-
| '''[[Llama 3.2 1B]]'''
| Meta logo
| Open
| 128k
| 26
| 1054
| 0.35
| 0.14
| 0.13
| 0.40
| –
| –
| $0.04
| $0.04
| $0.04
| 315.4
| 179.4
| 261.4
| 2053.8
| 3394.4
| 0.44
| 0.20
| 0.30
| 0.51
| 0.74
| Model Providers
|-
| '''[[Gemini 2.0 Flash (exp)]]'''
| Google logo
| Proprietary
| 1m
| 82
| –
| 0.87
| 0.59
| 0.90
| 0.91
| –
| –
| $0.00
| $0.00
| $0.00
| 168.8
| 162.4
| 165.5
| 171.4
| 174.6
| 0.48
| 0.41
| 0.45
| 0.53
| 0.73
| Model Providers
|-
| '''[[Gemini 1.5 Pro (Sep)]]'''
| Google logo
| Proprietary
| 2m
| 80
| 1301
| 0.86
| 0.59
| 0.88
| 0.88
| –
| –
| $2.19
| $1.25
| $5.00
| 60.9
| 55.3
| 59.0
| 63.7
| 66.1
| 0.74
| 0.36
| 0.39
| 0.80
| 0.91
| Model Providers
|-
| '''[[Gemini 1.5 Flash (Sep)]]'''
| Google logo
| Proprietary
| 1m
| 72
| 1271
| 0.75
| 0.45
| 0.83
| 0.83
| –
| –
| $0.13
| $0.07
| $0.30
| 188.5
| 172.8
| 182.0
| 194.7
| 204.0
| 0.25
| 0.19
| 0.21
| 0.44
| 0.49
| Model Providers
|-
| '''[[Gemma 2 27B]]'''
| Google logo
| Open
| 8k
| 61
| 1219
| 0.77
| 0.39
| 0.54
| 0.76
| –
| –
| $0.26
| $0.17
| $0.51
| 58.7
| 32.0
| 46.3
| 67.8
| 74.4
| 0.51
| 0.19
| 0.32
| 1.53
| 5.94
| Model Providers
|-
| '''[[Gemma 2 9B]]'''
| Google logo
| Open
| 8k
| 55
| 1190
| 0.73
| 0.31
| 0.50
| 0.64
| –
| –
| $0.12
| $0.12
| $0.15
| 168.9
| 53.4
| 114.8
| 187.3
| 673.7
| 0.37
| 0.22
| 0.27
| 0.50
| 0.65
| Model Providers
|-
| '''[[Gemini 1.5 Flash (May)]]'''
| Google logo
| Proprietary
| 1m
| –
| 1227
| 0.79
| 0.39
| 0.55
| –
| –
| –
| $0.13
| $0.07
| $0.30
| 310.0
| 276.8
| 297.5
| 325.0
| 350.4
| 0.30
| 0.23
| 0.27
| 0.33
| 0.39
| Model Providers
|-
| '''[[Gemini Experimental (Nov)]]'''
| Google logo
| Proprietary
| 2m
| –
| 1365
| –
| –
| –
| –
| –
| –
| $0.00
| $0.00
| $0.00
| 53.8
| 51.1
| 52.6
| 55.1
| 56.4
| 1.12
| 0.78
| 0.90
| 1.82
| 3.31
| Model Providers
|-
| '''[[Gemini 1.5 Pro (May)]]'''
| Google logo
| Proprietary
| 2m
| –
| 1260
| 0.86
| 0.46
| 0.66
| –
| –
| –
| $2.19
| $1.25
| $5.00
| 66.9
| 62.7
| 64.6
| 68.4
| 70.2
| 0.50
| 0.38
| 0.42
| 0.81
| 0.88
| Model Providers
|-
| '''[[Gemini 1.5 Flash-8B]]'''
| Google logo
| Proprietary
| 1m
| –
| 1211
| 0.75
| 0.30
| 0.70
| –
| –
| –
| $0.07
| $0.04
| $0.15
| 279.0
| 226.9
| 260.1
| 288.8
| 300.0
| 0.39
| 0.27
| 0.35
| 0.46
| 0.59
| Model Providers
|-
| '''[[Claude 3.5 Sonnet (Oct)]]'''
| Anthropic logo
| Proprietary
| 200k
| 80
| 1282
| 0.89
| 0.58
| 0.76
| 0.96
| –
| –
| $6.00
| $3.00
| $15.00
| 71.8
| 37.6
| 44.8
| 78.0
| 89.6
| 0.98
| 0.68
| 0.78
| 1.36
| 2.23
| Model Providers
|-
| '''[[Claude 3.5 Sonnet (June)]]'''
| Anthropic logo
| Proprietary
| 200k
| 76
| 1268
| 0.88
| 0.56
| 0.71
| 0.90
| –
| –
| $6.00
| $3.00
| $15.00
| 61.4
| 41.6
| 49.9
| 78.9
| 91.0
| 0.87
| 0.68
| 0.75
| 1.06
| 1.45
| Model Providers
|-
| '''[[Claude 3 Opus]]'''
| Anthropic logo
| Proprietary
| 200k
| 70
| 1248
| 0.84
| 0.50
| 0.64
| 0.83
| –
| –
| $30.00
| $15.00
| $75.00
| 25.9
| 20.8
| 24.0
| 28.2
| 30.2
| 2.01
| 1.31
| 1.54
| 3.09
| 3.78
| Model Providers
|-
| '''[[Claude 3.5 Haiku]]'''
| Anthropic logo
| Proprietary
| 200k
| 68
| –
| 0.81
| 0.37
| 0.67
| 0.87
| –
| –
| $1.60
| $0.80
| $4.00
| 65.1
| 51.1
| 58.6
| 75.4
| 105.1
| 0.71
| 0.54
| 0.64
| 0.93
| 1.20
| Model Providers
|-
| '''[[Claude 3 Haiku]]'''
| Anthropic logo
| Proprietary
| 200k
| 55
| 1179
| 0.71
| 0.33
| 0.39
| 0.77
| –
| –
| $0.50
| $0.25
| $1.25
| 122.6
| 97.6
| 112.0
| 134.7
| 152.7
| 0.72
| 0.35
| 0.43
| 0.79
| 1.06
| Model Providers
|-
| '''[[Pixtral Large]]'''
| Mistral logo
| Open
| 128k
| 74
| –
| 0.85
| 0.52
| 0.71
| 0.88
| –
| –
| $3.00
| $2.00
| $6.00
| 36.6
| 18.9
| 33.6
| 39.3
| 41.1
| 0.40
| 0.33
| 0.34
| 0.51
| 2.22
| Model Providers
|-
| '''[[Mistral Large 2 (Jul '24)]]'''
| Mistral logo
| Open
| 128k
| 74
| 1251
| 0.85
| 0.48
| 0.71
| 0.91
| –
| –
| $3.00
| $2.00
| $6.00
| 31.2
| 25.8
| 29.2
| 33.8
| 35.4
| 0.50
| 0.38
| 0.45
| 0.55
| 0.89
| Model Providers
|-
| '''[[Mistral Large 2 (Nov '24)]]'''
| Mistral logo
| Open
| 128k
| 74
| –
| 0.85
| 0.47
| 0.72
| 0.90
| –
| –
| $3.00
| $2.00
| $6.00
| 37.4
| 32.2
| 36.0
| 41.0
| 66.3
| 0.52
| 0.33
| 0.45
| 0.56
| 0.73
| Model Providers
|-
| '''[[Mistral Small (Sep '24)]]'''
| Mistral logo
| Open
| 33k
| 61
| –
| 0.74
| 0.34
| 0.56
| 0.81
| –
| –
| $0.30
| $0.20
| $0.60
| 62.0
| 50.6
| 57.0
| 68.2
| 77.1
| 0.32
| 0.23
| 0.25
| 0.52
| 1.50
| Model Providers
|-
| '''[[Mixtral 8x22B]]'''
| Mistral logo
| Open
| 65k
| 61
| 1148
| 0.76
| 0.37
| 0.56
| 0.74
| –
| –
| $1.20
| $1.20
| $1.20
| 85.6
| 57.2
| 69.9
| 96.6
| 102.5
| 0.57
| 0.26
| 0.34
| 0.65
| 0.87
| Model Providers
|-
| '''[[Pixtral 12B]]'''
| Mistral logo
| Open
| 128k
| 56
| –
| 0.69
| 0.30
| 0.46
| 0.79
| –
| –
| $0.13
| $0.13
| $0.13
| 70.5
| 61.6
| 66.8
| 77.3
| 81.0
| 0.37
| 0.26
| 0.31
| 0.48
| 0.74
| Model Providers
|-
| '''[[Ministral 8B]]'''
| Mistral logo
| Open
| 128k
| 56
| 1183
| 0.59
| 0.30
| 0.57
| 0.79
| –
| –
| $0.10
| $0.10
| $0.10
| 135.9
| 121.0
| 131.3
| 138.5
| 141.5
| 0.29
| 0.23
| 0.25
| 0.34
| 0.60
| Model Providers
|-
| '''[[Mistral NeMo]]'''
| Mistral logo
| Open
| 128k
| 54
| –
| 0.66
| 0.33
| 0.44
| 0.71
| –
| –
| $0.09
| $0.06
| $0.14
| 124.0
| 51.8
| 76.5
| 158.8
| 163.6
| 0.48
| 0.18
| 0.25
| 0.55
| 0.71
| Model Providers
|-
| '''[[Ministral 3B]]'''
| Mistral logo
| Proprietary
| 128k
| 53
| –
| 0.58
| 0.26
| 0.54
| 0.74
| –
| –
| $0.04
| $0.04
| $0.04
| 168.3
| 147.8
| 163.9
| 173.1
| 177.7
| 0.29
| 0.22
| 0.26
| 0.34
| 0.55
| Model Providers
|-
| '''[[Mixtral 8x7B]]'''
| Mistral logo
| Open
| 33k
| 41
| 1114
| 0.63
| 0.30
| 0.31
| 0.38
| –
| –
| $0.50
| $0.45
| $0.50
| 110.1
| 56.9
| 89.9
| 149.1
| 550.3
| 0.36
| 0.21
| 0.29
| 0.48
| 0.69
| Model Providers
|-
| '''[[Codestral-Mamba]]'''
| Mistral logo
| Open
| 256k
| 33
| –
| 0.25
| 0.03
| 0.24
| 0.79
| –
| –
| $0.25
| $0.25
| $0.25
| 95.8
| 89.5
| 92.8
| 96.9
| 98.3
| 0.44
| 0.38
| 0.41
| 0.48
| 0.68
| Model Providers
|-
| '''[[Command-R+]]'''
| Cohere logo
| Open
| 128k
| 55
| 1215
| 0.75
| 0.34
| 0.40
| 0.71
| –
| –
| $5.19
| $2.75
| $12.50
| 50.7
| 45.4
| 47.5
| 73.9
| 79.7
| 0.47
| 0.23
| 0.27
| 0.51
| 0.64
| Model Providers
|-
| '''[[Command-R+ (Apr '24)]]'''
| Cohere logo
| Open
| 128k
| 45
| 1190
| 0.68
| 0.24
| 0.27
| 0.62
| –
| –
| $6.00
| $3.00
| $15.00
| 49.3
| 45.6
| 47.4
| 64.9
| 77.2
| 0.51
| 0.25
| 0.34
| 0.58
| 0.64
| Model Providers
|-
| '''[[Command-R (Mar '24)]]'''
| Cohere logo
| Open
| 128k
| 36
| 1149
| 0.59
| 0.26
| 0.16
| 0.44
| –
| –
| $0.75
| $0.50
| $1.50
| 108.2
| 75.9
| 80.3
| 167.6
| 178.4
| 0.36
| 0.15
| 0.25
| 0.44
| 0.50
| Model Providers
|-
| '''[[Aya Expanse 8B]]'''
| Cohere logo
| Open
| 8k
| –
| –
| –
| –
| –
| –
| –
| –
| $0.75
| $0.50
| $1.50
| 165.6
| 157.5
| 162.0
| 169.3
| 173.7
| 0.16
| 0.12
| 0.14
| 0.21
| 0.32
| Model Providers
|-
| '''[[Command-R]]'''
| Cohere logo
| Open
| 128k
| –
| 1179
| 0.67
| 0.27
| –
| 0.70
| –
| –
| $0.51
| $0.33
| $1.05
| 111.8
| 102.7
| 107.6
| 117.7
| 124.9
| 0.32
| 0.15
| 0.21
| 0.36
| 0.39
| Model Providers
|-
| '''[[Aya Expanse 32B]]'''
| Cohere logo
| Open
| 128k
| –
| 1207
| 0.67
| –
| –
| –
| –
| –
| $0.75
| $0.50
| $1.50
| 120.3
| 114.9
| 118.4
| 123.3
| 127.5
| 0.18
| 0.15
| 0.16
| 0.25
| 0.32
| Model Providers
|-
| '''[[Sonar 3.1 Small]]'''
| Perplexity logo
| Proprietary
| 127k
| –
| –
| –
| –
| –
| –
| –
| –
| $0.20
| $0.20
| $0.20
| 203.8
| 182.9
| 201.0
| 205.5
| 206.9
| 0.35
| 0.29
| 0.30
| 0.37
| 0.47
| Model Providers
|-
| '''[[Sonar 3.1 Large]]'''
| Perplexity logo
| Proprietary
| 127k
| –
| –
| –
| –
| –
| –
| –
| –
| $1.00
| $1.00
| $1.00
| 57.8
| 45.7
| 54.0
| 60.8
| 64.6
| 0.31
| 0.29
| 0.30
| 0.36
| 0.46
| Model Providers
|-
| '''[[Grok Beta]]'''
| xAI logo
| Proprietary
| 128k
| 72
| 1289
| 0.85
| 0.43
| 0.73
| 0.87
| –
| –
| $7.50
| $5.00
| $15.00
| 66.5
| 56.9
| 64.4
| 67.6
| 68.8
| 0.42
| 0.34
| 0.38
| 0.47
| 0.56
| Model Providers
|-
| '''[[Nova Pro]]'''
| Amazon logo
| Proprietary
| 300k
| 75
| –
| 0.84
| 0.48
| 0.79
| 0.88
| –
| –
| $1.40
| $0.80
| $3.20
| 91.3
| 77.6
| 82.4
| 96.4
| 102.5
| 0.38
| 0.35
| 0.37
| 0.39
| 0.42
| Model Providers
|-
| '''[[Nova Lite]]'''
| Amazon logo
| Proprietary
| 300k
| 70
| –
| 0.79
| 0.43
| 0.75
| 0.84
| –
| –
| $0.10
| $0.06
| $0.24
| 148.0
| 126.4
| 134.8
| 156.7
| 165.5
| 0.33
| 0.30
| 0.32
| 0.35
| 0.38
| Model Providers
|-
| '''[[Nova Micro]]'''
| Amazon logo
| Proprietary
| 130k
| 66
| –
| 0.76
| 0.38
| 0.69
| 0.80
| –
| –
| $0.06
| $0.04
| $0.14
| 195.8
| 170.9
| 186.0
| 208.3
| 219.5
| 0.33
| 0.30
| 0.32
| 0.35
| 0.39
| Model Providers
|-
| '''[[Phi-4]]'''
| Microsoft Azure logo
| Open
| 16k
| 77
| –
| 0.85
| 0.57
| 0.81
| 0.87
| –
| –
| $0.09
| $0.07
| $0.14
| 85.1
| 76.2
| 82.0
| 85.4
| 85.6
| 0.21
| 0.16
| 0.18
| 0.23
| 0.25
| Model Providers
|-
| '''[[Phi-3 Mini]]'''
| Microsoft Azure logo
| Open
| 4k
| –
| 1037
| –
| –
| –
| –
| –
| –
| $0.00
| $0.00
| $0.00
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Phi-3 Medium 14B]]'''
| Microsoft Azure logo
| Open
| 128k
| –
| 1123
| –
| –
| –
| –
| –
| –
| $0.30
| $0.17
| $0.68
| 50.8
| 17.6
| 44.9
| 52.7
| 54.2
| 0.43
| 0.38
| 0.42
| 0.45
| 0.49
| Model Providers
|-
| '''[[Solar Mini]]'''
| Upstage logo
| Open
| 4k
| 47
| 1062
| 0.66
| 0.28
| 0.33
| 0.59
| –
| –
| $0.15
| $0.15
| $0.15
| 89.1
| 84.3
| 87.7
| 90.5
| 92.5
| 1.12
| 1.07
| 1.12
| 1.14
| 1.42
| Model Providers
|-
| '''[[DBRX]]'''
| Databricks logo
| Open
| 33k
| 46
| 1103
| 0.70
| 0.31
| 0.28
| 0.54
| –
| –
| $1.16
| $0.97
| $1.73
| 74.2
| 50.2
| 68.0
| 82.9
| 83.1
| 0.44
| 0.27
| 0.32
| 0.51
| 0.71
| Model Providers
|-
| '''[[Llama 3.1 Nemotron 70B]]'''
| NVIDIA logo
| Open
| 128k
| 72
| 1269
| 0.86
| 0.48
| 0.73
| 0.81
| –
| –
| $0.27
| $0.23
| $0.40
| 48.3
| 27.4
| 44.3
| 69.8
| 71.0
| 0.57
| 0.23
| 0.32
| 0.64
| 0.79
| Model Providers
|-
| '''[[Reka Flash]]'''
| Reka AI logo
| Proprietary
| 128k
| 59
| –
| 0.73
| 0.34
| 0.53
| 0.74
| –
| –
| $0.35
| $0.20
| $0.80
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Reka Core]]'''
| Reka AI logo
| Proprietary
| 128k
| 58
| 1199
| 0.76
| 0.28
| 0.56
| 0.73
| –
| –
| $2.00
| $2.00
| $2.00
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Reka Flash (Feb '24)]]'''
| Reka AI logo
| Proprietary
| 128k
| 46
| 1165
| 0.65
| 0.27
| 0.33
| 0.61
| –
| –
| $0.35
| $0.20
| $0.80
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Reka Edge]]'''
| Reka AI logo
| Proprietary
| 128k
| 31
| –
| 0.44
| 0.19
| 0.22
| 0.41
| –
| –
| $0.10
| $0.10
| $0.10
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Jamba 1.5 Large]]'''
| AI21 Labs logo
| Open
| 256k
| 64
| 1221
| 0.80
| 0.41
| 0.61
| 0.74
| –
| –
| $3.50
| $2.00
| $8.00
| 50.8
| 43.9
| 49.9
| 51.8
| 58.4
| 0.72
| 0.49
| 0.68
| 0.74
| 0.81
| Model Providers
|-
| '''[[Jamba 1.5 Mini]]'''
| AI21 Labs logo
| Open
| 256k
| –
| 1176
| 0.63
| 0.26
| 0.61
| –
|
|
| $0.25
| $0.20
| $0.40
| 83.5
| 78.0
| 82.0
| 164.1
| 190.3
| 0.48
| 0.29
| 0.44
| 0.51
| 0.55
| Model Providers
|-
| '''[[DeepSeek V3]]'''
| DeepSeek logo
| Open
| 128k
| 80
| –
| 0.87
| 0.56
| 0.85
| 0.92
| –
| –
| $0.90
| $0.90
| $1.10
| 21.1
| 7.9
| 10.8
| 48.3
| 73.1
| 0.95
| 0.52
| 0.67
| 1.24
| 10.32
| Model Providers
|-
| '''[[DeepSeek-V2.5 (Dec '24)]]'''
| DeepSeek logo
| Open
| 128k
| 72
| –
| 0.81
| 0.42
| 0.76
| 0.88
| –
| –
| $0.17
| $0.14
| $0.28
| 64.8
| 52.7
| 57.3
| 70.9
| 81.1
| 1.16
| 0.90
| 1.02
| 1.37
| 1.87
| Model Providers
|-
| '''[[DeepSeek-Coder-V2]]'''
| DeepSeek logo
| Open
| 128k
| 71
| 1178
| 0.80
| 0.42
| 0.74
| 0.87
| –
| –
| $0.17
| $0.14
| $0.28
| 64.4
| 51.8
| 57.3
| 71.4
| 81.1
| 1.12
| 0.84
| 0.99
| 1.27
| 1.71
| Model Providers
|-
| '''[[DeepSeek-V2.5]]'''
| DeepSeek logo
| Open
| 128k
| –
| 1258
| 0.81
| 0.42
| –
| 0.87
| –
| –
| $1.09
| $1.07
| $1.14
| 7.6
| 6.9
| 7.2
| 8.0
| 8.2
| 0.77
| 0.60
| 0.70
| 0.88
| 17.15
| Model Providers
|-
| '''[[DeepSeek-V2]]'''
| DeepSeek logo
| Open
| 128k
| –
| 1220
| 0.80
| 0.42
| –
| 0.87
| –
| –
| $0.17
| $0.14
| $0.28
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Arctic]]'''
| Snowflake logo
| Open
| 4k
| 51
| 1090
| 0.63
| 0.26
| –
| 0.75
| –
| –
| $0.00
| $0.00
| $0.00
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Qwen2.5 72B]]'''
| Alibaba logo
| Open
| 131k
| 77
| 1259
| 0.86
| 0.50
| 0.84
| 0.89
| –
| –
| $0.40
| $0.40
| $0.75
| 65.6
| 18.5
| 39.6
| 80.8
| 242.6
| 0.54
| 0.25
| 0.38
| 0.68
| 2.57
| Model Providers
|-
| '''[[Qwen2.5 Coder 32B]]'''
| Alibaba logo
| Open
| 131k
| 72
| 1220
| 0.79
| 0.41
| 0.76
| 0.91
| –
| –
| $0.80
| $0.80
| $0.80
| 84.1
| 34.7
| 45.4
| 102.8
| 344.0
| 0.37
| 0.21
| 0.28
| 0.51
| 1.25
| Model Providers
|-
| '''[[Qwen2 72B]]'''
| Alibaba logo
| Open
| 131k
| 72
| 1187
| 0.83
| 0.40
| 0.77
| 0.86
| –
| –
| $0.63
| $0.63
| $0.65
| 45.7
| 32.8
| 37.7
| 64.2
| 67.0
| 0.30
| 0.23
| 0.27
| 0.36
| 0.62
| Model Providers
|-
| '''[[QwQ 32B-Preview]]'''
| Alibaba logo
| Open
| 33k
| 46
| –
| 0.07
| 0.01
| 0.91
| 0.85
| –
| –
| $0.26
| $0.20
| $0.60
| 66.3
| 35.0
| 52.1
| 105.2
| 329.9
| 0.40
| 0.21
| 0.32
| 0.58
| 2.05
| Model Providers
|-
| '''[[Yi-Large]]'''
| 01.AI logo
| Proprietary
| 32k
| 61
| 1213
| 0.78
| 0.33
| 0.56
| 0.77
| –
| –
| $3.00
| $3.00
| $3.00
| 67.4
| 63.1
| 64.5
| 78.9
| 81.8
| 0.47
| 0.35
| 0.40
| 0.78
| 1.64
| Model Providers
|-
| '''[[GPT-4 Turbo]]'''
| OpenAI logo
| Proprietary
| 128k
| 75
| 1256
| 0.87
| 0.50
| 0.74
| 0.92
| –
| –
| $15.00
| $10.00
| $30.00
| 43.8
| 22.7
| 35.3
| 52.8
| 58.7
| 1.19
| 0.51
| 0.63
| 1.44
| 2.02
| Model Providers
|-
| '''[[GPT-4]]'''
| OpenAI logo
| Proprietary
| 8k
| –
| 1186
| –
| –
| –
| –
| –
| –
| $37.50
| $30.00
| $60.00
| 30.5
| 14.2
| 20.2
| 37.9
| 44.9
| 0.73
| 0.50
| 0.57
| 0.91
| 1.30
| Model Providers
|-
| '''[[Llama 3 70B]]'''
| Meta logo
| Open
| 8k
| 48
| 1206
| 0.79
| 0.39
| 0.53
| 0.19
| –
| –
| $0.88
| $0.80
| $0.88
| 48.2
| 19.3
| 32.9
| 130.8
| 349.0
| 0.40
| 0.23
| 0.30
| 0.55
| 1.36
| Model Providers
|-
| '''[[Llama 3 8B]]'''
| Meta logo
| Open
| 8k
| 45
| 1152
| 0.64
| 0.30
| 0.32
| 0.53
| –
| –
| $0.10
| $0.07
| $0.20
| 109.4
| 63.1
| 74.0
| 202.4
| 1203.6
| 0.35
| 0.19
| 0.30
| 0.41
| 0.73
| Model Providers
|-
| '''[[Llama 2 Chat 70B]]'''
| Meta logo
| Open
| 4k
| –
| 1093
| –
| –
| –
| –
| –
| –
| $1.85
| $1.75
| $2.17
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Llama 2 Chat 13B]]'''
| Meta logo
| Open
| 4k
| –
| 1063
| –
| –
| –
| –
| –
| –
| $0.00
| $0.00
| $0.00
| –
| –
| –
| –
| –
| –
| –
| –
| –
| –
| Model Providers
|-
| '''[[Llama 2 Chat 7B]]'''
| Meta logo
| Open
| 4k
| –
| 1037
| –
| –
| –
| –
| –
| –
| $0.33
| $0.29
| $0.46
| 123.8
| 119.1
| 122.3
| 126.0
| 130.1
| 0.37
| 0.34
| 0.35
| 0.40
| 0.55
| Model Providers
|-
| '''[[Gemini 1.0 Pro]]'''
| Google logo
| Proprietary
| 33k
| –
| 1111
| –
| –
| –
| –
| –
| –
| $0.75
| $0.50
| $1.50
| 103.1
| 97.1
| 98.7
| 105.2
| 107.5
| 1.28
| 1.20
| 1.24
| 1.31
| 1.39
| Model Providers
|-
| '''[[Claude 3 Sonnet]]'''
| Anthropic logo
| Proprietary
| 200k
| 57
| 1201
| 0.77
| 0.37
| 0.41
| 0.71
| –
| –
| $6.00
| $3.00
| $15.00
| 67.9
| 35.2
| 58.5
| 76.6
| 89.7
| 0.75
| 0.61
| 0.65
| 0.86
| 1.46
| Model Providers
|-
| '''[[Claude 2.1]]'''
| Anthropic logo
| Proprietary
| 200k
| –
| 1118
| –
| –
| –
| –
| –
| –
| $12.00
| $8.00
| $24.00
| 20.3
| 13.0
| 13.5
| 29.4
| 31.0
| 1.41
| 0.79
| 0.82
| 1.78
| 2.01
| Model Providers
|-
| '''[[Claude 2.0]]'''
| Anthropic logo
| Proprietary
| 100k
| –
| 1132
| –
| –
| –
| –
| –
| –
| $12.00
| $8.00
| $24.00
| 29.9
| 28.7
| 29.4
| 30.5
| 32.6
| 0.81
| 0.78
| 0.80
| 0.83
| 0.95
| Model Providers
|-
| '''[[Mistral Small (Feb '24)]]'''
| Mistral logo
| Proprietary
| 33k
| 59
| –
| 0.69
| 0.31
| 0.56
| 0.79
| –
| –
| $1.50
| $1.00
| $3.00
| 53.7
| 48.8
| 52.1
| 61.6
| 73.2
| 0.38
| 0.25
| 0.33
| 0.41
| 0.62
| Model Providers
|-
| '''[[Mistral Large (Feb '24)]]'''
| Mistral logo
| Proprietary
| 33k
| 56
| 1157
| 0.69
| 0.36
| 0.49
| 0.70
| –
| –
| $6.00
| $4.00
| $12.00
| 38.8
| 29.8
| 36.8
| 42.5
| 45.4
| 0.43
| 0.34
| 0.40
| 0.52
| 0.98
| Model Providers
|-
| '''[[Mistral 7B]]'''
| Mistral logo
| Open
| 8k
| 28
| 1008
| 0.33
| 0.19
| 0.16
| 0.42
| –
| –
| $0.12
| $0.11
| $0.14
| 101.4
| 76.1
| 92.8
| 127.4
| 156.1
| 0.30
| 0.15
| 0.22
| 0.35
| 0.96
| Model Providers
|-
| '''[[Mistral Medium]]'''
| Mistral logo
| Proprietary
| 33k
| –
| 1148
| –
| –
| –
| –
| –
| –
| $4.09
| $2.75
| $8.10
| 44.5
| 40.3
| 42.7
| 45.3
| 46.8
| 0.38
| 0.31
| 0.33
| 0.45
| 17.22
| Model Providers
|-
| '''[[Codestral]]'''
| Mistral logo
| Open
| 33k
| –
| –
| 0.23
| –
| 0.80
| –
| –
| –
| $0.30
| $0.20
| $0.60
| 84.8
| 78.1
| 82.1
| 86.5
| 88.8
| 0.28
| 0.24
| 0.26
| 0.31
| 0.49
| Model Providers
|-
| '''[[OpenChat 3.5]]'''
| OpenChat logo
| Open
| 8k
| 44
| 1076
| 0.56
| 0.22
| 0.31
| 0.68
| –
| –
| $0.06
| $0.06
| $0.06
| 73.3
| 66.3
| 69.3
| 76.3
| 80.3
| 0.30
| 0.24
| 0.27
| 0.32
| 0.37
| Model Providers
|-
| '''[[Jamba Instruct]]'''
| AI21 Labs logo
| Proprietary
| 256k
| –
| –
| 0.58
| 0.25
| –
| –
| –
| –
| $0.55
| $0.50
| $0.70
| 77.1
| 70.4
| 74.3
| 169.6
| 193.7
| 0.52
| 0.29
| 0.45
| 0.54
| 0.58
| Model Providers
|}
<ref name="1">LLM Leaderboard - Comparison of GPT-4o, Llama 3, Mistral, Gemini and over 30 models. https://artificialanalysis.ai/leaderboards/models</ref>
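
The throughput, latency, and price columns above can be combined into rough end-to-end estimates. The sketch below is only an illustrative approximation, not part of the leaderboard's methodology: it assumes total response time is roughly the time to first chunk plus the number of output tokens divided by the output speed, and it prices a request from the per-million-token input and output rates, using the GPT-4o (Aug '24) medians from the detailed table.

<syntaxhighlight lang="python">
# Illustrative sketch: rough end-to-end estimates from the leaderboard metrics.
# The figures are the GPT-4o (Aug '24) medians reported above; the simple
# additive latency model is an assumption, not the leaderboard's methodology.

INPUT_PRICE_PER_M = 2.50    # USD per 1M input tokens
OUTPUT_PRICE_PER_M = 10.00  # USD per 1M output tokens
OUTPUT_SPEED_TPS = 85.6     # median output speed, tokens/s while streaming
FIRST_CHUNK_S = 0.66        # median time to first chunk, seconds

def estimate_response_time(output_tokens: int) -> float:
    """Approximate wall-clock time: first-chunk latency plus streaming time."""
    return FIRST_CHUNK_S + output_tokens / OUTPUT_SPEED_TPS

def estimate_request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD from the per-million-token input and output prices."""
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000

if __name__ == "__main__":
    # Example: a 1,000-token prompt that produces a 500-token answer.
    print(f"time ~ {estimate_response_time(500):.1f} s")      # ~ 6.5 s
    print(f"cost ~ ${estimate_request_cost(1000, 500):.4f}")  # ~ $0.0075
</syntaxhighlight>

Actual figures vary by provider and load; the P5/P25/P75/P95 columns above give a sense of that spread.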


==Terms==
*'''Artificial Analysis Quality Index''': Average result across evaluations covering different dimensions of model intelligence. It currently includes MMLU, GPQA, MATH & HumanEval. OpenAI o1 model figures are preliminary and are based on figures stated by OpenAI; see the methodology for more details. A worked example appears below this list.
*'''Context window''': Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (this varies by model).
*'''Output Speed''': Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
*'''Latency''': Time to first token received, in seconds, after the API request is sent. For models that do not support streaming, this represents the time to receive the completion.
*'''Price''': Price per token, represented as USD per million tokens. This is a blend of input & output token prices (3:1 input:output ratio); a worked example appears below this list.
*'''Output Price''': Price per token generated by the model (received from the API), represented as USD per million tokens.
*'''Input Price''': Price per token included in the request/message sent to the API, represented as USD per million tokens.
*'''Time period''': Metrics are 'live' and are based on the past 14 days of measurements; measurements are taken 8 times a day for single requests and 2 times per day for parallel requests.
<ref name="1" />
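
As a rough cross-check of the definitions above, the sketch below recomputes the Quality Index and the blended price for GPT-4o (Aug '24) from the figures in the detailed table. Treating the Quality Index as an unweighted mean of the four evaluation scores scaled to 0 to 100 is an assumption made here for illustration; the leaderboard's own normalization may differ.

<syntaxhighlight lang="python">
# Illustrative sketch only: the exact normalization used by the leaderboard may differ.

def quality_index(mmlu: float, gpqa: float, math_500: float, human_eval: float) -> int:
    """Assumed here: unweighted mean of the four evaluation scores, scaled to 0-100."""
    return round(100 * (mmlu + gpqa + math_500 + human_eval) / 4)

def blended_price(input_price: float, output_price: float) -> float:
    """Blend of input and output prices at the stated 3:1 input:output ratio (USD/1M tokens)."""
    return (3 * input_price + output_price) / 4

# GPT-4o (Aug '24) figures from the detailed table above.
print(quality_index(0.89, 0.51, 0.80, 0.93))  # 78, matching the Quality Index column
print(blended_price(2.50, 10.00))             # 4.375, i.e. the $4.38 in the Blended column
</syntaxhighlight>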


==References==
<references />
[[Category:Important]] [[Category:Rankings]] [[Category:Aggregate pages]]
