LLM Comparisons
Compare different large language models (LLMs): a concise comparison, a detailed comparison, and terminology definitions.
Concise Comparison
Model | Creator | Context Window | Quality Index (Normalized avg) | Blended Price (USD/1M Tokens) | Median Output Speed (Tokens/s) | Median First Chunk (s) |
---|---|---|---|---|---|---|
o1-preview | OpenAI | 128k | 86 | $27.56 | 143.7 | 21.33 |
o1-mini | OpenAI | 128k | 84 | $5.25 | 213.2 | 11.27 |
GPT-4o (Aug '24) | OpenAI | 128k | 78 | $4.38 | 83.5 | 0.67 |
GPT-4o (May '24) | OpenAI | 128k | 78 | $7.50 | 106.3 | 0.65 |
GPT-4o mini | OpenAI | 128k | 73 | $0.26 | 113.8 | 0.64 |
GPT-4o (Nov '24) | OpenAI | 128k | 73 | $4.38 | 116.4 | 0.39 |
GPT-4o mini Realtime (Dec '24) | OpenAI | 128k | – | $0.00 | – | – |
GPT-4o Realtime (Dec '24) | OpenAI | 128k | – | $0.00 | – | – |
Llama 3.3 70B | Meta | 128k | 74 | $0.69 | 71.8 | 0.49 |
Llama 3.1 405B | Meta | 128k | 74 | $3.50 | 30.2 | 0.71 |
Llama 3.1 70B | Meta | 128k | 68 | $0.72 | 72.8 | 0.44 |
Llama 3.2 90B (Vision) | Meta | 128k | 68 | $0.81 | 48.9 | 0.33 |
Llama 3.2 11B (Vision) | Meta | 128k | 54 | $0.18 | 131.2 | 0.28 |
Llama 3.1 8B | Meta | 128k | 54 | $0.10 | 184.9 | 0.33 |
Llama 3.2 3B | Meta | 128k | 49 | $0.06 | 201.4 | 0.38 |
Llama 3.2 1B | Meta | 128k | 26 | $0.04 | 468.6 | 0.37 |
Gemini 2.0 Flash (exp) | Google | 1m | 82 | $0.00 | 169.0 | 0.48 |
Gemini 1.5 Pro (Sep) | Google | 2m | 80 | $2.19 | 60.8 | 0.74 |
Gemini 1.5 Flash (Sep) | Google | 1m | 72 | $0.13 | 188.4 | 0.25 |
Gemma 2 27B | Google | 8k | 61 | $0.26 | 59.4 | 0.48 |
Gemma 2 9B | Google | 8k | 55 | $0.12 | 168.9 | 0.36 |
Gemini 1.5 Flash (May) | Google | 1m | – | $0.13 | 310.6 | 0.29 |
Gemini Experimental (Nov) | Google | 2m | – | $0.00 | 53.9 | 1.12 |
Gemini 1.5 Pro (May) | Google | 2m | – | $2.19 | 66.9 | 0.49 |
Gemini 1.5 Flash-8B | Google | 1m | – | $0.07 | 279.7 | 0.38 |
Claude 3.5 Sonnet (Oct) | Anthropic | 200k | 80 | $6.00 | 72.0 | 0.99 |
Claude 3.5 Sonnet (June) | Anthropic | 200k | 76 | $6.00 | 61.5 | 0.87 |
Claude 3 Opus | Anthropic | 200k | 70 | $30.00 | 25.9 | 2.00 |
Claude 3.5 Haiku | Anthropic | 200k | 68 | $1.60 | 65.1 | 0.71 |
Claude 3 Haiku | Anthropic | 200k | 55 | $0.50 | 121.6 | 0.72 |
Pixtral Large | Mistral | 128k | 74 | $3.00 | 36.5 | 0.40 |
Mistral Large 2 (Jul '24) | Mistral | 128k | 74 | $3.00 | 31.1 | 0.50 |
Mistral Large 2 (Nov '24) | Mistral | 128k | 74 | $3.00 | 37.4 | 0.52 |
Mistral Small (Sep '24) | Mistral | 33k | 61 | $0.30 | 61.5 | 0.32 |
Mixtral 8x22B | Mistral | 65k | 61 | $1.20 | 85.1 | 0.57 |
Pixtral 12B | Mistral | 128k | 56 | $0.13 | 70.3 | 0.37 |
Ministral 8B | Mistral | 128k | 56 | $0.10 | 136.1 | 0.30 |
Mistral NeMo | Mistral | 128k | 54 | $0.09 | 122.5 | 0.48 |
Ministral 3B | Mistral | 128k | 53 | $0.04 | 168.5 | 0.29 |
Mixtral 8x7B | Mistral | 33k | 41 | $0.50 | 110.6 | 0.36 |
Codestral-Mamba | Mistral | 256k | 33 | $0.25 | 95.8 | 0.44 |
Command-R+ | Cohere | 128k | 55 | $5.19 | 50.7 | 0.47 |
Command-R+ (Apr '24) | Cohere | 128k | 45 | $6.00 | 49.3 | 0.51 |
Command-R (Mar '24) | Cohere | 128k | 36 | $0.75 | 108.1 | 0.36 |
Aya Expanse 8B | Cohere | 8k | – | $0.75 | 165.5 | 0.16 |
Command-R | Cohere | 128k | – | $0.51 | 111.8 | 0.32 |
Aya Expanse 32B | Cohere | 128k | – | $0.75 | 120.4 | 0.18 |
Sonar 3.1 Small | Perplexity | 127k | – | $0.20 | 203.8 | 0.34 |
Sonar 3.1 Large | Perplexity | 127k | – | $1.00 | 57.7 | 0.31 |
Grok Beta | xAI | 128k | 72 | $7.50 | 66.7 | 0.42 |
Nova Pro | Amazon | 300k | 75 | $1.40 | 91.0 | 0.38 |
Nova Lite | Amazon | 300k | 70 | $0.10 | 148.0 | 0.33 |
Nova Micro | Amazon | 130k | 66 | $0.06 | 195.5 | 0.33 |
Phi-4 | Microsoft Azure | 16k | 77 | $0.09 | 85.0 | 0.22 |
Phi-3 Mini | Microsoft Azure | 4k | – | $0.00 | – | – |
Phi-3 Medium 14B | Microsoft Azure | 128k | – | $0.30 | 50.4 | 0.43 |
Solar Mini | Upstage | 4k | 47 | $0.15 | 89.3 | 1.13 |
DBRX | Databricks | 33k | 46 | $1.16 | 78.3 | 0.42 |
Llama 3.1 Nemotron 70B | NVIDIA | 128k | 72 | $0.27 | 48.3 | 0.57 |
Reka Flash | Reka AI | 128k | 59 | $0.35 | – | – |
Reka Core | Reka AI | 128k | 58 | $2.00 | – | – |
Reka Flash (Feb '24) | Reka AI | 128k | 46 | $0.35 | – | – |
Reka Edge | Reka AI | 128k | 31 | $0.10 | – | – |
Jamba 1.5 Large | AI21 Labs | 256k | 64 | $3.50 | 51.0 | 0.71 |
Jamba 1.5 Mini | AI21 Labs | 256k | – | $0.25 | 83.7 | 0.48 |
DeepSeek V3 | DeepSeek | 128k | 80 | $0.90 | 20.9 | 0.94 |
DeepSeek-V2.5 (Dec '24) | DeepSeek | 128k | 72 | $0.17 | 61.8 | 1.15 |
DeepSeek-Coder-V2 | DeepSeek | 128k | 71 | $0.17 | 62.0 | 1.11 |
DeepSeek-V2.5 | DeepSeek | 128k | – | $1.09 | 7.6 | 0.77 |
DeepSeek-V2 | DeepSeek | 128k | – | $0.17 | – | – |
Arctic | Snowflake | 4k | 51 | $0.00 | – | – |
Qwen2.5 72B | Alibaba | 131k | 77 | $0.40 | 67.6 | 0.53 |
Qwen2.5 Coder 32B | Alibaba | 131k | 72 | $0.80 | 84.0 | 0.38 |
Qwen2 72B | Alibaba | 131k | 72 | $0.63 | 46.5 | 0.30 |
QwQ 32B-Preview | Alibaba | 33k | 46 | $0.26 | 67.3 | 0.40 |
Yi-Large | 01.AI | 32k | 61 | $3.00 | 68.1 | 0.47 |
GPT-4 Turbo | OpenAI | 128k | 75 | $15.00 | 43.3 | 1.20 |
GPT-4 | OpenAI | 8k | – | $37.50 | 28.4 | 0.75 |
Llama 3 70B | Meta | 8k | 48 | $0.89 | 48.9 | 0.38 |
Llama 3 8B | Meta | 8k | 45 | $0.15 | 117.3 | 0.34 |
Llama 2 Chat 70B | Meta | 4k | – | $1.85 | – | – |
Llama 2 Chat 13B | Meta | 4k | – | $0.00 | – | – |
Llama 2 Chat 7B | Meta | 4k | – | $0.33 | 123.7 | 0.37 |
Gemini 1.0 Pro | Google | 33k | – | $0.75 | 102.9 | 1.27 |
Claude 3 Sonnet | Anthropic | 200k | 57 | $6.00 | 68.2 | 0.76 |
Claude 2.1 | Anthropic | 200k | – | $12.00 | 14.1 | 1.24 |
Claude 2.0 | Anthropic | 100k | – | $12.00 | 29.9 | 0.81 |
Mistral Small (Feb '24) | Mistral | 33k | 59 | $1.50 | 53.5 | 0.37 |
Mistral Large (Feb '24) | Mistral | 33k | 56 | $6.00 | 38.9 | 0.44 |
Mistral 7B | Mistral | 8k | 28 | $0.16 | 112.5 | 0.26 |
Mistral Medium | Mistral | 33k | – | $4.09 | 44.7 | 0.37 |
Codestral | Mistral | 33k | – | $0.30 | 84.9 | 0.28 |
OpenChat 3.5 | OpenChat | 8k | 44 | $0.06 | 73.5 | 0.30 |
Jamba Instruct | AI21 Labs | 256k | – | $0.55 | 77.4 | 0.52 |
Detailed Comparison
Model | Creator | License | Context Window | Quality Index (Normalized avg) | Chatbot Arena | MMLU | GPQA | MATH-500 | HumanEval | Blended Price (USD/1M Tokens) | Input Price (USD/1M Tokens) | Output Price (USD/1M Tokens) | Median Output Speed (Tokens/s) | P5 (Tokens/s) | P25 (Tokens/s) | P75 (Tokens/s) | P95 (Tokens/s) | Median First Chunk (s) | P5 First Chunk (s) | P25 First Chunk (s) | P75 First Chunk (s) | P95 First Chunk (s) | Further Analysis |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
o1-preview | OpenAI | Proprietary | 128k | 86 | 1334 | 0.91 | 0.67 | 0.92 | 0.96 | $27.56 | $15.75 | $63.00 | 143.8 | 68.9 | 121.6 | 164.6 | 179.6 | 21.28 | 13.40 | 17.04 | 27.80 | 46.49 | – |
o1-mini | OpenAI | Proprietary | 128k | 84 | 1308 | 0.85 | 0.58 | 0.95 | 0.97 | $5.25 | $3.00 | $12.00 | 213.6 | 84.0 | 154.8 | 238.0 | 299.4 | 11.75 | 2.44 | 9.40 | 14.43 | 24.03 | – |
GPT-4o (Aug '24) | OpenAI | Proprietary | 128k | 78 | 1337 | 0.89 | 0.51 | 0.80 | 0.93 | $4.38 | $2.50 | $10.00 | 85.6 | 40.3 | 61.5 | 109.3 | 143.6 | 0.66 | 0.33 | 0.43 | 0.91 | 1.92 | – |
GPT-4o (May '24) | OpenAI | Proprietary | 128k | 78 | 1285 | 0.87 | 0.51 | 0.79 | 0.93 | $7.50 | $5.00 | $15.00 | 106.8 | 53.2 | 82.2 | 126.8 | 142.5 | 0.65 | 0.32 | 0.43 | 0.73 | 1.22 | – |
GPT-4o mini | OpenAI | Proprietary | 128k | 73 | 1273 | 0.82 | 0.44 | 0.79 | 0.88 | $0.26 | $0.15 | $0.60 | 121.8 | 50.7 | 74.1 | 179.4 | 206.5 | 0.65 | 0.30 | 0.39 | 0.77 | 0.92 | – |
Claude 3.5 Sonnet (Oct) | Anthropic | Proprietary | 200k | 80 | 1282 | 0.89 | 0.58 | 0.76 | 0.96 | $6.00 | $3.00 | $15.00 | 71.8 | 37.6 | 44.8 | 78.0 | 89.6 | 0.98 | 0.68 | 0.78 | 1.36 | 2.23 | – |
Claude 3.5 Sonnet (June) | Anthropic | Proprietary | 200k | 76 | 1268 | 0.88 | 0.56 | 0.71 | 0.90 | $6.00 | $3.00 | $15.00 | 61.4 | 41.6 | 49.9 | 78.9 | 91.0 | 0.87 | 0.68 | 0.75 | 1.06 | 1.45 | – |
Claude 3.5 Haiku | Anthropic | Proprietary | 200k | 68 | – | 0.81 | 0.37 | 0.67 | 0.87 | $1.60 | $0.80 | $4.00 | 65.1 | 51.1 | 58.6 | 75.4 | 105.1 | 0.71 | 0.54 | 0.64 | 0.93 | 1.20 | – |
Llama 3.3 70B | Meta | Open | 128k | 74 | – | 0.86 | 0.49 | 0.76 | 0.86 | $0.67 | $0.59 | $0.73 | 67.2 | 23.6 | 31.2 | 275.7 | 2046.5 | 0.51 | 0.23 | 0.36 | 0.72 | 1.48 | – |
Llama 3.2 3B | Meta | Open | 128k | 49 | 1103 | 0.64 | 0.21 | 0.50 | 0.60 | $0.06 | $0.06 | $0.06 | 202.2 | 42.4 | 144.0 | 543.6 | 1623.1 | 0.38 | 0.15 | 0.26 | 0.49 | 0.93 | – |
Gemini 1.5 Flash (May) | Google | Proprietary | 1m | – | 1227 | 0.79 | 0.39 | 0.55 | – | $0.13 | $0.07 | $0.30 | 310.0 | 276.8 | 297.5 | 325.0 | 350.4 | 0.30 | 0.23 | 0.27 | 0.33 | 0.39 | – |
Nova Micro | Amazon | Proprietary | 130k | 66 | – | 0.76 | 0.38 | 0.69 | 0.80 | $0.06 | $0.04 | $0.14 | 195.8 | 170.9 | 186.0 | 208.3 | 219.5 | 0.33 | 0.30 | 0.32 | 0.35 | 0.39 | – |
DeepSeek-Coder-V2 | DeepSeek | Open | 128k | 71 | 1178 | 0.80 | 0.42 | 0.74 | 0.87 | $0.17 | $0.14 | $0.28 | 64.4 | 51.8 | 57.3 | 71.4 | 81.1 | 1.12 | 0.84 | 0.99 | 1.27 | 1.71 | – |
Phi-4 | Microsoft Azure | Open | 16k | 77 | – | 0.85 | 0.57 | 0.81 | 0.87 | $0.09 | $0.07 | $0.14 | 85.1 | 76.2 | 82.0 | 85.4 | 85.6 | 0.21 | 0.16 | 0.18 | 0.23 | 0.25 | – |
Reka Flash | Reka AI | Proprietary | 128k | 59 | – | 0.73 | 0.34 | 0.53 | 0.74 | $0.35 | $0.20 | $0.80 | – | – | – | – | – | – | – | – | – | – | – |
OpenChat 3.5 | OpenChat | Open | 8k | 44 | 1076 | 0.56 | 0.22 | 0.31 | 0.68 | $0.06 | $0.06 | $0.06 | 73.3 | 66.3 | 69.3 | 76.3 | 80.3 | 0.30 | 0.24 | 0.27 | 0.32 | 0.37 | – |
Jamba Instruct | AI21 Labs | Proprietary | 256k | – | – | 0.58 | 0.25 | – | – | $0.55 | $0.50 | $0.70 | 77.1 | 70.4 | 74.3 | 169.6 | 193.7 | 0.52 | 0.29 | 0.45 | 0.54 | 0.58 | – |
Terms
- Artificial Analysis Quality Index: Average result across our evaluations covering different dimensions of model intelligence. Currently includes MMLU, GPQA, Math & HumanEval. OpenAI o1 model figures are preliminary and are based on figures stated by OpenAI. See methodology for more details.
- Context window: Maximum number of combined input & output tokens. Output tokens commonly have a significantly lower limit (varies by model).
- Output Speed: Tokens per second received while the model is generating tokens (i.e. after the first chunk has been received from the API, for models that support streaming).
- Latency: Time to first token received, in seconds, after the API request is sent. For models that do not support streaming, this represents the time to receive the completion.
- Price: Price per token, represented as USD per million Tokens. The blended price combines Input & Output token prices at a 3:1 ratio (see the worked sketch after this list).
- Output Price: Price per token generated by the model (received from the API), represented as USD per million Tokens.
- Input Price: Price per token included in the request/message sent to the API, represented as USD per million Tokens.
- Time period: Metrics are 'live' and are based on the past 14 days of measurements; measurements are taken 8 times per day for single requests and 2 times per day for parallel requests.
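
The pricing and performance definitions above reduce to simple arithmetic. The following Python sketch is illustrative only (it is not part of the source page, and the function names are ours): it applies the stated 3:1 input:output blend, a per-request cost based on the input/output prices, and a rough end-to-end response-time estimate derived from the speed and latency definitions, checked against a few rows of the detailed comparison.

```python
# Illustrative sketch of the metric definitions above (assumptions noted inline).

def blended_price(input_usd_per_m: float, output_usd_per_m: float) -> float:
    """Blend of input & output prices at the stated 3:1 ratio, in USD per 1M tokens."""
    return (3 * input_usd_per_m + output_usd_per_m) / 4

def request_cost(input_tokens: int, output_tokens: int,
                 input_usd_per_m: float, output_usd_per_m: float) -> float:
    """Cost of a single request in USD, priced per million input/output tokens."""
    return (input_tokens * input_usd_per_m + output_tokens * output_usd_per_m) / 1_000_000

def est_response_time(output_tokens: int, first_chunk_s: float, tokens_per_s: float) -> float:
    """Rough total time: first-chunk latency plus generation time.
    This is our own estimate, not a metric reported on the page."""
    return first_chunk_s + output_tokens / tokens_per_s

if __name__ == "__main__":
    # Prices, speeds, and latencies below are taken from the detailed comparison table.
    print(blended_price(15.75, 63.00))            # o1-preview: 27.5625 (~$27.56)
    print(blended_price(0.15, 0.60))              # GPT-4o mini: 0.2625 (~$0.26)
    print(request_cost(2_000, 500, 2.50, 10.00))  # GPT-4o (Aug '24): $0.01
    print(est_response_time(500, 0.66, 85.6))     # GPT-4o (Aug '24): ~6.5 s
```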