Large language models ranking: Difference between revisions

no edit summary
Line 1: Line 1:
Ranking of [[LLMs]].
Ranking of [[LLMs]].
{| class="wikitable"
{| class="wikitable"
! Model
! [[Model]]
! ⭐ Arena Elo rating
! ⭐ Arena Elo rating
! 📈 MT-bench (score)
! 📈 MT-bench (score)
Line 7: Line 8:
! License
! License
|-
|-
| GPT-4-Turbo
| [[GPT-4-Turbo]]
| 1210
| 1210
| 9.32
| 9.32
Line 13: Line 14:
| Proprietary
| Proprietary
|-
|-
| GPT-4
| [[GPT-4]]
| 1159
| 1159
| 8.99
| 8.99
Line 19: Line 20:
| Proprietary
| Proprietary
|-
|-
| Claude-1
| [[Claude-1]]
| 1146
| 1146
| 7.9
| 7.9
Line 25: Line 26:
| Proprietary
| Proprietary
|-
|-
| Claude-2
| [[Claude-2]]
| 1125
| 1125
| 8.06
| 8.06
Line 31: Line 32:
| Proprietary
| Proprietary
|-
|-
| Claude-instant-1
| [[Claude-instant-1]]
| 1106
| 1106
| 7.85
| 7.85
Line 37: Line 38:
| Proprietary
| Proprietary
|-
|-
| GPT-3.5-turbo
| [[GPT-3.5-turbo]]
| 1103
| 1103
| 7.94
| 7.94
Line 43: Line 44:
| Proprietary
| Proprietary
|-
|-
| WizardLM-70b-v1.0
| [[WizardLM-70b-v1.0]]
| 1093
| 1093
| 7.71
| 7.71
Line 49: Line 50:
| Llama 2 Community
| Llama 2 Community
|-
|-
| Vicuna-33B
| [[Vicuna-33B]]
| 1090
| 1090
| 7.12
| 7.12
Line 55: Line 56:
| Non-commercial
| Non-commercial
|-
|-
| OpenChat-3.5
| [[OpenChat-3.5]]
| 1070
| 1070
| 7.81
| 7.81
Line 61: Line 62:
| Apache-2.0
| Apache-2.0
|-
|-
| Llama-2-70b-chat
| [[Llama-2-70b-chat]]
| 1065
| 1065
| 6.86
| 6.86
Line 67: Line 68:
| Llama 2 Community
| Llama 2 Community
|-
|-
| WizardLM-13b-v1.2
| [[WizardLM-13b-v1.2]]
| 1047
| 1047
| 7.2
| 7.2
Line 73: Line 74:
| Llama 2 Community
| Llama 2 Community
|-
|-
| zephyr-7b-beta
| [[zephyr-7b-beta]]
| 1042
| 1042
| 7.34
| 7.34
Line 79: Line 80:
| MIT
| MIT
|-
|-
| MPT-30B-chat
| [[MPT-30B-chat]]
| 1031
| 1031
| 6.39
| 6.39
Line 85: Line 86:
| CC-BY-NC-SA-4.0
| CC-BY-NC-SA-4.0
|-
|-
| Vicuna-13B
| [[Vicuna-13B]]
| 1031
| 1031
| 6.57
| 6.57
Line 91: Line 92:
| Llama 2 Community
| Llama 2 Community
|-
|-
| QWen-Chat-14B
| [[QWen-Chat-14B]]
| 1030
| 1030
| 6.96
| 6.96
Line 97: Line 98:
| Qianwen LICENSE
| Qianwen LICENSE
|-
|-
| falcon-180b-chat
| [[falcon-180b-chat]]
| 1024
| 1024
|  
|  
Line 103: Line 104:
| Falcon-180B TII License
| Falcon-180B TII License
|-
|-
| zephyr-7b-alpha
| [[zephyr-7b-alpha]]
| 1024
| 1024
| 6.88
| 6.88
Line 109: Line 110:
| MIT
| MIT
|-
|-
| CodeLlama-34B-instruct
| [[CodeLlama-34B-instruct]]
| 1022
| 1022
|  
|  
Line 115: Line 116:
| Llama 2 Community
| Llama 2 Community
|-
|-
| Guanaco-33B
| [[Guanaco-33B]]
| 1021
| 1021
| 6.53
| 6.53
Line 121: Line 122:
| Non-commercial
| Non-commercial
|-
|-
| Llama-2-13b-chat
| [[Llama-2-13b-chat]]
| 1021
| 1021
| 6.65
| 6.65
Line 127: Line 128:
| Llama 2 Community
| Llama 2 Community
|-
|-
| Mistral-7B-Instruct-v0.1
| [[Mistral-7B-Instruct-v0.1]]
| 1008
| 1008
| 6.84
| 6.84
Line 133: Line 134:
| Apache 2.0
| Apache 2.0
|-
|-
| Llama-2-7b-chat
| [[Llama-2-7b-chat]]
| 1001
| 1001
| 6.27
| 6.27
Line 139: Line 140:
| Llama 2 Community
| Llama 2 Community
|-
|-
| Vicuna-7B
| [[Vicuna-7B]]
| 994
| 994
| 6.17
| 6.17
Line 145: Line 146:
| Llama 2 Community
| Llama 2 Community
|-
|-
| PaLM-Chat-Bison-001
| [[PaLM-Chat-Bison-001]]
| 991
| 991
| 6.4
| 6.4
Line 151: Line 152:
| Proprietary
| Proprietary
|-
|-
| ChatGLM3-6B
| [[ChatGLM3-6B]]
| 970
| 970
|  
|  
Line 157: Line 158:
| Apache-2.0
| Apache-2.0
|-
|-
| Koala-13B
| [[Koala-13B]]
| 955
| 955
| 5.35
| 5.35
Line 163: Line 164:
| Non-commercial
| Non-commercial
|-
|-
| GPT4All-13B-Snoozy
| [[GPT4All-13B-Snoozy]]
| 925
| 925
| 5.41
| 5.41
Line 169: Line 170:
| Non-commercial
| Non-commercial
|-
|-
| MPT-7B-Chat
| [[MPT-7B-Chat]]
| 918
| 918
| 5.42
| 5.42
Line 175: Line 176:
| CC-BY-NC-SA-4.0
| CC-BY-NC-SA-4.0
|-
|-
| ChatGLM2-6B
| [[ChatGLM2-6B]]
| 918
| 918
| 4.96
| 4.96
Line 181: Line 182:
| Apache-2.0
| Apache-2.0
|-
|-
| RWKV-4-Raven-14B
| [[RWKV-4-Raven-14B]]
| 915
| 915
| 3.98
| 3.98
Line 187: Line 188:
| Apache 2.0
| Apache 2.0
|-
|-
| Alpaca-13B
| [[Alpaca-13B]]
| 893
| 893
| 4.53
| 4.53
Line 193: Line 194:
| Non-commercial
| Non-commercial
|-
|-
| OpenAssistant-Pythia-12B
| [[OpenAssistant-Pythia-12B]]
| 884
| 884
| 4.32
| 4.32
Line 199: Line 200:
| Apache 2.0
| Apache 2.0
|-
|-
| ChatGLM-6B
| [[ChatGLM-6B]]
| 871
| 871
| 4.5
| 4.5
Line 205: Line 206:
| Non-commercial
| Non-commercial
|-
|-
| FastChat-T5-3B
| [[FastChat-T5-3B]]
| 863
| 863
| 3.04
| 3.04
Line 211: Line 212:
| Apache 2.0
| Apache 2.0
|-
|-
| StableLM-Tuned-Alpha-7B
| [[StableLM-Tuned-Alpha-7B]]
| 833
| 833
| 2.75
| 2.75
Line 217: Line 218:
| CC-BY-NC-SA-4.0
| CC-BY-NC-SA-4.0
|-
|-
| Dolly-V2-12B
| [[Dolly-V2-12B]]
| 810
| 810
| 3.28
| 3.28
Line 223: Line 224:
| MIT
| MIT
|-
|-
| LLaMA-13B
| [[LLaMA-13B]]
| 789
| 789
| 2.61
| 2.61
Line 229: Line 230:
| Non-commercial
| Non-commercial
|-
|-
| WizardLM-30B
| [[WizardLM-30B]]
|  
|  
| 7.01
| 7.01
Line 235: Line 236:
| Non-commercial
| Non-commercial
|-
|-
| Vicuna-13B-16k
| [[Vicuna-13B-16k]]
|  
|  
| 6.92
| 6.92
Line 241: Line 242:
| Llama 2 Community
| Llama 2 Community
|-
|-
| WizardLM-13B-v1.1
| [[WizardLM-13B-v1.1]]
|  
|  
| 6.76
| 6.76
Line 247: Line 248:
| Non-commercial
| Non-commercial
|-
|-
| Tulu-30B
| [[Tulu-30B]]
|  
|  
| 6.43
| 6.43
Line 253: Line 254:
| Non-commercial
| Non-commercial
|-
|-
| Guanaco-65B
| [[Guanaco-65B]]
|  
|  
| 6.41
| 6.41
Line 259: Line 260:
| Non-commercial
| Non-commercial
|-
|-
| OpenAssistant-LLaMA-30B
| [[OpenAssistant-LLaMA-30B]]
|  
|  
| 6.41
| 6.41
Line 265: Line 266:
| Non-commercial
| Non-commercial
|-
|-
| WizardLM-13B-v1.0
| [[WizardLM-13B-v1.0]]
|  
|  
| 6.35
| 6.35
Line 271: Line 272:
| Non-commercial
| Non-commercial
|-
|-
| Vicuna-7B-16k
| [[Vicuna-7B-16k]]
|  
|  
| 6.22
| 6.22
Line 277: Line 278:
| Llama 2 Community
| Llama 2 Community
|-
|-
| Baize-v2-13B
| [[Baize-v2-13B]]
|  
|  
| 5.75
| 5.75
Line 283: Line 284:
| Non-commercial
| Non-commercial
|-
|-
| XGen-7B-8K-Inst
| [[XGen-7B-8K-Inst]]
|  
|  
| 5.55
| 5.55
Line 289: Line 290:
| Non-commercial
| Non-commercial
|-
|-
| Nous-Hermes-13B
| [[Nous-Hermes-13B]]
|  
|  
| 5.51
| 5.51
Line 295: Line 296:
| Non-commercial
| Non-commercial
|-
|-
| MPT-30B-Instruct
| [[MPT-30B-Instruct]]
|  
|  
| 5.22
| 5.22
Line 301: Line 302:
| CC-BY-SA 3.0
| CC-BY-SA 3.0
|-
|-
| Falcon-40B-Instruct
| [[Falcon-40B-Instruct]]
|  
|  
| 5.17
| 5.17
Line 307: Line 308:
| Apache 2.0
| Apache 2.0
|-
|-
| H2O-Oasst-OpenLLaMA-13B
| [[H2O-Oasst-OpenLLaMA-13B]]
|  
|  
| 4.63
| 4.63