LLM Benchmarks Timeline

From AI Wiki
Revision as of 16:47, 10 January 2025 by Alpha5

=== 2019 ===

{| class="wikitable"
|-
! Benchmark
! Category
! Time Span
! Date Created
! Date Defeated
! Killed By
! Defeated By
! Original Score
! Final Score
! Links
! Details
|-
| '''SuperGLUE'''
| Language
| 2019-05 – 2019-10
| 2019-05
| 2019-10
| Saturation
| T5
| Human: 89.8%
| T5: 89.3%
| [https://arxiv.org/abs/1905.00537 Paper], [https://super.gluebenchmark.com/ Website]
| More challenging language understanding tasks (word sense, causal reasoning, reading comprehension).
|}