LLM Benchmarks Timeline
2019
Benchmark | Category | Time Span | Date Created | Date Defeated | Killed By | Defeated By | Original Score | Final Score | Links | Details |
---|---|---|---|---|---|---|---|---|---|---|
SuperGLUE | Language | 2019-05 – 2019-10 | 2019-05 | 2019-10 | Saturation | T5 | Human: 89.8% | T5: 89.3% | Paper, Website | More challenging language understanding tasks (word sense disambiguation, causal reasoning, reading comprehension). |