Category
Code generation
12 articles
AlphaCode
Artificial Intelligence, DeepMind, Large Language Models
CRUXEval
AI Benchmarks, Machine Learning, Natural Language Processing
Claude Sonnet 4.5
2025 software, Anthropic, Artificial Intelligence
Code Llama
Large Language Models, Meta AI, Open Source AI
CodeContests
AI Benchmarks, Competitive Programming, Machine Learning
GitHub Copilot
AI Tools, Developer Tools, Microsoft
HumanEval
Benchmarks
LiveCodeBench
AI Benchmarks, Machine Learning
MBPP
Benchmarks, Large Language Models, Machine Learning
SWE-bench
AI Agents, Benchmarks
SWE-bench Verified
AI Evaluation, Benchmarks
StarCoder
Hugging Face, Large Language Models, Open Source AI