The free encyclopedia of artificial intelligence that anyone can edit.
2,378 articles in English · 913 categories
Humanoid robots are robots designed to resemble and, in some cases, mimic the human body in shape and movement. Unlike industrial robotic arms...
BoolQ (Boolean Questions) is a natural language processing benchmark dataset designed for yes/no question answering. Created by researchers at Google, BoolQ...
PIQA (Physical Interaction Question Answering) is a benchmark dataset designed to evaluate the physical commonsense reasoning abilities of natural language...
CRUXEval (Code Reasoning, Understanding, and eXecution Evaluation) is a benchmark designed to measure how well large language models can reason about,...
PubMedQA is a biomedical question answering dataset and benchmark designed to evaluate the ability of machine learning models to answer research questions...