LLaMA/Model Card

Model performance measures: We use the following measures to evaluate the model:

*Accuracy for common sense reasoning, reading comprehension, natural language understanding (MMLU), BIG-bench hard, WinoGender and CrowS-Pairs,
*Exact match for question answering,
*The toxicity score from Perspective API on RealToxicityPrompts.
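The exact-match measure for question answering can be sketched as follows. This is a minimal illustration, not the model card's actual evaluation script; the normalization steps (lowercasing, stripping punctuation and articles) are common QA-evaluation conventions assumed here, not taken from the card itself.

```python
import re
import string

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace.

    A common normalization convention for QA exact-match scoring
    (assumed here, not specified by the model card)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(predictions, references):
    """Fraction of predictions whose normalized form exactly matches
    the normalized reference answer."""
    hits = sum(normalize(p) == normalize(r)
               for p, r in zip(predictions, references))
    return hits / len(predictions)
```

Under this sketch, `exact_match(["The Eiffel Tower", "Paris"], ["eiffel tower", "London"])` scores 0.5: the first prediction matches after normalization, the second does not.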
Decision thresholds: Not applicable.