Prompt engineering for text generation: Difference between revisions

Prompt engineering for text generation (view source)

69 bytes added , 5 April 2023

370

edits

@@ Line 131: / Line 131: @@
 Numerous studies have explored how to construct in-context examples to maximize performance. [[Prompt format]], [[training examples]], and [[example order]] can lead to dramatically different performance outcomes, ranging from near-random guessing to near state-of-the-art (SoTA) results.
-Zhao et al. (2021) investigated [[few-shot classification]] using LLMs, specifically [[GPT-3]]. They identified several biases that contribute to high [[variance]] in performance: (1) majority [[label bias]], (2) [[recency bias]], and (3) [[common token bias]]. To address these [[biases]], they proposed a method to calibrate label probabilities output by the model to be uniform when the input string is N/A.<ref name="11">https://arxiv.org/abs/2102.09690</ref>
+Zhao et al. (2021) investigated [[few-shot classification]] using LLMs, specifically [[GPT-3]]. They identified several biases that contribute to high [[variance]] in performance: (1) majority [[label bias]], (2) [[recency bias]], and (3) [[common token bias]]. To address these [[biases]], they proposed a method to calibrate label probabilities output by the model to be uniform when the input string is N/A.<ref name="Calibrate Before Use: Improving Few-Shot Performance of Language Models">https://arxiv.org/abs/2102.09690</ref>
 ====Tips for Example Selection====