Prompt engineering for text generation: Difference between revisions

no edit summary
No edit summary
No edit summary
Line 131: Line 131:
Numerous studies have explored how to construct in-context examples to maximize performance. [[Prompt format]], [[training examples]], and [[example order]] can lead to dramatically different performance outcomes, ranging from near-random guessing to near state-of-the-art (SoTA) results.
Numerous studies have explored how to construct in-context examples to maximize performance. [[Prompt format]], [[training examples]], and [[example order]] can lead to dramatically different performance outcomes, ranging from near-random guessing to near state-of-the-art (SoTA) results.


Zhao et al. (2021) investigated [[few-shot classification]] using LLMs, specifically [[GPT-3]]. They identified several biases that contribute to high [[variance]] in performance: (1) majority [[label bias]], (2) [[recency bias]], and (3) [[common token bias]]. To address these [[biases]], they proposed a method to calibrate label probabilities output by the model to be uniform when the input string is N/A.<ref name="”111”">Zhao et al. (2021) Calibrate Before Use: Improving Few-Shot Performance of Language Models arXiv:2102.09690</ref>
Zhao et al. (2021) investigated [[few-shot classification]] using LLMs, specifically [[GPT-3]]. They identified several biases that contribute to high [[variance]] in performance: (1) majority [[label bias]], (2) [[recency bias]], and (3) [[common token bias]]. To address these [[biases]], they proposed a method to calibrate label probabilities output by the model to be uniform when the input string is N/A.<ref name="”111”">Zhao et al. (2021) Calibrate Before Use: Improving Few-Shot Performance of Language Models https://arxiv.org/abs/2102.09690</ref>


====Tips for Example Selection====
====Tips for Example Selection====
=====Semantically Similar Examples=====
=====Semantically Similar Examples=====
Liu et al. (2021) suggested choosing examples that are semantically similar to the test example by employing nearest neighbor (NN) clustering in the embedding space.
Liu et al. (2021) suggested choosing examples that are semantically similar to the test example by employing [[k-nearest neighbors]] (KNN) clustering in the [[embedding space]].


=====Diverse and Representative Examples=====
=====Diverse and Representative Examples=====
Su et al. (2022) proposed a graph-based approach to select a diverse and representative set of examples: (1) construct a directed graph based on the cosine similarity between samples in the embedding space (e.g., using SBERT or other embedding models), and (2) start with a set of selected samples and a set of remaining samples, scoring each sample to encourage diverse selection.
Su et al. (2022) proposed a [[graph-based approach]] to select a diverse and representative set of examples: (1) construct a directed graph based on the cosine similarity between samples in the embedding space (e.g., using [[SBERT]] or other [[embedding models]]), and (2) start with a set of selected samples and a set of remaining samples, scoring each sample to encourage [[diverse selection]].


=====Embeddings via Contrastive Learning=====
=====Embeddings via Contrastive Learning=====
Rubin et al. (2022) suggested training embeddings through contrastive learning specific to one training dataset for in-context learning sample selection. This approach measures the quality of an example based on a conditioned probability assigned by the language model.
Rubin et al. (2022) suggested training embeddings through [[contrastive learning]] specific to one [[training dataset]] for in-context learning sample selection. This approach measures the quality of an example based on a conditioned probability assigned by the language model.


=====Q-Learning=====
=====Q-Learning=====
Zhang et al. (2022) explored using Q-Learning for sample selection in LLM training.
Zhang et al. (2022) explored using [[Q-Learning]] for sample selection in LLM training.


=====Uncertainty-Based Active Learning=====
=====Uncertainty-Based Active Learning=====
Diao et al. (2023) proposed identifying examples with high disagreement or entropy among multiple sampling trials based on uncertainty-based active learning. These examples can then be annotated and used in few-shot prompts.
Diao et al. (2023) proposed identifying examples with [[high disagreement]] or [[entropy]] among multiple sampling trials based on [[uncertainty-based active learning]]. These examples can then be annotated and used in few-shot prompts.


==Tips for Example Ordering==
==Tips for Example Ordering==
370

edits