==Perplexity in language models==
{{see also|language models}}
Perplexity is a metric used to evaluate the performance of a [[language model]]. It is defined as the inverse probability of the [[test set]] normalized by the number of words, or equivalently as two raised to the [[cross-entropy]], the average number of bits needed to encode one word. Perplexity can be interpreted as the [[weighted branching factor]]: a high perplexity indicates more confusion in the model's next-word predictions, while a low perplexity implies greater confidence in the model's output.
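Concretely, for a test set <math>W = w_1 w_2 \ldots w_N</math>, the two definitions coincide:

:<math>PP(W) = P(w_1 w_2 \ldots w_N)^{-\frac{1}{N}} = 2^{H(W)}, \qquad H(W) = -\frac{1}{N} \sum_{i=1}^{N} \log_2 P(w_i \mid w_1 \ldots w_{i-1})</math>

As a minimal sketch of this computation (the function and variable names here are illustrative, not taken from any particular library), perplexity can be computed from the probabilities a model assigns to each successive word of the test set:

<syntaxhighlight lang="python">
import math

def perplexity(token_probs):
    """Perplexity from the per-word probabilities a model assigned
    to the actual next words of a test set.

    Computed as 2 ** cross-entropy, where cross-entropy is the
    average number of bits needed to encode one word.
    """
    n = len(token_probs)
    cross_entropy = -sum(math.log2(p) for p in token_probs) / n
    return 2 ** cross_entropy

# A model that spreads probability uniformly over 4 candidate words
# is "perplexed" among 4 choices at each step:
print(perplexity([0.25, 0.25, 0.25, 0.25]))  # 4.0
</syntaxhighlight>

The example illustrates the branching-factor interpretation: a model that is, on average, as uncertain as a uniform choice among four words has a perplexity of exactly 4.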