To [[How to write articles to be undetectable by AI content detectors|keep articles from being flagged]] as [[AI generated content]], the text should have high [[perplexity]].
[[Perplexity]] measures how much information a text carries, that is, how unpredictable it is. Natural language is highly redundant, which lets us communicate effectively even in noisy environments and means that some information can be lost without compromising overall understanding. However, stylistic devices such as alliteration, which add a layer of meaning, can raise perplexity in ways that AI may not be able to recognize.
Because AI is based on probabilistic models, it struggles with text whose wording varies widely in specificity, such as legal documents that require precise use of specific keywords and phrases.
Generative AI for text operates by predicting the most likely word to follow the preceding words. In a text with perfect (maximal) perplexity, the AI's predictions would always be wrong.
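To make next-word prediction concrete, here is a minimal sketch using a toy bigram model, which looks only at the single preceding word. Real generative models condition on far more context; the function names <code>train_bigram</code> and <code>predict_next</code> and the tiny corpus are illustrative assumptions, not part of any particular library.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each word follows each other word."""
    follow = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follow[prev][nxt] += 1
    return follow


def predict_next(follow, word):
    """Return the most frequent follower of `word`, or None if `word` was never seen."""
    if word not in follow:
        return None
    return follow[word].most_common(1)[0][0]


# Hypothetical toy corpus, used only for illustration.
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

print(predict_next(model, "the"))    # "cat" follows "the" twice, "mat" once
print(predict_next(model, "zebra"))  # unseen word: no prediction
```

A text whose next word is rarely the one such a model ranks first is, from the model's point of view, a high-perplexity text.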
==Perplexity in language models==
{{see also|language models}}
Perplexity is a metric used to evaluate the performance of a [[language model]]. It is defined as the inverse probability of the test set normalized by the number of words, or equivalently as 2 raised to the [[cross-entropy]], where the cross-entropy is the average number of bits needed to encode one word. Perplexity can be interpreted as the weighted branching factor, with a higher perplexity indicating more confusion in the model's prediction of the next word.
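The definition above can be sketched in a few lines of Python. This is a minimal illustration that assumes we already have the probability a model assigned to each word that actually occurred in the test text; the function name <code>perplexity</code> and the sample probabilities are hypothetical.

```python
import math


def perplexity(token_probs):
    """Perplexity from the probabilities a model assigned to each observed word."""
    if not token_probs:
        raise ValueError("need at least one probability")
    # Cross-entropy in bits per word: the average of -log2 p(word | context).
    cross_entropy = -sum(math.log2(p) for p in token_probs) / len(token_probs)
    # Perplexity is 2 raised to the cross-entropy.
    return 2 ** cross_entropy


# A model that is always choosing uniformly among 4 words has perplexity 4,
# matching the "branching factor" interpretation.
print(perplexity([0.25, 0.25, 0.25, 0.25]))

# Lower probabilities on the observed words mean higher perplexity.
print(perplexity([0.5, 0.01, 0.2, 0.05]))
```

Because the result is the geometric mean of the inverse probabilities, a single very unlikely word raises the perplexity of the whole text noticeably.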