See also: Guides, ChatGPT Guides and Prompt Engineering Guides
The perplexity, burstiness, professionalism, randomness, and sentimentality guide is a prompt engineering pattern that asks ChatGPT to write according to five style sliders, each rated 1 to 10, so a user can dial the tone and texture of the output up or down. Despite the name, it is a prompt template, not real model fine-tuning: no weights change and no training data is added, and the word "fine-tune" is used loosely. The user simply supplies a numbered set of style instructions and asks the model to honor them on each turn.
This pattern circulated widely on YouTube and X in early 2023 as users tried to make AI output sound more varied and "human." Two of the five sliders, perplexity and burstiness, borrow the names of metrics used by AI text detectors such as GPTZero and Originality.ai. The other three, professionalism, randomness, and sentimentality, are informal style knobs with no formal measurement behind them. It is worth stating clearly that AI text detection is widely documented as unreliable and prone to false positives, so this guide presents the concepts as a way to steer writing style, not as a reliable way to evade detection.
What does this prompt actually do?
The prompt declares a scale from 1 to 10 for each of the five parameters, sets a default value of 5, and asks the model to list its current settings at the top of every reply. After the parameter list the user includes a confirmation line such as "Please confirm that you understand the parameters of this prompt by responding with 'I understand.'" Once the model confirms, the user can ask for an essay, a product description, a story, or any other piece of writing, and the model is supposed to honor the parameter values in its output.
In practice the technique works because large language models follow system-style instructions reasonably well within a single conversation. They do not always hit the requested numbers precisely, especially at the extremes, but they do shift their style in the requested direction. A perplexity setting of 9 makes the assistant reach for harder vocabulary. A burstiness setting of 9 makes it mix one-word sentences with rolling, clause-stuffed ones. A sentimentality setting of 1 strips out feeling words. The table below summarizes the five sliders and what each one controls.
| Parameter | What it controls | A real metric? | Low setting (1-2) | High setting (9-10) |
|---|
| Perplexity | Predictability of word choice | Yes, used by detectors | Short, common, predictable words | Rare vocabulary, unusual phrasing |
| Burstiness | Variation in sentence length and structure | Yes, used by detectors | Uniform sentence rhythm | Short fragments next to long clauses |
| Professionalism | Register or formality | No, informal style knob | Slang, contractions, casual filler | Formal, distant, white-paper tone |
| Randomness | Apparent unpredictability of ideas | No, mimics sampling temperature | Focused, on topic | Drifting, surprising, possibly incoherent |
| Sentimentality | Emotional warmth | No, informal style knob | Neutral, reportorial | Feeling words, metaphor, first-person reaction |
What is perplexity in writing?
Perplexity is a real term from natural language processing that measures how well a language model predicts a sequence of text. For a model and a sequence of tokens, perplexity is the exponentiated average negative log-likelihood of that sequence. In plain English, it measures how surprised the model is by the text: a sequence the model finds very likely has low perplexity, while a sequence full of rare words and odd transitions has high perplexity. Lower perplexity means more predictable text. The formula given by Hugging Face's documentation is PPL(X) = exp(-1/t sum log p(x_i | x_<i)).[5]
AI detectors built on top of this idea, including GPTZero, treat low perplexity as a signal of machine authorship.[1] The reasoning is that a model trained to maximize probability tends to produce text that other models also find probable, so AI text tends to be "low-perplexity." Human writers reach for slang, unusual word choices, and idioms that any single model would not have predicted, so their average perplexity sits higher. This is an intuition, not a guarantee, and as the limitations section below explains it misfires often in both directions.
When the prompt sets perplexity to 1, the assistant writes with short common words and predictable phrasing. At 10 it pulls in latinate vocabulary, uncommon idioms, and longer noun phrases. There is no guarantee the resulting score will actually rise on any given detector, only that the surface style will look more varied.
What is burstiness?
Burstiness, in this context, is the variation and clustering in sentence length and structure. The term has two roots in computer science. In information retrieval it dates to work by Slava Katz in the mid 1990s on the observation that once a word appears in a document it is more likely to appear again. In streaming and topic detection it traces to Jon Kleinberg's 2003 paper "Bursty and Hierarchical Structure in Streams," which models sudden spikes in word frequency.[7]
GPTZero borrowed the word for a related but distinct purpose: it looks at how much perplexity varies across the sentences of a passage.[1] A long passage of equally surprising sentences is treated as low burstiness. A passage that mixes a low-perplexity opener with a high-perplexity aside is treated as high burstiness. According to GPTZero, burstiness is the metric that lets the tool reason about long-form context rather than only individual sentences.[2]
Human writing tends to be bursty because writers slow down for setup, speed up for emphasis, and break rhythm on purpose. AI writing tends to be flatter because models trained on cross-entropy loss converge on a comfortable middle band. Asking ChatGPT for a burstiness of 9 nudges it toward the human pattern: short sentences next to long ones, sometimes a one-word fragment, sometimes a clause that runs for half a paragraph.
What is the professionalism setting?
Professionalism is not a detection metric. It is an informal style axis, similar to formality in linguistics. At a setting of 1 the assistant uses contractions, slang, hedge words, and conversational filler. At 10 it removes contractions, prefers passive constructions for distance, and reaches for the kind of vocabulary that appears in white papers and regulatory filings. The setting effectively chooses where on the register scale the output should sit.
This parameter overlaps with the system prompt idea of giving the model a persona. Asking for professionalism 10 is roughly the same as telling the model to write like a corporate lawyer. Asking for professionalism 2 is roughly the same as telling it to write like a friend texting on a Saturday afternoon. The numbered scale just makes it easier for users to dial the register up or down without rewriting the persona each time.
What does the randomness setting do?
Randomness is the trickiest of the five because it collides with how the OpenAI API already works. The chat models expose two real sampling controls, temperature and top-p. Temperature scales the logits before softmax and ranges from 0 to 2 by default. Higher temperature flattens the probability distribution and produces more varied output. Top-p, also called nucleus sampling, restricts the next-token draw to the smallest set of tokens whose cumulative probability exceeds p. Lower top-p means more focused output.[8]
The randomness slider in this prompt does not actually change temperature or top-p. The model has no way to alter its own sampling settings mid-conversation. What it does instead is mimic the perceived effect of higher temperature: it picks less obvious next words, swerves into unexpected topics, and lets sentences drift. At a setting of 10 the prompt typically warns that output may be "almost completely random and nonsensical," which matches what real temperature 2 output tends to look like.
For users who have access to the API rather than the consumer ChatGPT interface, the more reliable way to get the randomness effect is to set temperature directly. Useful values for creative writing tend to sit between 0.7 and 1.2. Anything above 1.5 produces text that drifts off topic and contradicts itself.
What is the sentimentality setting?
Sentimentality is the emotional dial, and like professionalism and randomness it is an informal style knob rather than a measured score. At 1 the output is neutral and reportorial. At 10 it leans on adjectives of feeling, includes first-person reactions, and reaches for metaphors of warmth, loss, and longing. This parameter draws on sentiment analysis intuition without using a formal sentiment score. The model knows what emotionally loaded writing looks like because it has seen plenty of it in training data, so the slider works through stylistic mimicry rather than measurement.
Users writing marketing copy often set sentimentality to 6 or 7 to add warmth without melodrama. Users writing technical documentation usually set it to 1 or 2 to keep the prose factual. Users writing fiction or personal essays push it higher.
How do you prompt ChatGPT for these styles?
The most common version of the prompt looks something like this in spirit, though specific wording varies across the YouTube tutorials that popularized it:
In order to generate a text response, please adhere to the following parameters.
Each parameter is set on a scale from 1 to 10, where a higher value represents
more of the specified attribute. Include the current values at the top of your
response in a bulleted list before the actual text.
Perplexity: (1-10)
Burstiness: (1-10)
Professionalism: (1-10)
Randomness: (1-10)
Sentimentality: (1-10)
Default each parameter to 5. Confirm that you understand by responding
with "I understand." Then wait for the topic.
After the confirmation, the user supplies the actual request. Example: "Write a 400 word blog introduction about home espresso machines. Perplexity 8, burstiness 9, professionalism 4, randomness 5, sentimentality 6." The assistant then produces a draft with those settings echoed at the top of the reply. Because the model tends to drift back toward its default style over a long exchange, many workflow guides recommend pasting the parameter values at the start of every request rather than relying on the model to remember them.
Why did this prompt get popular?
GPTZero went public in January 2023, built by Princeton undergraduate Edward Tian, who tweeted a beta on January 2 and launched the tool publicly around January 3.[3] Within its first week the tool logged about 30,000 uses and crashed once before its hosting platform gave it more capacity.[3] By July 2024 GPTZero reported around 4 million users, up from roughly 1 million a year earlier.[3] As schools and editors started pasting student work and freelance submissions into the checker, students and content writers began looking for ways to keep their AI-assisted drafts from being flagged.
The perplexity and burstiness prompt pattern was the first widely shared answer. It made intuitive sense because it named the same metrics the detector was using. Early 2023 YouTube tutorials from creators in the SEO and AI writing space spread the template fast, and copycat prompts spawned variations that added randomness, sentimentality, professionalism, sarcasm, formality, and other style axes.
Do AI detectors actually work?
Whether any prompt reliably defeats detectors is the wrong question, because the detectors themselves are widely documented as unreliable. OpenAI offers the clearest admission. The company launched its own AI Text Classifier on January 31, 2023, then quietly discontinued it on July 20, 2023, citing a "low rate of accuracy."[10][11] In its launch evaluation the classifier correctly flagged only 26 percent of AI-written text as "likely AI-generated" while mislabeling 9 percent of human text as AI.[12] OpenAI's closing note stated, "We are working to incorporate feedback and are currently researching more effective provenance techniques for text."[10]
Independent research reaches similar conclusions. A 2023 Stanford study led by Weixin Liang, published in Patterns, ran seven widely used GPT detectors over 91 TOEFL essays written by non-native English speakers and 88 essays by US eighth graders. The detectors classified the US student essays accurately but mislabeled the non-native essays as AI-generated at an average false positive rate of 61.3 percent, and all seven flagged at least one human-written TOEFL essay.[9] The authors attribute the bias to the lower text perplexity of non-native writing, the very signal the detectors rely on. The same study found that asking ChatGPT to enrich the vocabulary of those essays cut the false positive rate from about 61 percent to roughly 12 percent, which shows how easily the perplexity signal can be moved in either direction.[9]
The practical takeaway is that detectors miss in both directions. A well-tuned prompt can produce text that passes them, and a perfectly genuine human draft can also fail them. Acting on a detector's verdict, especially to accuse a writer of cheating, is not supported by the documented accuracy of these tools.
What are the limitations of this technique?
The biggest limitation is that the model does not actually measure perplexity or burstiness on its own output. It estimates the style it thinks would produce those scores. A user who pastes the result into a detector may find that the score moved only slightly. The shift in surface style is real, but the underlying token probabilities still come from the same base model, and a fine-tuned classifier can still recognize patterns.
A second limitation is consistency. Across a long document the model tends to drift back toward its comfortable middle band. Users who run the prompt across many turns often have to remind the assistant of the current parameter values every few exchanges.
A third limitation is that the most aggressive settings produce writing that humans also find odd. Perplexity 10 reads like a thesaurus with no taste. Burstiness 10 reads like a draft no one has revised. Randomness 9 produces sentences that contradict each other. The technique is most useful in the middle of the dial, with one or two parameters pushed up a notch, rather than at the extremes.
How does this relate to actual fine-tuning?
Real fine-tuning, in the sense OpenAI uses the term, involves additional training on a curated dataset to shift the model's weights toward a target style or domain. It changes how the model behaves at the parameter level, not just within a conversation. The five-parameter prompt does none of that. It is a style instruction, similar to telling the model to write in the voice of a particular author.
For users who genuinely need durable style control, the OpenAI fine-tuning API lets developers train a custom model on input-output pairs. A few hundred examples of the target voice is often enough to bias the model in that direction. The prompt-based approach is cheaper and instant, but it lives and dies with each conversation. Closing the chat resets the parameters, and a long enough document will start to slip out of the requested style.
ELI5
Imagine ChatGPT is a kid telling a story, and you have five dials you can turn. One dial makes it use fancy, surprising words instead of plain ones. One dial makes it mix tiny sentences with really long ones so the story does not sound flat. One dial makes it sound like a serious grown-up in a suit or a friend goofing around. One dial makes it wander off into weird unexpected ideas. And one dial makes it sappy and full of feelings or cold and just-the-facts. You set each dial from 1 to 10 and the kid tries to tell the story that way. The dials do not change the kid's brain, they just tell the kid how to act. And the gadgets that try to guess whether a kid or a robot wrote something are not very good at it, so you should not trust them too much.
takeaways
The five-parameter prompt is a useful shorthand for getting ChatGPT to write in a more controlled style. Perplexity and burstiness are the two parameters whose names come from real detector metrics; professionalism, randomness, and sentimentality are informal style axes the model handles through mimicry. The technique nudges output in the requested direction but does not literally measure or guarantee any numerical score. Used in moderation it produces livelier, more varied prose. Used at the extremes it produces text that is harder for humans to read. And because AI text detectors are documented as unreliable, this guide is best understood as a way to control writing style, not as a way to reliably beat a detector.
See also
References
- GPTZero, "What is perplexity & burstiness for AI detection?" gptzero.me/news/perplexity-and-burstiness-what-is-it/
- GPTZero, "How do AI detectors work?" gptzero.me/news/how-ai-detectors-work/
- Wikipedia, "GPTZero," en.wikipedia.org/wiki/GPTZero
- Wikipedia, "Perplexity," en.wikipedia.org/wiki/Perplexity
- Hugging Face Transformers documentation, "Perplexity of fixed-length models," huggingface.co/docs/transformers/perplexity
- Originality.ai, "Perplexity and Burstiness in Writing," originality.ai/blog/perplexity-and-burstiness-in-writing
- Jon Kleinberg, "Bursty and Hierarchical Structure in Streams," Data Mining and Knowledge Discovery 7, 373-397 (2003)
- OpenAI Developer Community, "Cheat Sheet: Mastering Temperature and Top_p in ChatGPT API," community.openai.com
- Weixin Liang et al., "GPT detectors are biased against non-native English writers," Patterns 4(7), 100779 (2023), cell.com/patterns/fulltext/S2666-3899(23)00130-7
- OpenAI, "New AI classifier for indicating AI-written text" (with July 20, 2023 discontinuation note), openai.com/index/new-ai-classifier-for-indicating-ai-written-text/
- TechCrunch, "OpenAI scuttles AI-written text detector over 'low rate of accuracy,'" July 25, 2023, techcrunch.com/2023/07/25/openai-scuttles-ai-written-text-detector-over-low-rate-of-accuracy/
- The Register, "OpenAI pulls AI text detector due to 'low rate of accuracy,'" July 26, 2023, theregister.com/2023/07/26/openai_pulls_ai_text_detector/