Tokens

*[https://platform.openai.com/tokenizer OpenAI's interactive Tokenizer tool]
*[https://github.com/openai/tiktoken Tiktoken], a fast BPE tokenizer specifically for OpenAI models (see the usage sketch after this list)
*[[Transformers]] package for Python
*[https://www.npmjs.com/package/gpt-3-encoder gpt-3-encoder package for Node.js]
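Tiktoken, listed above, can be called from Python to count and inspect tokens before sending text to a model. The snippet below is a minimal sketch: it assumes the <code>tiktoken</code> package is installed and uses the <code>cl100k_base</code> encoding (used by recent OpenAI chat models); the same calls work with the older GPT-3 encodings.

<syntaxhighlight lang="python">
# Minimal sketch: counting and inspecting tokens with tiktoken.
# Assumes the package is installed (pip install tiktoken).
import tiktoken

# cl100k_base is the encoding used by recent OpenAI chat models;
# older GPT-3 models use r50k_base / p50k_base instead.
encoding = tiktoken.get_encoding("cl100k_base")

text = "Tokens are the chunks of text a language model actually reads."
token_ids = encoding.encode(text)

print(token_ids)                    # the integer IDs the model sees
print(len(token_ids))               # how many tokens the text costs
print(encoding.decode(token_ids))   # decoding round-trips to the original text
</syntaxhighlight>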
*Uppercase at the beginning of a sentence: "Red" (token: "7738")
*The more likely or common a token is, the lower the token number assigned to it. For example, the period maps to the same low token ID ("13") in all three sentences because it is extremely common throughout the corpus data (see the sketch after the gallery below).
<gallery mode=packed>
File:tokens3.png
File:tokens2.png
File:tokens1.png
</gallery>
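This behaviour can be checked directly with a tokenizer library. The sketch below is illustrative only: it uses Tiktoken's <code>r50k_base</code> (GPT-3 era) encoding and three made-up sentences rather than the examples pictured above, and the exact ID numbers it prints depend on the encoding chosen, so they may differ from the values quoted from the OpenAI Tokenizer tool.

<syntaxhighlight lang="python">
# Illustrative sketch: how casing and position change which token IDs appear.
# Exact IDs depend on the encoding; the values quoted above came from the
# OpenAI Tokenizer tool and may not match other encodings.
import tiktoken

encoding = tiktoken.get_encoding("r50k_base")  # GPT-3 era BPE vocabulary

sentences = [
    "Red is my favorite color.",
    "My favorite color is red.",
    "MY FAVORITE COLOR IS RED.",
]

for sentence in sentences:
    ids = encoding.encode(sentence)
    # Decode each ID individually to see which piece of text it covers,
    # e.g. "Red", " red" and " RED" map to different vocabulary entries.
    pieces = [encoding.decode([i]) for i in ids]
    print(list(zip(pieces, ids)))
</syntaxhighlight>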


==Prompt Design and Token Knowledge==