GPT

From AI Wiki

GPT API

Documentation and Guide for GPT API

Main GPT Models

{| class="wikitable sortable"
|-
! rowspan="2" | Model
! rowspan="2" | Release Date
! rowspan="2" | Parameters
! colspan="3" | Context Window
! colspan="2" | Training Data
! rowspan="2" | Open Source
! rowspan="2" | Paper
|-
! Tokens
! Words
! Equivalent
! Amount
! Dataset
|-
| '''[[GPT-1]]''' || June 11, 2018 || 117 Million || 512 || 358 || <1 page || 4.5 GB || [[BookCorpus]] || Yes || [[Improving Language Understanding by Generative Pre-Training]]
|-
| '''[[GPT-2]]''' || February 14, 2019 || 1.5 Billion || 1,024 || 716 || 1.5 pages || 40 GB || [[WebText]] || Yes || [[Language Models are Unsupervised Multitask Learners]]
|-
| '''[[GPT-3]]''' || June 11, 2020 || 175 Billion || 2,048 || 1,433 || 3 pages || 570 GB || [[Common Crawl]], [[WebText2]], [[Books1]], [[Books2]], [[Wikipedia]] || No || [[Language Models are Few-Shot Learners]]
|-
| '''[[GPT-3.5]]''' || March 15, 2022 || 175 Billion || 4,000 || 2,800 || 6 pages || 570 GB || [[Common Crawl]], [[WebText2]], [[Books1]], [[Books2]], [[Wikipedia]] || No ||
|-
| '''[[ChatGPT]]''' || November 30, 2022 || 175 Billion || 4,096 || 2,867 || 6 pages || 570 GB || [[Common Crawl]], [[WebText2]], [[Books1]], [[Books2]], [[Wikipedia]] || No ||
|-
| '''[[GPT-4]]'''<br>(8K context) || March 14, 2023 || Undisclosed || 8,000 || 5,600 || 12 pages || || || No ||
|-
| '''[[GPT-4]]'''<br>(32K context) || March 14, 2023 || Undisclosed || 32,000 || 22,400 || 50 pages || || || No ||
|}
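
The Words and Equivalent columns are approximate conversions from the token count: every row works out to roughly 0.7 words per token, a common rule of thumb for English text. A quick sketch of that conversion follows; the 0.7 factor is inferred from the table itself, not an official figure.

<syntaxhighlight lang="python">
# Approximate token-to-word conversion that reproduces the "Words" column above.
# The factor of 0.7 words per token is inferred from the table rows, not an
# official OpenAI figure; integer arithmetic avoids floating-point rounding.
def tokens_to_words(tokens: int) -> int:
    return tokens * 7 // 10  # multiply by 0.7 and truncate

for model, tokens in [("GPT-1", 512), ("GPT-2", 1024), ("GPT-3", 2048),
                      ("ChatGPT", 4096), ("GPT-4 (32K context)", 32000)]:
    print(f"{model}: {tokens} tokens ≈ {tokens_to_words(tokens)} words")
</syntaxhighlight>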

What is GPT

GPT, which stands for Generative Pre-trained Transformer, is a type of language model developed by OpenAI. It is based on the Transformer architecture and trained with unsupervised learning, which allows it to generate text that is often difficult to distinguish from text written by humans.

How does GPT work?

Using unsupervised learning, GPT generates text by repeatedly predicting the next word (more precisely, the next token) in a sequence, based on the context of the words that came before it. The model is trained on a large corpus of text, such as web crawls, books, and Wikipedia articles, to learn the patterns and relationships between words and how they are used in context.
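
As a concrete illustration of this next-word prediction loop, here is a minimal sketch that runs greedy decoding with the openly released GPT-2 weights via the Hugging Face transformers library; the library choice, prompt, and greedy decoding are assumptions for the example, not a description of OpenAI's production setup.

<syntaxhighlight lang="python">
# Minimal autoregressive generation sketch: predict one token at a time and
# feed the growing sequence back into the model (greedy decoding).
# Uses the open-source GPT-2 weights as an illustrative stand-in.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

input_ids = tokenizer.encode("GPT stands for", return_tensors="pt")

for _ in range(10):  # generate ten tokens
    with torch.no_grad():
        logits = model(input_ids).logits        # shape: (1, seq_len, vocab_size)
    next_id = logits[0, -1].argmax()            # most probable next token
    input_ids = torch.cat([input_ids, next_id.view(1, 1)], dim=1)

print(tokenizer.decode(input_ids[0]))
</syntaxhighlight>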

The Transformer architecture is used in GPT because it is well suited to processing sequential data such as text. Its core building block is the self-attention mechanism, which lets the model weigh how relevant every other word in a sequence is when processing a given word and thereby learn the relationships between them. In GPT the attention is masked ("causal"), so each position can only attend to itself and the positions before it.
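
Below is a toy sketch of that masked scaled dot-product self-attention; the dimensions and random weights are illustrative only, not GPT's actual configuration.

<syntaxhighlight lang="python">
# Toy masked (causal) self-attention, the core Transformer operation described
# above. Sizes and weights are illustrative; real GPT models use many attention
# heads and layers with learned parameters.
import numpy as np

def causal_self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d) token vectors; w_q/w_k/w_v: (d, d) projection matrices."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])          # pairwise attention scores
    mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
    scores[mask] = -np.inf                           # block attention to future positions
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ v                               # each output mixes earlier tokens' values

rng = np.random.default_rng(0)
seq_len, d = 4, 8
x = rng.normal(size=(seq_len, d))
out = causal_self_attention(x, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)  # (4, 8): one attended vector per input position
</syntaxhighlight>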

Once the model is trained, it can be used for a variety of natural language processing (NLP) tasks, such as text generation, translation, and summarization. The model can also be fine-tuned for specific tasks by training it on smaller, task-specific datasets.
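
A hedged sketch of that fine-tuning step, again using the open GPT-2 weights and the Hugging Face Trainer; the corpus file name and hyperparameters are hypothetical placeholders, not values from this article.

<syntaxhighlight lang="python">
# Sketch of task-specific fine-tuning: continue training GPT-2 on a small,
# task-specific text corpus. "my_task_corpus.txt" and the hyperparameters are
# hypothetical placeholders.
from transformers import (DataCollatorForLanguageModeling, GPT2LMHeadModel,
                          GPT2Tokenizer, TextDataset, Trainer, TrainingArguments)

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

train_data = TextDataset(tokenizer=tokenizer,
                         file_path="my_task_corpus.txt",  # hypothetical dataset
                         block_size=128)
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="gpt2-finetuned",
                           num_train_epochs=1,
                           per_device_train_batch_size=4),
    data_collator=collator,
    train_dataset=train_data,
)
trainer.train()
</syntaxhighlight>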

Applications of GPT

GPT has a wide range of applications in the field of NLP. Some of the most notable include text generation, machine translation, summarization, question answering, and conversational assistants such as ChatGPT.

Explain Like I'm 5 (ELI5)

GPT is a computer program that can write sentences like a person. It learned how to do this by reading lots of text on the internet. Now it can write new sentences and answer questions by using what it learned. It's like a smart robot that can talk and write in a way that sounds like a person!