GPT-2: Difference between revisions

Revision as of 18:29, 2 March 2023


GPT-2 is the second GPT model from OpenAI, released in February 2019. It is larger than its predecessor, GPT-1, but otherwise very similar in design. The main difference is that GPT-2 can multitask: it performs well on multiple tasks without being trained on any task-specific examples. GPT-2 demonstrated that a language model comprehends natural language better, and performs better on more tasks, when it is trained on a larger dataset and has more parameters; in general, the larger the model, the better it performs. With 1.5 billion parameters (more than 10 times as many as GPT-1), GPT-2 exceeded the state of the art on many language modeling benchmarks in zero-shot settings. To build the training data, the authors scraped the text of web pages reached through outbound links in Reddit posts that had received at least 3 karma. This produced a large, high-quality, and diverse dataset called WebText. The dataset contained 40 GB of text from over 8 million documents, far more data than GPT-1 was trained on. GPT-2 was tested on many downstream tasks, including translation, summarisation, question answering, and reading comprehension.
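
The link-selection step behind WebText can be illustrated with a minimal sketch. This is not the authors' pipeline: the `posts` records, the `select_outbound_links` function, and the field names `url` and `karma` are hypothetical, and the default threshold of 3 karma follows the figure reported for WebText. The sketch keeps outbound (non-Reddit) links from sufficiently upvoted posts and drops duplicate URLs.

```python
from urllib.parse import urlparse


def select_outbound_links(posts, min_karma=3):
    """Return unique outbound URLs from posts with at least `min_karma` karma.

    `posts` is a hypothetical list of dicts with "url" and "karma" keys.
    """
    seen = set()
    selected = []
    for post in posts:
        url = post["url"]
        host = urlparse(url).netloc.lower()
        # Links that point back to Reddit itself are not outbound.
        is_reddit = host == "reddit.com" or host.endswith(".reddit.com")
        if post["karma"] < min_karma or is_reddit or url in seen:
            continue
        seen.add(url)
        selected.append(url)
    return selected


# Illustrative data only; these are not real WebText entries.
posts = [
    {"url": "https://example.com/a", "karma": 12},
    {"url": "https://example.com/b", "karma": 1},
    {"url": "https://www.reddit.com/r/x/comments/1", "karma": 50},
    {"url": "https://example.com/a", "karma": 7},
    {"url": "https://example.org/c", "karma": 3},
]
print(select_outbound_links(posts))
```

Using upvotes as the filter treats human curation as a rough proxy for page quality; a real pipeline would also need to download each page, extract its text, and deduplicate at the document level.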

@@ Line 1: / Line 1: @@
 {{Needs Expansion}}
 [[GPT-2]] is the second [[GPT]] [[model]] from [[OpenAI]], released in February 2019. It is larger than its predecessor, [[GPT-1]], but otherwise very similar in design. The main difference is that GPT-2 can [[multitask]]: it performs well on multiple tasks without being [[trained]] on any task-specific [[examples]]. GPT-2 demonstrated that a [[language model]] comprehends [[natural language]] better, and performs better on more tasks, when it is trained on a larger [[dataset]] and has more [[parameters]]; in general, the larger the model, the better it performs. With 1.5 billion parameters (more than 10 times as many as GPT-1), GPT-2 exceeded the state of the art on many language modeling benchmarks in [[zero shot|zero-shot]] settings. To build the [[training data]], the authors scraped the text of web pages reached through outbound links in [[Reddit]] posts that had received at least 3 karma. This produced a large, high-quality, and diverse [[dataset]] called [[WebText]]. The dataset contained 40 GB of text from over 8 million documents, far more data than GPT-1 was trained on. GPT-2 was tested on many downstream tasks, including [[translation]], [[summarisation]], [[question and answering|question answering]], and [[reading comprehension]].
+[[Category:Terms]] [[Category:Artificial intelligence terms]] [[Category:Models]] [[Category:GPT]] [[Category:OpenAI]]