GPT-2: Difference between revisions

Revision as of 18:56, 2 March 2023

75 bytes added, 2 March 2023

no edit summary

Elegant angel (57 edits)

@@ Line 1: / Line 1: @@
 {{Needs Expansion}}
-[[GPT-2]] is the second [[GPT]] [[model]], released by [[OpenAI]] in February 2019. Although it is larger than its predecessor, [[GPT-1]], its architecture is very similar. The main difference is that GPT-2 can [[multitask]]: it performs well on multiple tasks without being [[trained]] on any task-specific [[examples]]. GPT-2 demonstrated that a [[language model]] comprehends [[natural language]] better and performs better on more tasks when it is trained on a larger [[dataset]] and has more [[parameters]]; across the model sizes tested, larger models performed better. With about 1.5 billion parameters (roughly 10 times as many as GPT-1), GPT-2 exceeded the state of the art on many tasks in [[zero shot]] settings. To build the [[training data]], the authors scraped article text from the outbound links of [[Reddit]] posts that had received upvotes (at least 3 karma). This produced a large, high-quality [[dataset]] called [[WebText]]. The dataset contained 40 GB of text from over 8 million documents, far more data than GPT-1 was trained on. GPT-2 was tested on many downstream tasks, including [[translation]], [[summarisation]], [[question and answering]], and [[reading comprehension]].
+[[GPT-2]] is the second [[GPT]] [[model]], released by [[OpenAI]] in February 2019 with the [[paper]] [[Language Models are Unsupervised Multitask Learners]]. Although it is larger than its predecessor, [[GPT-1]], its architecture is very similar. The main difference is that GPT-2 can [[multitask]]: it performs well on multiple tasks without being [[trained]] on any task-specific [[examples]]. GPT-2 demonstrated that a [[language model]] comprehends [[natural language]] better and performs better on more tasks when it is trained on a larger [[dataset]] and has more [[parameters]]; across the model sizes tested, larger models performed better. With about 1.5 billion parameters (roughly 10 times as many as GPT-1), GPT-2 exceeded the state of the art on many tasks in [[zero shot]] settings. To build the [[training data]], the authors scraped article text from the outbound links of [[Reddit]] posts that had received upvotes (at least 3 karma). This produced a large, high-quality [[dataset]] called [[WebText]]. The dataset contained 40 GB of text from over 8 million documents, far more data than GPT-1 was trained on. GPT-2 was tested on many downstream tasks, including [[translation]], [[summarisation]], [[question and answering]], and [[reading comprehension]].
 [[Category:Terms]] [[Category:Artificial intelligence terms]] [[Category:Models]] [[Category:GPT]] [[Category:OpenAI]]