GPT-2
{{Needs Expansion}}
[[GPT-2]] is the second [[GPT]] [[model]] released by [[OpenAI]], introduced in February 2019 with the [[paper]] [[Language Models are Unsupervised Multitask Learners]]. Although it is larger than its predecessor, [[GPT-1]], the architecture is very similar. The main difference is that GPT-2 can [[multitask]]: it performs well on multiple tasks without being [[trained]] on task-specific [[examples]]. GPT-2 demonstrated that a [[language model]] trained on a larger [[dataset]] and with more [[parameters]] comprehends [[natural language]] better and performs better across more tasks; the larger the language model, the better its performance. With 1.5 billion parameters (about ten times more than GPT-1), GPT-2 exceeded the state of the art on many tasks in [[zero shot|zero-shot]] settings. To build the [[training data]], the authors scraped article text from the outbound links of [[Reddit]] posts that had received upvotes, producing a large, high-quality [[dataset]] called [[WebText]]. The dataset contained 40 GB of text from over 8 million documents, far more data than GPT-1 was trained on. GPT-2 was evaluated on many downstream tasks, including [[translation]], [[summarisation]], [[question and answering|question answering]], and [[reading comprehension]].
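The zero-shot behaviour described above can be illustrated with a minimal sketch, assuming the Hugging Face <code>transformers</code> library and its publicly hosted <code>gpt2</code> checkpoint (neither is mentioned in this article): the summarisation task is signalled only by appending the "TL;DR:" prompt used in the paper, with no fine-tuning or labelled examples.

<syntaxhighlight lang="python">
from transformers import pipeline

# Load the public "gpt2" checkpoint (the 124M-parameter release);
# the full 1.5B-parameter model is published as "gpt2-xl".
generator = pipeline("text-generation", model="gpt2")

article = (
    "OpenAI released GPT-2 in February 2019. The model has 1.5 billion "
    "parameters and was trained on the 40 GB WebText dataset scraped from "
    "outbound Reddit links."
)

# Zero-shot summarisation: the paper elicits summaries by appending "TL;DR:"
# to the input text, relying on the prompt alone to specify the task.
prompt = article + "\nTL;DR:"
result = generator(prompt, max_new_tokens=30, do_sample=False)
print(result[0]["generated_text"])
</syntaxhighlight>

The same pattern (task framed entirely in the prompt) is how the paper probes translation, question answering, and reading comprehension without task-specific training.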
[[Category:Terms]] [[Category:Artificial intelligence terms]] [[Category:Models]] [[Category:GPT]] [[Category:OpenAI]] |