
GPT-2

{{Needs Expansion}}
[[GPT-2]] is the second [[GPT]] [[model]] released by [[OpenAI]], in February 2019. Although it is larger than its predecessor, [[GPT-1]], the two are architecturally very similar. The main difference is that GPT-2 can [[multitask]]: it performs well on multiple tasks without being [[trained]] on any task-specific [[examples]]. GPT-2 demonstrated that a [[language model]] trained on a larger [[dataset]] and with more [[parameters]] comprehends [[natural language]] better and performs better across a wider range of tasks; in short, the larger the language model, the better its performance. With 1.5 billion parameters (roughly ten times as many as GPT-1), GPT-2 exceeded the state-of-the-art on many tasks in [[zero shot]] settings. To build the [[training data]], the authors scraped the article text of outbound links from [[Reddit]] posts that had received at least 3 karma. This produced a large, high-quality [[dataset]] called [[WebText]], containing 40 GB of text from over 8 million documents, far more data than GPT-1 was trained on. GPT-2 was evaluated on many downstream tasks, including [[translation]], [[summarisation]], [[question and answering|question answering]], and [[reading comprehension]].
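
The zero-shot behaviour described above can be tried with the publicly released GPT-2 checkpoints. Below is a minimal sketch, assuming the Hugging Face transformers library and its 124M-parameter "gpt2" checkpoint (neither is mentioned in this article); the "TL;DR:" cue used to induce summarisation follows the prompt format described in the GPT-2 paper.

<syntaxhighlight lang="python">
# Minimal sketch of zero-shot task specification with GPT-2.
# Assumes the Hugging Face "transformers" library and the public "gpt2"
# (124M-parameter) checkpoint; these are illustrative choices, not part of
# the original article.
from transformers import pipeline, set_seed

set_seed(42)  # make the sampled continuation reproducible
generator = pipeline("text-generation", model="gpt2")

article = (
    "OpenAI released GPT-2 in February 2019. It was trained on WebText, a 40 GB "
    "corpus scraped from the outbound links of well-received Reddit posts."
)

# No fine-tuning and no task-specific examples: appending "TL;DR:" alone
# cues the model to attempt a summary of the preceding text.
prompt = article + "\nTL;DR:"

output = generator(prompt, max_new_tokens=40, do_sample=True, top_k=50)
print(output[0]["generated_text"])
</syntaxhighlight>

The same pattern (a plain-text cue at the end of the prompt, with no gradient updates) is what the article refers to as performing a task in a zero-shot setting.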