{{Needs Expansion}}
[[GPT-1]] is the first [[GPT]] [[model]], released by [[OpenAI]] in June 2018 with the [[paper]] [[Improving Language Understanding by Generative Pre-Training]]. The developers found that combining the [[transformer]] [[architecture]] with [[unsupervised pretraining]] produced strong results on language understanding tasks. According to the developers, GPT-1 was then tailored to specific tasks in order to "strongly understand natural language."


GPT-1 was an important stepping stone toward a [[language model]] with general [[language-based abilities]]. It showed that language models can be efficiently [[pre-trained]], which may help them generalize. The architecture was capable of performing many [[NLP]] tasks with minimal [[fine-tuning]]. GPT-1 was trained on [[BooksCorpus]], which includes some 7,000 books. Masked [[self-attention]] in the transformer's decoder was used to train the model. The architecture followed the decoder of the original transformer and had 117 million [[parameters]]. This model was a precursor to the later, larger models with far more parameters trained on larger [[datasets]]. A notable ability was its performance on [[zero shot|zero-shot]] [[tasks]] in [[natural language processing]] (e.g., [[question-answering]] or [[sentiment analysis]]) thanks to its [[pre-training]]. [[Zero-shot learning]] means that a model can perform a task without seeing any [[examples]]. [[Zero-shot task transfer]] is when the model is given few or no examples and must instead understand the task from the [[instructions]].
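A minimal sketch of the pre-trained model in use, assuming the Hugging Face <code>transformers</code> library and its hosted <code>openai-gpt</code> checkpoint (neither is part of the original 2018 release): it loads the 117-million-parameter model and greedily predicts one next token from a prompt.

<syntaxhighlight lang="python">
# Sketch only: assumes the "transformers" and "torch" packages and the
# community-hosted "openai-gpt" checkpoint, not OpenAI's original code.
import torch
from transformers import OpenAIGPTTokenizer, OpenAIGPTLMHeadModel

tokenizer = OpenAIGPTTokenizer.from_pretrained("openai-gpt")
model = OpenAIGPTLMHeadModel.from_pretrained("openai-gpt")

# Roughly 117 million parameters, matching the figure cited above.
print(sum(p.numel() for p in model.parameters()))

# Forward pass over a lowercase prompt (the GPT-1 tokenizer lowercases text).
inputs = tokenizer("the book was about", return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits      # shape: (batch, sequence, vocab)

next_id = logits[0, -1].argmax().item()  # greedy choice of the next token
print(tokenizer.decode([next_id]))
</syntaxhighlight>

Zero-shot use amounts to prompting this same pre-trained model directly, without any task-specific fine-tuning step.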


[[Category:Terms]] [[Category:Artificial intelligence terms]] [[Category:Models]] [[Category:GPT]] [[Category:OpenAI]]