Papers: Difference between revisions
Revision as of 21:48, 5 February 2023
Important
Attention Is All You Need | arxiv:1706.03762 | Influential paper that introduced the Transformer architecture |
An Image is Worth 16x16 Words | arxiv:2010.11929 | Transformers for Image Recognition at Scale - Vision Transformer (ViT) |
Block-Recurrent Transformers | https://arxiv.org/abs/2203.07852 | |
Language Models are Few-Shot Learners | https://arxiv.org/abs/2005.14165 | GPT-3 |
Memorizing Transformers | https://arxiv.org/abs/2203.08913 | |
MobileViT | https://arxiv.org/abs/2110.02178 | Light-weight, General-purpose, and Mobile-friendly Vision Transformer |
OpenAI CLIP | https://arxiv.org/abs/2103.00020, https://openai.com/blog/clip/ | Learning Transferable Visual Models From Natural Language Supervision |
STaR | https://arxiv.org/abs/2203.14465 | Bootstrapping Reasoning With Reasoning |
Transformer-XL | https://arxiv.org/abs/1901.02860 | Attentive Language Models Beyond a Fixed-Length Context |
Others
https://arxiv.org/abs/2301.13779 (FLAME: A small language model for spreadsheet formulas) - Small language model built specifically for spreadsheet formulas, by Microsoft