Papers

Revision as of 21:49, 5 February 2023 by Alpha5

Important

Name Date Source Note
Attention Is All You Need arxiv:1706.03762 Influential paper that introduced the Transformer architecture
An Image is Worth 16x16 Words arxiv:2010.11929 Transformers for Image Recognition at Scale - Vision Transformer (ViT)
Block-Recurrent Transformers https://arxiv.org/abs/2203.07852
Language Models are Few-Shot Learners https://arxiv.org/abs/2005.14165 GPT-3
Memorizing Transformers https://arxiv.org/abs/2203.08913
MobileViT https://arxiv.org/abs/2110.02178 Light-weight, General-purpose, and Mobile-friendly Vision Transformer
OpenAI CLIP https://arxiv.org/abs/2103.00020, https://openai.com/blog/clip/ Learning Transferable Visual Models From Natural Language Supervision
STaR https://arxiv.org/abs/2203.14465 Bootstrapping Reasoning With Reasoning
Transformer-XL https://arxiv.org/abs/1901.02860 Attentive Language Models Beyond a Fixed-Length Context

Others

https://arxiv.org/abs/2301.13779 (FLAME: A small language model for spreadsheet formulas) - Small model built specifically for spreadsheet formulas, by Microsoft