Papers: Difference between revisions

73 bytes added, 5 February 2023
no edit summary
Line 4:


'''[[An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale]]''' - https://arxiv.org/abs/2010.11929 - [[Vision Transformer]] ([[ViT]])
'''[[Memorizing Transformers]]''' - https://arxiv.org/abs/2203.08913 -


'''[[OpenAI CLIP]]''' - https://arxiv.org/abs/2103.00020, https://openai.com/blog/clip/ - Learning Transferable Visual Models From Natural Language Supervision


'''[[Transformer-XL]]''' - https://arxiv.org/abs/1901.02860 - Attentive Language Models Beyond a Fixed-Length Context