Interface administrators, Administrators (Semantic MediaWiki), Curators (Semantic MediaWiki), Editors (Semantic MediaWiki), Suppressors, Administrators
7,785
edits
No edit summary |
No edit summary |
||
Line 52: | Line 52: | ||
|[[An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT)]] || 2020/10/22 || [[arxiv:2010.11929]]<br>[https://github.com/google-research/vision_transformer GitHub] || [[Computer Vision]] || [[Google]] || [[Vision Transformer]] ([[ViT]]) | |[[An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT)]] || 2020/10/22 || [[arxiv:2010.11929]]<br>[https://github.com/google-research/vision_transformer GitHub] || [[Computer Vision]] || [[Google]] || [[Vision Transformer]] ([[ViT]]) | ||
|- | |- | ||
|[[Learning Transferable Visual Models From Natural Language Supervision (CLIP)]] || 2021/02/26 || [[arxiv:2103.00020]]<br>[https://openai.com/blog/clip/ Blog Post] || [[Computer Vision]] || [[OpenAI]] || [[CLIP]] | |[[Learning Transferable Visual Models From Natural Language Supervision (CLIP)]] || 2021/02/26 || [[arxiv:2103.00020]]<br>[https://openai.com/blog/clip/ Blog Post] || [[Computer Vision]] || [[OpenAI]] || [[CLIP]] ([[Contrastive Language-Image Pre-Training]]) | ||
|- | |- | ||
|[[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer]] || 2021/10/05 || [[arxiv:2110.02178]]<br>[https://github.com/apple/ml-cvnets GitHub] || [[Computer Vision]] || [[Apple]] || [[MobileViT]] | |[[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer]] || 2021/10/05 || [[arxiv:2110.02178]]<br>[https://github.com/apple/ml-cvnets GitHub] || [[Computer Vision]] || [[Apple]] || [[MobileViT]] |