Papers


Important Papers

Name	Submission Date	Source	Type	Organization	Note
ImageNet Classification with Deep Convolutional Neural Networks (AlexNet)	2012	AlexNet Paper			AlexNet
Efficient Estimation of Word Representations in Vector Space (Word2Vec)	2013/01/16	arxiv:1301.3781	NLP		Word2Vec
Playing Atari with Deep Reinforcement Learning (DQN)	2013/12/19	arxiv:1312.5602			DQN
Generative Adversarial Networks (GAN)	2014/06/10	arxiv:1406.2661			GAN
Very Deep Convolutional Networks for Large-Scale Image Recognition (VGGNet)	2014/09/04	arxiv:1409.1556			VGGNet
Sequence to Sequence Learning with Neural Networks (Seq2Seq)	2014/09/10	arxiv:1409.3215			Seq2Seq
Going Deeper with Convolutions (GoogLeNet)	2014/09/17	arxiv:1409.4842		Google	GoogLeNet
Adam: A Method for Stochastic Optimization (Adam)	2014/12/22	arxiv:1412.6980			Adam
Deep Residual Learning for Image Recognition (ResNet)	2015/12/10	arxiv:1512.03385			ResNet
Asynchronous Methods for Deep Reinforcement Learning (A3C)	2016/02/04	arxiv:1602.01783			A3C
WaveNet: A Generative Model for Raw Audio	2016/09/12	arxiv:1609.03499	Audio		WaveNet
Attention Is All You Need (Transformer)	2017/06/12	arxiv:1706.03762		Google	Influential paper that introduced the Transformer architecture
Proximal Policy Optimization Algorithms (PPO)	2017/07/20	arxiv:1707.06347			PPO
Improving Language Understanding by Generative Pre-Training (GPT)	2018	paper source	NLP	OpenAI	GPT
Deep contextualized word representations (ELMo)	2018/02/15	arxiv:1802.05365	NLP		ELMo
GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding	2018/04/20	arxiv:1804.07461	NLP		GLUE
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding	2018/10/11	arxiv:1810.04805	NLP	Google	BERT
Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context	2019/01/09	arxiv:1901.02860			Transformer-XL
Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model (MuZero)	2019/11/19	arxiv:1911.08265			MuZero
Language Models are Few-Shot Learners (GPT-3)	2020/05/28	arxiv:2005.14165	NLP	OpenAI	GPT-3
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale (ViT)	2020/10/22	arxiv:2010.11929			Vision Transformer (ViT)
Learning Transferable Visual Models From Natural Language Supervision (CLIP)	2021/02/26	arxiv:2103.00020 OpenAI Blog		OpenAI	CLIP
MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer	2021/10/05	arxiv:2110.02178			MobileViT
Block-Recurrent Transformers	2022/03/11	arxiv:2203.07852
Memorizing Transformers	2022/03/16	arxiv:2203.08913
STaR: Bootstrapping Reasoning With Reasoning	2022/03/28	arxiv:2203.14465			STaR

Other Papers

Name	Submission Date	Source	Type	Organization	Note
Dreamix: Video Diffusion Models are General Video Editors	2023/02/03	arxiv:2302.01329 blog post		Google	Dreamix
FLAME: A small language model for spreadsheet formulas	2023/01/31	arxiv:2301.13779		Microsoft	FLAME
SingSong: Generating musical accompaniments from singing	2023/01/30	arxiv:2301.12662 blog post	Audio		SingSong
MusicLM: Generating Music From Text	2023/01/26	arxiv:2301.11325 blog post	Audio	Google	MusicLM
Mastering Diverse Domains through World Models (DreamerV3)	2023/01/10	arxiv:2301.04104v1		DeepMind	DreamerV3
Muse: Text-To-Image Generation via Masked Generative Transformers	2023/01/02	arxiv:2301.00704 blog post		Google	Muse
Constitutional AI: Harmlessness from AI Feedback	2022/12/15	arxiv:2212.08073		Anthropic	Constitutional AI, Claude
InstructPix2Pix: Learning to Follow Image Editing Instructions	2022/11/17	arxiv:2211.09800 Blog Post		UC Berkeley	InstructPix2Pix