Papers: Difference between revisions
Line 40:
|[[Deep contextualized word representations (ELMo)]] || 2018/02/15 || [[arxiv:1802.05365]] || [[Natural Language Processing]] || || [[ELMo]]
|-
|[[GLUE: A Multi-Task Benchmark and Analysis Platform for Natural Language Understanding]] || 2018/04/20 || [[arxiv:1804.07461]]<br>[https://gluebenchmark.com/ website] || [[Natural Language Processing]] || || [[GLUE]] ([[General Language Understanding Evaluation]])
|-
|[[BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding]] || 2018/10/11 || [[arxiv:1810.04805]] || [[Natural Language Processing]] || [[Google]] || [[BERT]] ([[Bidirectional Encoder Representations from Transformers]])
|-
|[[Transformer-XL: Attentive Language Models Beyond a Fixed-Length Context]] || 2019/01/09 || [[arxiv:1901.02860]]<br>[https://github.com/kimiyoung/transformer-xl GitHub] || || || [[Transformer-XL]]
Line 56:
|[[MobileViT: Light-weight, General-purpose, and Mobile-friendly Vision Transformer]] || 2021/10/05 || [[arxiv:2110.02178]]<br>[https://github.com/apple/ml-cvnets GitHub] || [[Computer Vision]] || [[Apple]] || [[MobileViT]]
|-
|[[LaMDA: Language Models for Dialog Applications]] || 2022/01/20 || [[arxiv:2201.08239]]<br>[https://blog.google/technology/ai/lamda/ Blog Post] || [[Natural Language Processing]] || [[Google]] || [[LaMDA]] (Language Models for Dialog Applications)
|-
|[[Block-Recurrent Transformers]] || 2022/03/11 || [[arxiv:2203.07852]] || || ||
Line 62:
|[[Memorizing Transformers]] || 2022/03/16 || [[arxiv:2203.08913]] || || ||
|-
|[[STaR: Bootstrapping Reasoning With Reasoning]] || 2022/03/28 || [[arxiv:2203.14465]] || || || [[STaR]] ([[Self-Taught Reasoner]])
|-
|}
Line 92:
|[[Training a Helpful and Harmless Assistant with Reinforcement Learning from Human Feedback]] || 2022/04/12 || [[arxiv:2204.05862]]<br>[https://github.com/anthropics/hh-rlhf GitHub] || [[Natural Language Processing]] || [[Anthropic]] || [[RLHF]] ([[Reinforcement Learning from Human Feedback]])
|-
|[[PaLM: Scaling Language Modeling with Pathways]] || 2022/04/05 || [[arxiv:2204.02311]]<br>[https://ai.googleblog.com/2022/04/pathways-language-model-palm-scaling-to.html Blog Post] || [[Natural Language Processing]] || [[Google]] || [[PaLM]] ([[Pathways Language Model]])
|-
|[[Constitutional AI: Harmlessness from AI Feedback]] || 2022/12/15 || [[arxiv:2212.08073]] || [[Natural Language Processing]] || [[Anthropic]] || [[Constitutional AI]], [[Claude]]
Line 100:
|[[InstructPix2Pix: Learning to Follow Image Editing Instructions]] || 2022/11/17 || [[arxiv:2211.09800]]<br>[https://www.timothybrooks.com/instruct-pix2pix Blog Post] || [[Computer Vision]] || [[UC Berkeley]] || [[InstructPix2Pix]]
|-
|[[REALM: Retrieval-Augmented Language Model Pre-Training]] || 2020/02/10 || [[arxiv:2002.08909]]<br>[https://ai.googleblog.com/2020/08/realm-integrating-retrieval-into.html Blog Post] || [[Natural Language Processing]] || [[Google]] || [[REALM]] ([[Retrieval-Augmented Language Model Pre-Training]])
|-
|}