Transformers

From AI Wiki
Revision as of 05:30, 17 February 2023 by Yogurt frame

Transformers are deep learning models whose architecture is built around the attention mechanism. They were introduced by Google researchers in 2017, in the well-known paper "Attention Is All You Need". The Transformer is an encoder-decoder model, a family of architectures that had already been popular for a few years. Until then, however, attention was only one of the mechanisms these models used; they were based mainly on LSTMs and other RNNs. The Transformer paper showed that attention alone, without recurrence, could be used to relate inputs to outputs.
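
To make the idea of attention concrete, here is a minimal sketch of scaled dot-product attention, the core operation described in the paper: each query is compared with every key, the scaled scores are turned into weights with a softmax, and the output is the weighted average of the values. The function names `softmax` and `attention` and the plain-list representation are assumptions chosen for illustration; real implementations use tensor libraries and add multiple heads, masking, and learned projections.

```python
import math


def softmax(scores):
    """Turn a list of raw scores into weights that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries, keys, values):
    """Scaled dot-product attention on plain Python lists (illustrative only).

    queries: list of vectors of length d_k
    keys:    list of vectors of length d_k
    values:  list of vectors, one per key
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs
```

When every key is identical, the query cannot prefer one position over another, so the weights are uniform and the output is simply the mean of the value vectors. When one key matches the query much better than the rest, the output moves toward that key's value.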