Transformers

From AI Wiki

Transformers are a family of deep learning models built around the attention mechanism. They were introduced by Google researchers in 2017 in the influential paper "Attention Is All You Need". The Transformer is an encoder-decoder architecture, a class of models that had already been popular for several years. Until then, however, attention was only one of several mechanisms these models used; they were based mainly on LSTMs and other recurrent neural networks (RNNs). The Transformer paper showed that attention alone, without recurrence, is sufficient to model the dependencies between inputs and outputs.
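The core operation the paper builds on is scaled dot-product attention: each query is compared against all keys, the similarities are normalized with a softmax, and the result weights a sum over the values. A minimal NumPy sketch (the shapes and random toy inputs here are illustrative, not from the paper):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Computes softmax(Q K^T / sqrt(d_k)) V for 2-D arrays of
    shape (num_queries, d_k), (num_keys, d_k), (num_keys, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # pairwise query-key similarities
    # Numerically stable softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # each output row is a weighted sum of value rows

# Toy example: 3 query positions attending over 3 key/value positions
rng = np.random.default_rng(0)
Q = rng.standard_normal((3, 4))
K = rng.standard_normal((3, 4))
V = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (3, 4): one output vector per query position
```

Because every output position can attend directly to every input position, this single operation replaces the step-by-step state propagation of an RNN.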