Transformer
The Transformer is a neural network architecture introduced in the 2017 paper “Attention Is All You Need” and now widely used in large language models. It replaces the recurrence of traditional recurrent neural networks (RNNs) with self-attention, which relates every position in a sequence to every other position directly. This captures long-range dependencies without stepping through the sequence one token at a time, so training can be parallelized across sequence positions.
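The core of self-attention can be sketched in a few lines. The snippet below is a minimal, illustrative sketch of scaled dot-product attention using NumPy; real Transformers add learned query/key/value projections, multiple attention heads, and masking, all of which are omitted here.

```python
import numpy as np

def self_attention(x):
    """x: (seq_len, d) token embeddings; returns one attended vector per token.

    Illustrative sketch: queries, keys, and values are all the raw input.
    In a real Transformer each comes from its own learned linear projection.
    """
    d = x.shape[-1]
    scores = x @ x.T / np.sqrt(d)  # pairwise similarity, scaled by sqrt(d)
    # Softmax over each row so attention weights sum to 1 per token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ x             # weighted sum of value vectors

tokens = np.random.rand(4, 8)      # 4 tokens, 8-dim embeddings
out = self_attention(tokens)
print(out.shape)                   # (4, 8): one output vector per token
```

Because every token attends to every other token in one matrix multiplication, the whole sequence is processed at once, which is what makes Transformers far more parallelizable than RNNs.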