
Transformer

The Transformer is a neural network architecture introduced in the paper “Attention Is All You Need” (Vaswani et al., 2017) and now widely used in large language models. It replaces the recurrence of traditional recurrent neural networks (RNNs) with self-attention layers, which capture dependencies between all positions in a sequence and allow those positions to be processed in parallel rather than one step at a time.
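The core operation behind those self-attention layers is scaled dot-product attention. As a rough sketch (in NumPy rather than PyTorch, and with illustrative names not taken from any library), each token's query is compared against every token's key, the resulting scores are normalized with a softmax, and the output is a weighted sum of the values:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # (seq_len, seq_len) pairwise similarity
    # Numerically stable row-wise softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ V, weights

# Toy self-attention: 3 tokens with embedding dimension 4. In true
# self-attention, queries, keys, and values are all (learned projections
# of) the same input sequence; here we feed the raw input for brevity.
rng = np.random.default_rng(0)
X = rng.standard_normal((3, 4))
output, weights = scaled_dot_product_attention(X, X, X)
```

Because every token attends to every other token in a single matrix multiplication, the whole sequence is processed at once, which is what gives Transformers their parallelism advantage over RNNs.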

Related content

Optimizing Memory Usage for Training LLMs and Vision Transformers in PyTorch
The NeurIPS 2023 LLM Efficiency Challenge Starter Guide