Deep Learning Fundamentals
8.5 Understanding Self-Attention
References
- Attention Is All You Need (2017), the original transformer paper
What we covered in this video lecture
To understand transformer-based large language models, it is essential to understand self-attention, the mechanism that powers these models. Self-attention can be understood as a way to create context-aware text embedding vectors: each token's embedding is updated using information from the other tokens in the sequence.
In this lecture, we explain self-attention from the ground up. We start with a simple, parameter-free version of self-attention to explain the underlying principles. Then, we cover the parameterized self-attention mechanism used in transformers: self-attention with learnable weights.
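The two variants described above can be sketched in a few lines of PyTorch. This is a minimal illustration and not the lecture's own code: the toy input, the embedding sizes, and the names `simple_self_attention`, `SelfAttention`, `W_q`, `W_k`, and `W_v` are assumptions chosen for this example. The parameter-free version weights each token by softmax-normalized dot products with all other tokens. The parameterized version first projects the inputs into queries, keys, and values with learnable weight matrices and scales the scores by the square root of the key dimension, as in the scaled dot-product attention of the paper referenced above.

```python
import torch

torch.manual_seed(0)

# Toy input (assumed for illustration): 4 tokens, each a 3-dimensional embedding.
x = torch.randn(4, 3)


def simple_self_attention(x):
    """Parameter-free self-attention: no learnable weights involved."""
    scores = x @ x.T                         # pairwise dot products, shape (4, 4)
    weights = torch.softmax(scores, dim=-1)  # normalize so each row sums to 1
    context = weights @ x                    # each output is a weighted sum of all inputs
    return context, weights


class SelfAttention(torch.nn.Module):
    """Scaled dot-product self-attention with learnable projections."""

    def __init__(self, d_in, d_out):
        super().__init__()
        # Learnable query, key, and value projection matrices.
        self.W_q = torch.nn.Linear(d_in, d_out, bias=False)
        self.W_k = torch.nn.Linear(d_in, d_out, bias=False)
        self.W_v = torch.nn.Linear(d_in, d_out, bias=False)

    def forward(self, x):
        q, k, v = self.W_q(x), self.W_k(x), self.W_v(x)
        scores = q @ k.T / k.shape[-1] ** 0.5    # scale by sqrt of the key dimension
        weights = torch.softmax(scores, dim=-1)  # attention weights per token
        return weights @ v                       # context-aware embeddings


context_simple, attn_weights = simple_self_attention(x)
context_learned = SelfAttention(d_in=3, d_out=2)(x)

print(context_simple.shape)   # torch.Size([4, 3])
print(context_learned.shape)  # torch.Size([4, 2])
```

In both cases, every output row mixes information from all input tokens, which is what makes the resulting embeddings context-aware. Only the second version has parameters that can be adjusted during training.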
Additional resources if you want to learn more
This lecture introduced the attention mechanism using conceptual illustrations. If you prefer a coding-based approach, also check out my article Understanding and Coding the Self-Attention Mechanism of Large Language Models From Scratch.