
Unit 9.1 Accelerated Model Training via Mixed-Precision Training

References

Code

What we covered in this video lecture

In this lecture, we delve into the concept of mixed-precision training in deep learning, which involves using a combination of different numerical precisions (typically float32 and float16 or bfloat16) during model training to improve computational efficiency and speed.

Traditional training methods tend to use 32-bit floating-point numbers (float32) to represent weights, biases, activations, and gradients for neural networks. However, this can be computationally expensive and memory-intensive, particularly for large models and data sets. To address this, mixed-precision training employs lower-precision formats, namely 16-bit floating-point numbers (float16) and Brain Floating Point (bfloat16), in parts of the training process where higher precision is not critical.
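To make this concrete, the snippet below is a minimal sketch of what automatic mixed-precision training typically looks like in PyTorch (assuming a CUDA GPU). The model, loss, and data are placeholder stand-ins, not code from the lecture:

    import torch
    import torch.nn as nn

    model = nn.Linear(512, 10).cuda()          # placeholder model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    scaler = torch.cuda.amp.GradScaler()       # rescales gradients so small float16 values do not underflow

    features = torch.randn(32, 512).cuda()     # dummy batch
    targets = torch.randint(0, 10, (32,)).cuda()

    for _ in range(3):                         # a few dummy training steps
        optimizer.zero_grad()
        # matmuls and similar ops run in float16; numerically sensitive ops stay in float32
        with torch.autocast(device_type="cuda", dtype=torch.float16):
            loss = loss_fn(model(features), targets)
        scaler.scale(loss).backward()          # backward pass on the scaled loss
        scaler.step(optimizer)                 # unscales gradients, then takes the optimizer step
        scaler.update()                        # adjusts the loss-scaling factor for the next iteration

The gradient scaler is used here because float16 has a much narrower dynamic range than float32, so small gradient values can otherwise underflow to zero.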

The balance between speed, memory usage, and precision makes mixed-precision training an increasingly popular approach for training large-scale machine learning models.

Additional resources if you want to learn more

If you want to learn more details about mixed precision training, including benchmarks, check out my blog article Accelerating Large Language Models with Mixed-Precision Techniques.

Furthermore, you might be interested in the related topic of quantization. An interesting research article on this topic is LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale.


Quiz: 9.1 Accelerated Model Training via Mixed-Precision Training (Part 1)

What are the minimum and maximum values for float32 and float16 (not explicitly mentioned in the lecture)? You can use torch.finfo to find out (see the sketch below).

Correct. The min and max values for float32 are approximately ±3.40282 × 10^38.

Incorrect. The min/max values for float32 are higher than that.

Incorrect. The min/max values for float32 are much higher than that.

Incorrect. The min/max values for float32 are much much higher than that.
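For reference, here is a short sketch of how torch.finfo reports these limits; the values in the comments below are the min/max that PyTorch returns:

    import torch

    # Inspect the numerical limits of the dtypes discussed in this unit
    for dtype in (torch.float32, torch.float16, torch.bfloat16):
        info = torch.finfo(dtype)
        print(dtype, info.min, info.max)

    # torch.float32  -3.4028234663852886e+38  3.4028234663852886e+38
    # torch.float16  -65504.0                 65504.0
    # torch.bfloat16 -3.3895313892515355e+38  3.3895313892515355e+38

Note how bfloat16 keeps almost the same range as float32 (at lower precision), while float16 tops out at 65,504.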


Quiz: 9.1 Accelerated Model Training via Mixed-Precision Training (Part 2)

On a GPU with tensor cores that support the bfloat16 type, neural networks with bfloat16 weights train …

Incorrect. The difference lies mainly in precision and dynamic range, not in training speed.

Correct. The difference lies mainly in precision and dynamic range, not in training speed.

Incorrect. The difference lies mainly in precision and dynamic range, not in training speed.
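As a hedged illustration of this point: because bfloat16 shares float32's dynamic range, autocasting to bfloat16 in PyTorch typically needs no gradient scaler, while throughput on supporting tensor cores is comparable to float16. The model and data below are placeholders:

    import torch
    import torch.nn as nn

    model = nn.Linear(512, 10).cuda()           # placeholder model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()
    features = torch.randn(32, 512).cuda()      # dummy batch
    targets = torch.randint(0, 10, (32,)).cuda()

    optimizer.zero_grad()
    # bfloat16 has float32's exponent range, so no GradScaler is required here
    with torch.autocast(device_type="cuda", dtype=torch.bfloat16):
        loss = loss_fn(model(features), targets)
    loss.backward()
    optimizer.step()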
