6.8 Debugging Deep Neural Networks

What we covered in this video lecture

This lecture covered three simple approaches for debugging deep neural network training. First, we discussed that doing a fast dev run is often a good idea before initiating an expensive training procedure. A fast dev run pushes only a handful of batches through the training and validation code, which lets us quickly test whether everything is set up correctly.

Next, we discussed looking at model summaries to better understand whether the layers are connected as we intended. A summary can also give us useful information on the number of parameters and the model size at a glance.
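To make the idea of a model summary concrete, here is a minimal sketch in plain PyTorch that lists each top-level layer with its parameter count and estimates the parameter memory. The `summarize` helper and the small example network are hypothetical; this is not the summary utility shown in the lecture, just an approximation of the information it reports.

```python
import torch


def summarize(model):
    """Return (name, layer type, parameter count) for each top-level layer."""
    rows = []
    for name, layer in model.named_children():
        n_params = sum(p.numel() for p in layer.parameters())
        rows.append((name, type(layer).__name__, n_params))
    return rows


# Hypothetical example network: 20 inputs -> 16 hidden units -> 3 outputs.
model = torch.nn.Sequential(
    torch.nn.Linear(20, 16),
    torch.nn.ReLU(),
    torch.nn.Linear(16, 3),
)

rows = summarize(model)
for name, kind, n_params in rows:
    print(f"{name:>3} | {kind:<8} | {n_params:>6}")

# Totals: parameter count and the memory taken by the parameters alone.
total_params = sum(p.numel() for p in model.parameters())
size_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
print("total parameters:", total_params)
print("parameter size (bytes):", size_bytes)
```

A layer whose count looks far too large or too small (for example, a first linear layer sized for the wrong input dimension) is usually easy to spot in a table like this.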

Lastly, we discussed batch overfitting. Neural networks are great overfitters if we let them. In other words, a correctly implemented network should be able to reach 90-100% training accuracy when we train it repeatedly on a single batch. This is a quick and easy diagnostic for checking whether we implemented everything correctly before moving on to the more expensive training procedure on the full dataset.
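The batch overfitting check can be sketched in plain PyTorch as follows. The random batch, the network size, the learning rate, and the number of steps are all assumptions chosen for illustration. The labels are random on purpose, so a high training accuracy can only come from memorization. Lightning's `Trainer` also accepts an `overfit_batches` argument for the same purpose; the manual loop here just makes the idea explicit.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# One fixed batch of random data with random labels (hypothetical sizes).
x = torch.randn(32, 20)
y = torch.randint(0, 3, (32,))

model = torch.nn.Sequential(
    torch.nn.Linear(20, 64),
    torch.nn.ReLU(),
    torch.nn.Linear(64, 3),
)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)

# Train on the same batch over and over; the loss should approach zero.
for step in range(500):
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    optimizer.step()

# Training accuracy on the batch the model was fit to.
with torch.no_grad():
    train_acc = (model(x).argmax(dim=1) == y).float().mean().item()
print(f"final loss: {loss.item():.4f}, training accuracy: {train_acc:.2f}")
```

If the accuracy stays well below 90% in a setup like this, likely culprits include a mismatched loss function, labels that are not aligned with the inputs, or gradients that never reach some of the layers.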

Additional resources if you want to learn more

You might enjoy this article on Debugging in PyTorch, in which the author mentions common mistakes such as choosing the wrong loss function or getting the embedding dimensions wrong, among others.

Quiz: 6.8 Debugging Deep Neural Networks - Part 1

The fast_dev_run option makes the notebook or script run faster because

Correct. Logging will be disabled (however, logging usually has a negligible overhead).

Correct. Checkpointing will be disabled (however, checkpointing usually has a negligible overhead).

Correct. For example, if we set fast_dev_run=5, only 5 minibatches will be run.

Quiz: 6.8 Debugging Deep Neural Networks - Part 2

A model summary gives us useful clues about the accuracy and runtime of a model.

Incorrect. The model summary gives us an idea of the architecture, not the accuracy and runtime, which are dataset-specific as well.

Correct. That said, the size of the network may give us a rough idea of what the memory requirements might be.

Quiz: 6.8 Debugging Deep Neural Networks - Part 3

If we achieve 100% training accuracy via batch overfitting, this means that there is a bug in the network.

Incorrect. We want to achieve 100% training accuracy because we want to show that the network is able to learn well and memorize when there is only a very small dataset.

Correct. We want to achieve 100% training accuracy because we want to show that the network is able to learn well and memorize when there is only a very small dataset.
