
Unit 6.1 – Model Checkpointing and Early Stopping

Code

What we covered in this video lecture

In this lecture, we will explore the concept of early stopping. Previously, we trained a neural network for a given number of epochs and then kept only the last model, assuming the last model corresponds to the “best trained” one. However, if our model starts overfitting, it’s typically not the last model that is the best model.

In this lecture, we will monitor the validation set accuracy during training and create model checkpoints of the best model during training. The best model checkpoint — the model with the highest validation set accuracy — is then selected for test set evaluation. Note that even though we call this concept “early stopping,” we are not literally stopping the model training early. Instead, we are training the model for the same number of epochs as we usually would, but instead of sticking to the last model, we select the “best” model.
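A minimal sketch of how this can be set up with PyTorch Lightning's ModelCheckpoint callback is shown below. Note that the metric name "val_acc" is an assumption here; it has to match whatever key your LightningModule logs via self.log(...), and the model and dataloader names are placeholders.

```python
# Minimal sketch (not the exact course code): keep the checkpoint with the
# highest validation accuracy instead of the last model.
import pytorch_lightning as pl
from pytorch_lightning.callbacks import ModelCheckpoint

checkpoint_callback = ModelCheckpoint(
    monitor="val_acc",  # metric to track; must match the logged key
    mode="max",         # higher validation accuracy is better
    save_top_k=1,       # keep only the single best checkpoint
)

trainer = pl.Trainer(max_epochs=10, callbacks=[checkpoint_callback])
# trainer.fit(lightning_model, train_dataloaders=train_loader, val_dataloaders=val_loader)

# Evaluate the best (highest val_acc) checkpoint, not the last one, on the test set:
# trainer.test(lightning_model, dataloaders=test_loader, ckpt_path="best")
```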

Additional resources if you want to learn more

If you are interested in literally stopping the training early, check out the [EarlyStopping documentation](https://pytorch-lightning.readthedocs.io/en/stable/common/early_stopping.html). The EarlyStopping callback stops training early based on a patience parameter; for example, with a patience of 5, training stops if the predictive performance does not improve for 5 consecutive epochs.
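A hedged sketch of what that looks like in code follows; as before, "val_acc" is an assumed metric name, and the commented-out Trainer line reuses the checkpoint callback from the example above.

```python
# Sketch of literal early stopping with PyTorch Lightning's EarlyStopping callback.
from pytorch_lightning.callbacks import EarlyStopping

early_stopping = EarlyStopping(
    monitor="val_acc",  # metric to watch; must match the logged key
    mode="max",         # stop when validation accuracy stops improving
    patience=5,         # tolerate up to 5 epochs without improvement
)

# trainer = pl.Trainer(max_epochs=100, callbacks=[early_stopping, checkpoint_callback])
```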


Quiz: 6.1 Model Checkpointing and Early Stopping - Part 1

Using the ModelCheckpoint callback, we can save the best model during training based on the …

  • Training loss — Correct. We would specify monitor="train_loss" for that.
  • Validation loss — Correct. We would specify monitor="val_loss" for that.
  • Training accuracy — Correct. We would specify monitor="train_acc" for that.
  • Validation accuracy — Correct. We would specify monitor="val_acc" for that.


Quiz: 6.1 Model Checkpointing and Early Stopping - Part 2

Suppose we have a binary classification dataset with 731 data points from class 0 and 269 data points from class 1. What is the expected classification accuracy if our classifier makes totally random predictions?

  • 73.1% — Incorrect. We would expect 73.1% only if we used a majority-class (zero-rule) classifier.
  • 26.9% — Incorrect. We would expect 26.9% if we used a minority-class classifier.
  • 50% — Correct. If the classifier makes truly random predictions, it returns either a 0 or a 1 for each data point with a fifty-fifty chance, so the expected accuracy is 0.5 × 73.1% + 0.5 × 26.9% = 50%.
  • 0% — Incorrect. The classifier will surely get more than 0 data points right just by chance.
  • Incorrect. We should be able to come up with a good guess here.
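For intuition, here is a small illustrative check (not part of the course material) that random guessing lands at roughly 50% accuracy regardless of the class imbalance:

```python
# Illustrative only: simulate random guessing on a 731-vs-269 dataset.
# Each guess matches the true label with probability 0.5, so the expected
# accuracy is 0.5 * 0.731 + 0.5 * 0.269 = 0.5.
import random

random.seed(123)
labels = [0] * 731 + [1] * 269
preds = [random.randint(0, 1) for _ in labels]  # uniform random 0/1 predictions

accuracy = sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)
print(f"Random-guess accuracy: {accuracy:.3f}")  # close to 0.5
```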

