The checkpoint path will be whatever is specified by the ModelCheckpoint callback. By default this will be lightning_logs/version_{version number}/checkpoints/epoch={epoch number}.ckpt.
Hi, I need to define a checkpoint callback that is called 5 times during training. How would I know, inside ModelCheckpoint, which iteration number this is? Thanks. I'd appreciate an example of how to save the model every k steps/epochs.
In addition to what @goku said, you can get the log directory + version number with trainer.logger.log_dir. So if you add what you want as a callback:
from pytorch_lightning.callbacks import Callback

class OnCheckpointSomething(Callback):
    def on_save_checkpoint(self, trainer, pl_module):
        save_path = f"{trainer.logger.log_dir}/checkpoints/epoch={trainer.current_epoch}.ckpt"
Also, ModelCheckpoint has a method called format_checkpoint_name that is called when saving checkpoints and handles the overall filename formatting. The callback itself can be accessed via trainer.checkpoint_callback.
As an example, if you want to save the weights of your model before training, you can add the following hook to your LightningModule:
I believe this still does not answer the original question.
When on_save_checkpoint is called, how do I tell if the checkpoint will be saved in self.best_model_path or self.last_model_path?
(In the very common case where we are saving both the best and the last model)
The best model path and the last model path will differ if your best model is not your last model.
The ModelCheckpoint callback will save the models to a path like my/path/epoch=0-step=10.ckpt.
Once your training is completed you can access the locations of the best and last models via the best_model_path and last_model_path attributes.