I have a pretrained torch.nn.Module that my LightningModule uses during training. For the sake of example, assume it is a pretrained, frozen ResNet image model that I use for feature extraction. What is the best way to use such a module from my LightningModule?
Option 1: store it as a child module

```python
self.resnet = ResNet()
```

This registers the ResNet's parameters as part of the LightningModule, which inflates the checkpoint size. It also prevents a single ResNet instance from being shared by multiple modules, which is a serious problem for huge models.
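To make the checkpoint-bloat point concrete, here is a minimal torch-only sketch (`FakeResNet` is a hypothetical tiny stand-in for the real pretrained backbone):

```python
from torch import nn

# Hypothetical stand-in for the pretrained backbone, kept tiny for illustration.
class FakeResNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

# Option 1: assigning the backbone to an attribute registers it as a child
# module, so its parameters appear in state_dict() and hence in every checkpoint.
class WrapperOption1(nn.Module):
    def __init__(self):
        super().__init__()
        self.resnet = FakeResNet()
        self.head = nn.Linear(2, 1)

keys = list(WrapperOption1().state_dict().keys())
# keys contain 'resnet.fc.weight' and 'resnet.fc.bias' alongside the head's own weights
```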
Option 2: pass the pretrained model as a constructor argument

```python
class MyModel(LightningModule):
    def __init__(self, resnet: ResNet):
        super().__init__()
        self._resnet = [resnet]  # wrapped in a list so it is not registered as a submodule
```

With this approach the ResNet is not really owned by the LightningModule; it is only stored as a reference. That allows model sharing and keeps the weights out of the checkpoint. But it breaks device management: I have to move the ResNet manually via .cuda(), and the problem is even worse when training on multiple GPUs.
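Here is a self-contained sketch of that failure mode (`FakeResNet` is a hypothetical stand-in; a dtype cast is used in place of a GPU move so it runs without a GPU):

```python
import torch
from torch import nn

# Hypothetical stand-in for the pretrained backbone.
class FakeResNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

# Option 2: keep only a reference. Wrapping the module in a plain list hides
# it from nn.Module's attribute registration, so it is neither checkpointed
# nor moved by .to()/.cuda().
class WrapperOption2(nn.Module):
    def __init__(self, resnet: nn.Module):
        super().__init__()
        self._resnet = [resnet]
        self.head = nn.Linear(2, 1)

backbone = FakeResNet()
wrapper = WrapperOption2(backbone)
wrapper.to(torch.float64)  # stands in for a device move

# The backbone stays out of the checkpoint...
in_checkpoint = any(k.startswith("_resnet") for k in wrapper.state_dict())
# ...but .to() left it behind: the head is float64, the backbone is still float32.
```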
Better Option?
Is there a better option: one that stores the model as a regular attribute, so devices are managed automatically, but that does not track its weights in the checkpoint?
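For concreteness, here is a torch-only sketch of the behavior I am after (`FakeResNet` is again a hypothetical stand-in): register the backbone as a child so that .to()/.cuda() move it, but filter its keys out of state_dict() so they never reach the checkpoint.

```python
import torch
from torch import nn

# Hypothetical stand-in for the pretrained backbone.
class FakeResNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

class WrapperOption3(nn.Module):
    def __init__(self, resnet: nn.Module):
        super().__init__()
        # Registered as a child module, so .to()/.cuda() move it automatically.
        self.resnet = resnet.eval().requires_grad_(False)
        self.head = nn.Linear(2, 1)

    def state_dict(self, *args, **kwargs):
        # Drop the frozen backbone's weights before they reach a checkpoint.
        sd = super().state_dict(*args, **kwargs)
        return {k: v for k, v in sd.items() if not k.startswith("resnet.")}

wrapper = WrapperOption3(FakeResNet())
wrapper.to(torch.float64)  # the registered backbone follows the cast/move
# state_dict() now contains only the head's weights.
```

Of course, loading such a checkpoint would then need load_state_dict(..., strict=False), and I am not sure how well this plays with Lightning's own checkpointing hooks, which is exactly why I am asking whether there is a cleaner, supported way.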
Thank you!