In my model, I have a sub-module that contains some pre-calculated, non-learnable tensors. Without wrapping them in nn.Parameter, I can’t get pytorch-lightning to move them to the GPU automatically. However, since I don’t want them to learn anything I have to set requires_grad=False, which also cuts the gradient flow at that point. How should I handle this implementation? Thanks.
Use buffers:
https://pytorch.org/docs/stable/generated/torch.nn.Module.html#torch.nn.Module.register_buffer
Registered buffers are moved to the device automatically along with the rest of the module. It’s plain PyTorch, so it works with Lightning out of the box.
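For example, a minimal sketch (the module and buffer names here are just illustrative):

```python
import torch
import torch.nn as nn

class FrozenLookup(nn.Module):
    def __init__(self):
        super().__init__()
        # Pre-computed, non-learnable tensor.
        table = torch.linspace(0, 1, steps=16)
        # register_buffer keeps it out of model.parameters() (so the
        # optimizer never touches it) but moves it with .to()/.cuda()
        # and includes it in the state_dict.
        self.register_buffer("scale_table", table)

    def forward(self, x):
        # Gradients still flow through x; the buffer itself never learns.
        return x * self.scale_table
```

Since Lightning moves the whole module to the device, the buffer follows along. If you don’t want the buffer saved in checkpoints, you can pass persistent=False to register_buffer.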