Can automatic optimization catch nested requires_grad parameters?

I am trying to port non-PL code into my PL-based code. The code is as follows:

import itertools

import torch.optim as optim
from diffusers import UNet2DConditionModel
from lora_diffusion import inject_trainable_lora, extract_lora_ups_down

unet = UNet2DConditionModel.from_pretrained(
    pretrained_model_name_or_path,
    subfolder="unet",
)
unet.requires_grad_(False)
# This will turn off all of the gradients of unet, except for the trainable LoRA params.
unet_lora_params, train_names = inject_trainable_lora(unet)
optimizer = optim.Adam(
    itertools.chain(*unet_lora_params, text_encoder.parameters()), lr=1e-4
)

In short, I have to train only part of unet via inject_trainable_lora, which marks the following modules as trainable:

_module._modules[name].lora_up.weight.requires_grad = True
_module._modules[name].lora_down.weight.requires_grad = True
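
For reference, a quick sanity check (assuming the unet from the snippet above) to confirm which parameters the injection actually left trainable:

trainable = [n for n, p in unet.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable tensors, e.g. {trainable[:4]}")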

I am using the same optimizer (optim.Adam) and have currently commented out those lines; in this case, does self.parameters() in configure_optimizers pick up those partially trainable modules automatically?

Also, how can I verify whether those specific modules are actually being updated?

Any suggestions or advice would be appreciated.

Hey, self.parameters() in configure_optimizers works like any raw nn.Module.parameters() (since it actually is the same :smiley: ).

Meaning that if the submodules with the required parameters have been added to the model by the time configure_optimizers is called, they will be optimized; otherwise they won’t.
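
To make that concrete, here is a minimal sketch of how this could look as a LightningModule. The class name LoRAFineTuner and the text_encoder argument are illustrative, not part of your original code; the key point is that inject_trainable_lora runs in __init__ (or setup), so the new lora_up/lora_down weights are registered before configure_optimizers is called:

import itertools

import pytorch_lightning as pl
import torch.optim as optim
from diffusers import UNet2DConditionModel
from lora_diffusion import inject_trainable_lora


class LoRAFineTuner(pl.LightningModule):  # hypothetical class name
    def __init__(self, pretrained_model_name_or_path, text_encoder, lr=1e-4):
        super().__init__()
        self.lr = lr
        self.text_encoder = text_encoder
        self.unet = UNet2DConditionModel.from_pretrained(
            pretrained_model_name_or_path, subfolder="unet"
        )
        self.unet.requires_grad_(False)
        # Inject LoRA here so the lora_up/lora_down weights already exist
        # on the module when configure_optimizers is called.
        self.unet_lora_params, self.train_names = inject_trainable_lora(self.unet)

    def configure_optimizers(self):
        # Either pass the LoRA parameter groups explicitly, as in the
        # original script ...
        return optim.Adam(
            itertools.chain(*self.unet_lora_params, self.text_encoder.parameters()),
            lr=self.lr,
        )
        # ... or rely on self.parameters() and keep only the trainable ones:
        # return optim.Adam(
        #     (p for p in self.parameters() if p.requires_grad), lr=self.lr
        # )

    # training_step etc. omitted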

The easiest way to check is to either print the parameters inside the hook or set a breakpoint there and verify manually.
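
For example, one way to do that check is with the on_after_backward hook (a sketch, assuming the LoRA weight names contain "lora_" as they do in lora_diffusion, and that self.unet lives on the LightningModule):

def on_after_backward(self):
    # Runs right after loss.backward(). Frozen parameters keep grad=None,
    # while the injected lora_up / lora_down weights should show a real
    # gradient norm if they are being trained.
    for name, param in self.unet.named_parameters():
        if "lora_" in name:
            grad_norm = None if param.grad is None else param.grad.norm().item()
            print(f"{name}: requires_grad={param.requires_grad}, grad_norm={grad_norm}")

A version-agnostic alternative is to clone a few of the LoRA tensors before training, run a step or two, and compare them with torch.equal to confirm they actually changed.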
