Can automatic optimization catch nested requires_grad parameters?

I am trying to port non-PL code into my PL-based code. The code is as follows:

import itertools

import torch.optim as optim
from diffusers import UNet2DConditionModel
from lora_diffusion import inject_trainable_lora, extract_lora_ups_down

unet = UNet2DConditionModel.from_pretrained(
    pretrained_model_name_or_path,
    subfolder="unet",
)
unet.requires_grad_(False)
# This will turn off all of the gradients of unet, except for the trainable LoRA params.
unet_lora_params, train_names = inject_trainable_lora(unet)
optimizer = optim.Adam(
    itertools.chain(*unet_lora_params, text_encoder.parameters()), lr=1e-4
)

In short, I have to train only part of unet via inject_trainable_lora, which marks the following modules as trainable:

_module._modules[name].lora_up.weight.requires_grad = True
_module._modules[name].lora_down.weight.requires_grad = True
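
For reference, a quick sanity check (assuming the unet from the snippet above) to confirm which parameters the injection actually left trainable:

trainable = [n for n, p in unet.named_parameters() if p.requires_grad]
print(f"{len(trainable)} trainable tensors, e.g. {trainable[:4]}")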

I am using the same optimizer (optim.Adam) and have currently commented out those lines; in this case, does self.parameters() in configure_optimizers pick up those partially trainable modules automatically?

Also, how can I verify whether those specific modules are actually being updated?

Any suggestions or advice would be appreciated.

Hey, self.parameters() in configure_optimizers works like any raw nn.Module.parameters() (since it actually is the same :smiley: ).

Meaning that if the submodules with the required parameters have been added to the model by the time configure_optimizers is called, they will be optimized; otherwise they won’t.
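
To make that concrete, here is a minimal sketch of how this could look as a LightningModule. The class name LoRAFineTuner and the text_encoder argument are illustrative, not part of your original code; the key point is that inject_trainable_lora runs in __init__ (or setup), so the new lora_up/lora_down weights are registered before configure_optimizers is called:

import itertools

import pytorch_lightning as pl
import torch.optim as optim
from diffusers import UNet2DConditionModel
from lora_diffusion import inject_trainable_lora


class LoRAFineTuner(pl.LightningModule):  # hypothetical class name
    def __init__(self, pretrained_model_name_or_path, text_encoder, lr=1e-4):
        super().__init__()
        self.lr = lr
        self.text_encoder = text_encoder
        self.unet = UNet2DConditionModel.from_pretrained(
            pretrained_model_name_or_path, subfolder="unet"
        )
        self.unet.requires_grad_(False)
        # Inject LoRA here so the lora_up/lora_down weights already exist
        # on the module when configure_optimizers is called.
        self.unet_lora_params, self.train_names = inject_trainable_lora(self.unet)

    def configure_optimizers(self):
        # Either pass the LoRA parameter groups explicitly, as in the
        # original script ...
        return optim.Adam(
            itertools.chain(*self.unet_lora_params, self.text_encoder.parameters()),
            lr=self.lr,
        )
        # ... or rely on self.parameters() and keep only the trainable ones:
        # return optim.Adam(
        #     (p for p in self.parameters() if p.requires_grad), lr=self.lr
        # )

    # training_step etc. omitted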

The easiest way to check is to either print the parameters inside the hook or set a breakpoint there and verify manually.
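
For example, one way to do that check is with the on_after_backward hook (a sketch, assuming the LoRA weight names contain "lora_" as they do in lora_diffusion, and that self.unet lives on the LightningModule):

def on_after_backward(self):
    # Runs right after loss.backward(). Frozen parameters keep grad=None,
    # while the injected lora_up / lora_down weights should show a real
    # gradient norm if they are being trained.
    for name, param in self.unet.named_parameters():
        if "lora_" in name:
            grad_norm = None if param.grad is None else param.grad.norm().item()
            print(f"{name}: requires_grad={param.requires_grad}, grad_norm={grad_norm}")

A version-agnostic alternative is to clone a few of the LoRA tensors before training, run a step or two, and compare them with torch.equal to confirm they actually changed.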
