Computing gradients wrt inputs within training_step

vmirly · November 25, 2022, 4:37am

I am trying to implement WGAN-GP (gradient penalty) by defining the following function, which is called inside the training_step() method.

def wgan_gradient_penalty(
        real: torch.Tensor,
        fake: torch.Tensor,
        discriminator: torch.nn.Module) -> torch.Tensor:

    alpha = torch.rand(real.size(0), 1, 1, 1).type_as(real)

    x_hat = alpha * real + (1 - alpha) * fake.detach()
    x_hat.requires_grad = True

    # calc. d_hat: discriminator output on x_hat
    d_hat = discriminator(x_hat)

    # calc. gradients of d_hat vs. x_hat
    grads = torch.autograd.grad(
        outputs=d_hat,
        inputs=x_hat,
        grad_outputs=torch.ones(d_hat.size()).type_as(real),
        create_graph=True,
        retain_graph=True)[0]

But it seems that the output of the network is detached from the computation graph: when I check d_hat.grad_fn, it is None.

(Pdb) print(d_hat.grad_fn)
None

Because grad_fn is not defined, the call to torch.autograd.grad results in the following error:

*** RuntimeError: element 0 of tensors does not require grad and does not have a grad_fn
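
One common way to end up with an output whose grad_fn is None, which may or may not be the cause in this case, is that the forward pass ran while autograd recording was disabled (for example inside a torch.no_grad() block). The sketch below reproduces that symptom in isolation. The helper name output_has_graph and the toy discriminator are made up for illustration and are not part of the code in the question.

```python
import torch


def output_has_graph(discriminator, x, grad_enabled):
    """Return True if the discriminator output is attached to an autograd graph."""
    # Make the input a leaf tensor that requires grad, like x_hat in the question.
    x = x.clone().requires_grad_(True)
    # Run the forward pass with autograd recording switched on or off.
    with torch.set_grad_enabled(grad_enabled):
        d = discriminator(x)
    return d.grad_fn is not None


# Hypothetical toy discriminator, only for this check.
disc = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 4 * 4, 1))
x = torch.randn(2, 3, 4, 4)

has_graph_enabled = output_has_graph(disc, x, grad_enabled=True)
has_graph_disabled = output_has_graph(disc, x, grad_enabled=False)
print(has_graph_enabled, has_graph_disabled)
```

Printing torch.is_grad_enabled() right before the discriminator(x_hat) call in the debugger is a quick way to see whether this situation applies.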

awaelchli · November 27, 2022, 9:10pm

I think you are looking for x_hat.grad which holds the value of the gradient of d_hat w.r.t. x_hat.
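
For context on the suggestion above: x_hat.grad is only populated after backward() has been called on something computed from x_hat, and by default the stored gradient is not itself part of a graph. A gradient penalty has to be backpropagated through, which is why the question uses torch.autograd.grad with create_graph=True. Below is a minimal sketch of how the function might be completed, assuming the problem is that gradient recording was disabled with torch.no_grad() when the function ran. The name gradient_penalty, the toy discriminator, and the torch.enable_grad() wrapper are assumptions added here; the penalty term is the usual WGAN-GP form (squared deviation of the gradient norm from 1), without the weighting coefficient.

```python
import torch


def gradient_penalty(real, fake, discriminator):
    """Sketch of a WGAN-GP penalty that builds its own graph."""
    alpha = torch.rand(real.size(0), 1, 1, 1).type_as(real)
    # Assumption: force recording on in case the caller disabled it with no_grad().
    with torch.enable_grad():
        # Random interpolation between real and (detached) fake samples.
        x_hat = (alpha * real + (1 - alpha) * fake.detach()).requires_grad_(True)
        d_hat = discriminator(x_hat)
        # Gradient of d_hat w.r.t. x_hat, kept differentiable via create_graph.
        grads = torch.autograd.grad(
            outputs=d_hat,
            inputs=x_hat,
            grad_outputs=torch.ones_like(d_hat),
            create_graph=True)[0]
        # Per-sample L2 norm of the gradient, penalised for deviating from 1.
        grad_norm = grads.flatten(start_dim=1).norm(2, dim=1)
        penalty = ((grad_norm - 1) ** 2).mean()
    return penalty


# Hypothetical toy setup to exercise the function.
disc = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(3 * 8 * 8, 16),
    torch.nn.Tanh(),
    torch.nn.Linear(16, 1))
real = torch.randn(4, 3, 8, 8)
fake = torch.randn(4, 3, 8, 8)

with torch.no_grad():  # simulate a caller that disabled gradient recording
    gp = gradient_penalty(real, fake, disc)
gp.backward()  # the penalty can still be backpropagated into the discriminator
```

If gradients turn out to be enabled already, the torch.enable_grad() wrapper is a no-op and the rest of the function is unchanged.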
