Hi,
I am developing an architecture with a “side” calculation that should not be back-propagated through, i.e., it should be excluded from the gradient computation. The calculation is not differentiable, but its output is needed to continue the forward pass.
I tried wrapping it in with torch.no_grad in the forward method, but got an error:
with torch.no_grad:
AttributeError: __enter__
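
As far as I can tell, the AttributeError comes from the missing parentheses: torch.no_grad is a context-manager class, so it has to be instantiated. For reference, this is roughly the pattern I am going for (a minimal sketch with placeholder module and tensor names, not my actual code):

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        self.head = nn.Linear(1, 4)

    def forward(self, x):
        h = self.backbone(x)
        # non-differentiable "side" calculation, excluded from autograd,
        # whose output is still needed to continue the forward pass
        with torch.no_grad():  # note the parentheses: no_grad(), not no_grad
            idx = h.argmax(dim=-1, keepdim=True)
        side = torch.gather(h, -1, idx)  # gather is differentiable w.r.t. h
        return self.head(side)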
I also tried simply leaving it as it is (non-differentiable parts should be “taken care of” by PyTorch, according to this thread: https://discuss.pytorch.org/t/performing-backward-on-a-network-with-non-differentiable-module/7331), but then I get this CUDA error:
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [0,0,0], thread: [0,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
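
If I read the assertion correctly, some gather/scatter call is receiving an index outside the valid range of the indexed dimension. I suppose a sanity check like the following right before the call would narrow it down (checked_gather and the tensor names are hypothetical placeholders of mine):

import torch

def checked_gather(src, dim, idx):
    # an out-of-range value in idx is exactly what trips the
    # ScatterGatherKernel assertion on CUDA
    assert idx.min().item() >= 0, "negative index"
    assert idx.max().item() < src.size(dim), "index out of bounds for gathered dim"
    return torch.gather(src, dim, idx)

Running the model on CPU (or with CUDA_LAUNCH_BLOCKING=1, as the message suggests) should also make the stack trace point at the actual failing line.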
Any help would be appreciated, thanks!