Hi,
I am developing an architecture with a “side” calculation that should not be back-propagated through, i.e., it should be excluded from the gradient computation. The calculation is not differentiable, but its output is needed to continue the forward pass.
I tried wrapping it in with torch.no_grad in the forward method, but got an error:
with torch.no_grad:
AttributeError: __enter__
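
As far as I can tell, the AttributeError comes from the missing parentheses: torch.no_grad is a context-manager class, so it has to be instantiated. For reference, this is roughly the pattern I am going for (a minimal sketch with placeholder module and tensor names, not my actual code):

import torch
import torch.nn as nn

class MyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(16, 16)
        self.head = nn.Linear(1, 4)

    def forward(self, x):
        h = self.backbone(x)
        # non-differentiable "side" calculation, excluded from autograd,
        # whose output is still needed to continue the forward pass
        with torch.no_grad():  # note the parentheses: no_grad(), not no_grad
            idx = h.argmax(dim=-1, keepdim=True)
        side = torch.gather(h, -1, idx)  # gather is differentiable w.r.t. h
        return self.head(side)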
I also tried simply leaving it as it is (non-differentiable parts should be “taken care of” by PyTorch, according to this thread: https://discuss.pytorch.org/t/performing-backward-on-a-network-with-non-differentiable-module/7331), but then I get this CUDA error:
CUDA kernel errors might be asynchronously reported at some other API call,so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1.
/pytorch/aten/src/ATen/native/cuda/ScatterGatherKernel.cu:111: operator(): block: [0,0,0], thread: [0,0,0] Assertion `idx_dim >= 0 && idx_dim < index_size && "index out of bounds"` failed.
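
If I read the assertion correctly, some gather/scatter call is receiving an index outside the valid range of the indexed dimension. I suppose a sanity check like the following right before the call would narrow it down (checked_gather and the tensor names are hypothetical placeholders of mine):

import torch

def checked_gather(src, dim, idx):
    # an out-of-range value in idx is exactly what trips the
    # ScatterGatherKernel assertion on CUDA
    assert idx.min().item() >= 0, "negative index"
    assert idx.max().item() < src.size(dim), "index out of bounds for gathered dim"
    return torch.gather(src, dim, idx)

Running the model on CPU (or with CUDA_LAUNCH_BLOCKING=1, as the message suggests) should also make the stack trace point at the actual failing line.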
Any help would be appreciated, thanks!