Model works on CPU but errors out when running on GPU

Hi,
I have a PyTorch Lightning model whose training works fine on CPU, but when I run the same training on a GPU machine it throws an error at trainer.fit(model):

AssertionError: Gather function not implemented for CPU tensors

What could be the issue here?

I use 'accelerator': 'gpu', 'strategy': 'dp'

The machine has 2 GPUs and this is run on a Databricks cluster.

Hey, can you post your model code?

Generally, DP is considered deprecated (and was therefore removed in 2.0) as it has a lot of caveats. I'd recommend you try DDP instead. Which Lightning version are you using?
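As a rough sketch of the switch, the Trainer arguments could look like the following. The helper name ddp_trainer_kwargs is made up for illustration, and the device count of 2 is taken from the machine described above. It assumes a Lightning version that accepts the 'devices' argument, and whether plain 'ddp' works from a notebook cell on Databricks is not something I have verified (recent Lightning versions also offer a 'ddp_notebook' strategy for interactive environments).

```python
def ddp_trainer_kwargs(num_gpus: int = 2) -> dict:
    """Build Trainer keyword arguments that use DDP instead of DP.

    Hypothetical helper for illustration; not part of Lightning itself.
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    return {
        "accelerator": "gpu",
        "devices": num_gpus,   # assumption: 2 GPUs, as on the machine in the question
        "strategy": "ddp",     # replaces 'dp'
    }


kwargs = ddp_trainer_kwargs(num_gpus=2)

# Usage (needs pytorch_lightning installed and GPUs available):
# import pytorch_lightning as pl
# trainer = pl.Trainer(**kwargs)
# trainer.fit(model)
```

With DDP each GPU runs its own process and keeps its own step outputs, so the gather step that DP performs across devices is not involved.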

Cheers,
Justus
