I am testing a model with Lightning, and it has been working fine with 1 GPU. After adding a second GPU today, however, the following error occurred
(with gpus=2 and distributed_backend='ddp' added to pl.Trainer):
raise RuntimeError("No rendezvous handler for {}://".format(result.scheme))
RuntimeError: No rendezvous handler for env://
I am on Windows 10 with PyTorch 1.7.1, pytorch_lightning 1.1.4, and CUDA 11.0.
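For reference, here is a minimal sketch of the Trainer setup described above. Only the two arguments gpus=2 and distributed_backend='ddp' come from my actual configuration; the names TRAINER_KWARGS and build_trainer are hypothetical and used only for illustration, and the model and fit call are omitted.

```python
# Only these two arguments are taken from the setup described in this post.
TRAINER_KWARGS = dict(gpus=2, distributed_backend="ddp")


def build_trainer():
    """Create the Trainer with the multi-GPU settings (hypothetical helper)."""
    # Imported here so the snippet can be loaded without Lightning installed.
    import pytorch_lightning as pl

    return pl.Trainer(**TRAINER_KWARGS)
```

With gpus=1 and no distributed_backend argument, the same setup trains without errors.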
How should I fix or work around this problem?
Thanks!