When I use num_workers=0 for train_dataloader, val_dataloader, and test_dataloader, training finishes an epoch very quickly (although I get loss = NaN and I have not figured out why), along with a warning that I should use a larger num_workers; it suggests num_workers=16.
However, if I use num_workers > 0, training gets stuck at the validation sanity check and never progresses.
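For reference, a minimal sketch of the kind of DataLoader setup involved (the dataset and sizes here are made up for illustration). One common cause of the hang with num_workers > 0 on Windows/macOS is launching the training script without the `if __name__ == "__main__":` guard, since worker processes re-import the main module:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical toy dataset standing in for the real one.
dataset = TensorDataset(torch.randn(8, 3), torch.randint(0, 2, (8,)))

if __name__ == "__main__":
    # num_workers=0 loads batches in the main process (no hang, but slower);
    # num_workers > 0 spawns worker processes, which requires this guard
    # on platforms that use the "spawn" start method.
    loader = DataLoader(dataset, batch_size=4, num_workers=0)
    for x, y in loader:
        print(tuple(x.shape))
```

This is only a sketch under those assumptions; if the guard is already in place, shared-memory limits or a deadlock inside the dataset's `__getitem__` are other things worth checking.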
Can someone please shed some light on what the issue might be? Thank you.