Context:
- I would like to train several models at the same time. These models share the same structure but may differ in initialization or some options.
- It is expensive for the data loader to prepare a batch of data, so once a batch has been prepared, it is desirable to reuse it to train several models in parallel.
- Each model is small enough that several of them fit easily on a single GPU.
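A minimal sketch of this setup, assuming PyTorch (the framework is not stated in the question): build several copies of the same architecture with different random seeds, give each its own optimizer, and loop over the models inside the batch loop so every expensive batch is reused once per model. The `make_model` helper, the toy `batches` generator, and all sizes below are hypothetical stand-ins for illustration.

```python
import torch
from torch import nn

def make_model(seed: int) -> nn.Module:
    # Same structure for every model; the seed gives each one
    # its own initialization (this mirrors "differ in initialization").
    torch.manual_seed(seed)
    return nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))

models = [make_model(seed) for seed in range(3)]
optimizers = [torch.optim.Adam(m.parameters(), lr=1e-3) for m in models]
loss_fn = nn.MSELoss()

def batches(n_batches: int, batch_size: int = 32):
    # Stand-in for the expensive data loader: each batch is prepared once.
    for _ in range(n_batches):
        x = torch.randn(batch_size, 8)
        y = x.sum(dim=1, keepdim=True)
        yield x, y

for x, y in batches(5):
    # Reuse the already-prepared batch: one optimization step per model.
    for model, opt in zip(models, optimizers):
        opt.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        opt.step()
```

Note that the inner loop runs the models sequentially on the GPU rather than truly in parallel; since the models are small, launching them on separate CUDA streams, or batching their parameters together (e.g. with `torch.func.stack_module_state` plus `torch.vmap` in recent PyTorch versions), are possible ways to overlap their execution on a single device.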