Is it possible to use a single Trainer to train multiple versions of the same model in parallel?

Context:

  1. I would like to train several models at the same time. These models share the same architecture but may differ in initialization or in a few options.
  2. Preparing a batch in the data loader is expensive, so once a batch has been prepared I would like to reuse it to train several models in parallel.
  3. The model is small enough that several copies fit easily on a single GPU (a minimal sketch of what I have in mind follows this list).
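
For concreteness, here is a minimal PyTorch sketch of the kind of setup I mean: a wrapper module (`MultiModelWrapper` is just an illustrative name, not an existing API) that holds several independent copies of the model and returns the sum of their losses, so that a single Trainer and data loader could drive all of them from one prepared batch. The sketch assumes each copy's `forward` returns its scalar loss.

```python
import copy
import torch
from torch import nn


class MultiModelWrapper(nn.Module):
    """Holds N independent copies of the same model and, on each forward
    pass, feeds the shared batch to every copy and sums their losses."""

    def __init__(self, base_model: nn.Module, num_copies: int):
        super().__init__()
        # Deep-copy the base model; each copy could then be re-initialized
        # or configured differently before training starts.
        self.models = nn.ModuleList(
            copy.deepcopy(base_model) for _ in range(num_copies)
        )

    def forward(self, *args, **kwargs):
        # Every sub-model sees exactly the same batch. The copies share no
        # parameters, so their gradients remain independent.
        losses = [model(*args, **kwargs) for model in self.models]
        return torch.stack(losses).sum()
```

Because the copies share no parameters, the gradient of the summed loss with respect to each copy's weights is just that copy's own gradient, so a single backward pass and optimizer step would update all the models independently while paying the data-loading cost only once per batch.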