5.5 Organizing Your Data Loaders with Data Modules

What we covered in this video lecture

In this lecture, we introduced LightningDataModule as an additional organizational layer for our DataLoaders, adding extra convenience when using the Trainer. However, it is still possible to use the DataLoaders separately as before.

Regarding multi-GPU computing (Unit 9), data modules have the advantage that they separate data handling into a prepare_data method and a setup method. The prepare_data method is called only within a single process on the CPU. This is useful because downloading and saving data from multiple processes at once (as in distributed settings) can result in corrupted data. We can then use setup for data operations that we want to perform on every GPU, for example, partitioning the dataset into train/val/test splits and applying data transforms (we will talk more about data transforms and augmentation in Unit 7).

Additional resources if you want to learn more

If you are interested in additional details about the LightningDataModule, you can browse through the more technical LightningDataModule documentation. However, at this stage, you have learned all you need to use LightningDataModules, so please feel free to skip the documentation for now.

Quiz: 5.5 Organizing Your Data Loaders with Data Modules (Part 1)

Check all the true statements concerning LightningDataModules:

Correct. We usually define PyTorch DataLoaders inside the LightningDataModule.

Incorrect. LightningDataModules are an organizational layer, so they don’t speed up the data loading process itself.

Correct. They are an organizational layer that helps us package data-loading code in a single class.

Quiz: 5.5 Organizing Your Data Loaders with Data Modules (Part 2)

Which of the following is not a valid way to compute the validation set accuracy?

Correct. This is the recommended way to calculate the validation accuracy.

Correct. Since we have a LightningDataModule defined, there is a more elegant way to get the validation set accuracy, but this also works.

DL Fundamentals 5: PyTorch Lightning

Sebastian