DeviceStatsMonitor

class lightning.pytorch.callbacks.DeviceStatsMonitor(cpu_stats=None)[source]

Bases: Callback

Automatically monitors and logs device stats during the training, validation and testing stages. DeviceStatsMonitor is a special callback as it requires a logger to be passed as an argument to the Trainer.

Parameters:

cpu_stats (Optional[bool]) – if None, it will log CPU stats only if the accelerator is CPU. If True, it will log CPU stats regardless of the accelerator. If False, it will not log CPU stats regardless of the accelerator.
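The three-way behaviour of cpu_stats can be summarised in a small truth table. The sketch below is only an illustration of the rule as documented above, not the library's implementation; the helper name should_log_cpu_stats and its accelerator_is_cpu argument are hypothetical.

```python
from typing import Optional


def should_log_cpu_stats(cpu_stats: Optional[bool], accelerator_is_cpu: bool) -> bool:
    """Hypothetical helper mirroring the documented cpu_stats rule."""
    if cpu_stats is None:
        # Default: log CPU stats only when the accelerator itself is the CPU.
        return accelerator_is_cpu
    # An explicit True or False overrides the accelerator type.
    return cpu_stats


# Print the decision for every combination of setting and accelerator.
for setting in (None, True, False):
    for on_cpu in (True, False):
        print(f"cpu_stats={setting!s:5} accelerator_is_cpu={on_cpu!s:5} "
              f"-> log CPU stats: {should_log_cpu_stats(setting, on_cpu)}")
```

In practice this means cpu_stats=True is the setting to reach for when training on a GPU or another non-CPU accelerator and host-side CPU metrics are also wanted.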

Raises:

MisconfigurationException – If Trainer has no logger.
ModuleNotFoundError – If psutil is not installed and CPU stats are monitored.
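Because CPU monitoring depends on psutil, it can be useful to check for the package before constructing the callback. The following is a minimal sketch using only the standard library; the helper name psutil_available is hypothetical and is not part of Lightning.

```python
import importlib.util


def psutil_available() -> bool:
    """Return True if the psutil package can be found in this environment."""
    return importlib.util.find_spec("psutil") is not None


if psutil_available():
    print("psutil found: CPU stats can be monitored.")
else:
    # Without psutil, monitoring CPU stats is documented to raise
    # ModuleNotFoundError, so install it first (for example: pip install psutil).
    print("psutil missing: install it before monitoring CPU stats.")
```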

Example:

from lightning import Trainer
from lightning.pytorch.callbacks import DeviceStatsMonitor
device_stats = DeviceStatsMonitor()
trainer = Trainer(callbacks=[device_stats])

on_test_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0)[source]

Called when the test batch ends.

Return type: None

on_test_batch_start(trainer, pl_module, batch, batch_idx, dataloader_idx=0)[source]

Called when the test batch begins.

Return type: None

on_train_batch_end(trainer, pl_module, outputs, batch, batch_idx)[source]

Called when the train batch ends.

Return type: None

Note

The value outputs["loss"] here is the loss returned from training_step, normalized with respect to accumulate_grad_batches.
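As a worked example of the note above, assume that normalization means dividing the loss returned from training_step by accumulate_grad_batches. The helper normalized_loss below is hypothetical and only illustrates the arithmetic.

```python
def normalized_loss(raw_loss: float, accumulate_grad_batches: int) -> float:
    """Illustrative only: scale a loss by the gradient accumulation factor.

    Assumes normalization is a plain division by accumulate_grad_batches.
    """
    if accumulate_grad_batches < 1:
        raise ValueError("accumulate_grad_batches must be at least 1")
    return raw_loss / accumulate_grad_batches


# training_step returns a loss of 2.0 and gradients are accumulated over 4 batches.
print(normalized_loss(2.0, 4))  # 0.5
```

Under that assumption, a callback that reads outputs["loss"] would multiply by accumulate_grad_batches to recover the value training_step originally returned.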

on_train_batch_start(trainer, pl_module, batch, batch_idx)[source]

Called when the train batch begins.

Return type: None

on_validation_batch_end(trainer, pl_module, outputs, batch, batch_idx, dataloader_idx=0)[source]

Called when the validation batch ends.

Return type: None

on_validation_batch_start(trainer, pl_module, batch, batch_idx, dataloader_idx=0)[source]

Called when the validation batch begins.

Return type: None

setup(trainer, pl_module, stage)[source]

Called when fit, validate, test, predict, or tune begins.

Return type: None