TrainingEpochLoop

class pytorch_lightning.loops.epoch.TrainingEpochLoop(min_steps=0, max_steps=-1)

Bases: pytorch_lightning.loops.base.Loop[List[List[Union[Dict[int, Dict[str, Any]], Dict[str, Any]]]]]

Runs over all batches in a dataloader (one epoch).

Parameters
  • min_steps (Optional[int]) – The minimum number of steps (batches) to process

  • max_steps (int) – The maximum number of steps (batches) to process

advance(*args, **kwargs)

Runs a single training batch.

Parameters

dataloader_iter – the iterator over the dataloader producing the new batch

Raises

StopIteration – When the epoch is canceled by the user returning -1

Return type

None

connect(batch_loop=None, val_loop=None)

Optionally connect a custom batch or validation loop to this training epoch loop.

Return type

None

on_advance_end()

Runs validation and checkpointing if necessary.

Raises

StopIteration – if done evaluates to True to finish this epoch
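The way advance, on_advance_end, done and StopIteration fit together can be sketched as follows. This is a simplified, self-contained illustration and not the library's implementation: SketchLoop, its run method, and the on_batch_start callback are hypothetical names introduced here. Only the control-flow pattern (the epoch ends when done is true or when a hook raises StopIteration) is taken from the descriptions above.

```python
class SketchLoop:
    """Hypothetical stand-in for an epoch loop; not part of pytorch_lightning."""

    def __init__(self, batches, max_steps=-1, on_batch_start=None):
        self.batches = list(batches)
        self.max_steps = max_steps  # -1 is treated as "no limit" (assumption)
        self.on_batch_start = on_batch_start  # hypothetical user callback
        self.batch_idx = 0
        self.processed = []

    @property
    def done(self):
        # True once the step budget is used up or the data is exhausted.
        hit_max = self.max_steps != -1 and self.batch_idx >= self.max_steps
        return hit_max or self.batch_idx >= len(self.batches)

    def advance(self):
        batch = self.batches[self.batch_idx]
        # A user callback returning -1 cancels the rest of the epoch.
        if self.on_batch_start is not None and self.on_batch_start(batch) == -1:
            raise StopIteration
        self.processed.append(batch)
        self.batch_idx += 1

    def on_advance_end(self):
        # Validation and checkpointing would run here; then end the epoch if done.
        if self.done:
            raise StopIteration

    def run(self):
        while not self.done:
            try:
                self.advance()
                self.on_advance_end()
            except StopIteration:
                break
        return self.processed
```

In this sketch, StopIteration is used purely as a signal to leave the loop early, so the caller of run never sees the exception.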

on_load_checkpoint(state_dict)

Called when loading a model checkpoint; use it to reload the loop state.

Return type

None

on_run_end()

Calls the on_epoch_end hook.

Return type

None

Returns

The output of each training step for each optimizer

Raises

MisconfigurationException – When train_epoch_end does not return None

on_run_start(data_fetcher, **kwargs)

Hook to be called as the first thing after entering run (except the state reset).

Accepts all arguments passed to run.

Return type

None

on_save_checkpoint()

Called when saving a model checkpoint; use it to persist the loop state.

Return type

Dict

Returns

The current loop state.
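The save and load hooks are meant to be symmetric: whatever on_save_checkpoint returns should be enough for on_load_checkpoint to restore the loop. A minimal sketch of that round trip is shown here, assuming a hypothetical CountingLoopState class and a made-up "batches_seen" key. Neither is part of pytorch_lightning, and the real loop tracks different state.

```python
from typing import Any, Dict


class CountingLoopState:
    """Hypothetical loop fragment that persists a single counter."""

    def __init__(self) -> None:
        self.batches_seen = 0

    def on_save_checkpoint(self) -> Dict[str, Any]:
        # Return everything needed to resume; keys are illustrative only.
        return {"batches_seen": self.batches_seen}

    def on_load_checkpoint(self, state_dict: Dict[str, Any]) -> None:
        # Restore exactly what on_save_checkpoint wrote.
        self.batches_seen = state_dict["batches_seen"]


# Usage: save from one instance, restore into a fresh one.
original = CountingLoopState()
original.batches_seen = 42
restored = CountingLoopState()
restored.on_load_checkpoint(original.on_save_checkpoint())
```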

reset()

Resets the internal state of the loop for a new run.

Return type

None

teardown()

Use to release memory etc.

Return type

None

update_lr_schedulers(interval, update_plateau_schedulers)

Updates the learning rate schedulers based on the given interval.

Return type

None

property batch_idx: int

Returns the current batch index (within this epoch).

property done: bool

Returns whether the training should be stopped.

Training stops when the number of steps reaches max_steps, when the last batch is reached, or when the trainer signals a stop (e.g. through early stopping).
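The three stop criteria can be written as a single predicate. This is a sketch under stated assumptions rather than the library's code: epoch_done and its parameter names are hypothetical, and treating max_steps == -1 as "no limit" is inferred from the constructor default.

```python
def epoch_done(global_step: int, max_steps: int, is_last_batch: bool, should_stop: bool) -> bool:
    """Hypothetical helper combining the three stop criteria."""
    # Assumption: max_steps == -1 disables the step limit.
    max_steps_reached = max_steps != -1 and global_step >= max_steps
    # Any one criterion is enough to end the epoch.
    return max_steps_reached or is_last_batch or should_stop
```

Because the criteria are combined with or, an early-stopping signal ends the epoch even if the step limit has not been reached and more batches remain.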

property total_batch_idx: int

Returns the current batch index (across epochs).
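The difference between batch_idx and total_batch_idx is easiest to see side by side. The generator below is a hypothetical illustration, not library code, and it assumes both indices are zero-based: the per-epoch index restarts every epoch, while the total keeps counting.

```python
def iter_indices(num_epochs: int, batches_per_epoch: int):
    """Yield (epoch, batch_idx, total_batch_idx) for every batch (illustrative only)."""
    total_batch_idx = 0
    for epoch in range(num_epochs):
        # batch_idx restarts at 0 at the beginning of each epoch.
        for batch_idx in range(batches_per_epoch):
            yield epoch, batch_idx, total_batch_idx
            # total_batch_idx is never reset between epochs.
            total_batch_idx += 1
```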