Hello there.
While debugging my model, I noticed that `optimizer_step` takes more time than the inference and backward steps when using `profiler="simple"`. Then, using the `PyTorchProfiler` and the TensorBoard trace, I found that this is because the `optimizer_step` span begins at the same time as `run_training_batch` and ends almost immediately after the backward step finishes. Is this expected behavior, or how can I fix it?
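For reference, here is roughly how the two profilers were enabled (a sketch, not my exact training code; the `PyTorchProfiler` import path and the `dirpath`/`filename` arguments are assumptions based on recent Lightning versions):

```python
import pytorch_lightning as pl
from pytorch_lightning.profilers import PyTorchProfiler  # older versions: pytorch_lightning.profiler

# 1) Simple profiler: reports total wall-clock time per hook,
#    which is where optimizer_step showed up as the largest entry.
trainer = pl.Trainer(profiler="simple", max_epochs=1)

# 2) PyTorch profiler: writes a trace viewable in TensorBoard,
#    which is where the optimizer_step span was seen overlapping run_training_batch.
profiler = PyTorchProfiler(dirpath="profiler_logs", filename="trace")
trainer = pl.Trainer(profiler=profiler, max_epochs=1)
```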