I have a training loop where epochs are very fast: training and validation each take on the order of seconds.
However, there is a long pause after each epoch ends. What causes it, and how can I disable it so that I can train faster?
My guess is that checkpointing or logging is happening, but I poked around and couldn’t determine whether that is the case or how to decrease its frequency.
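
To narrow this down, here is a minimal sketch of how each phase of an epoch could be timed to see where the pause actually falls. It assumes a plain Python loop. `train_one_epoch`, `validate`, and `end_of_epoch_work` are hypothetical stand-ins that only sleep, not functions from any particular library, so they would need to be replaced with the real calls.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# Accumulated wall-clock seconds per named phase.
phase_seconds = defaultdict(float)


@contextmanager
def timed(phase):
    """Add the wall-clock time spent inside the block to phase_seconds[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        phase_seconds[phase] += time.perf_counter() - start


# Hypothetical stand-ins for the real steps; replace with the actual calls.
def train_one_epoch():
    time.sleep(0.01)


def validate():
    time.sleep(0.01)


def end_of_epoch_work():
    # Placeholder for whatever runs after validation (checkpointing, logging, ...).
    time.sleep(0.05)


def run(num_epochs=2):
    """Run the loop and return total seconds spent in each phase."""
    for _ in range(num_epochs):
        with timed("train"):
            train_one_epoch()
        with timed("validate"):
            validate()
        with timed("epoch_end"):
            end_of_epoch_work()
    return dict(phase_seconds)


if __name__ == "__main__":
    for phase, seconds in run().items():
        print(f"{phase}: {seconds:.3f}s")
```

If the `epoch_end` total dominates, the pause comes from whatever runs after validation. If all three totals are small, the delay is happening outside the timed blocks.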