If I use the normal data loader to feed the training data into the trainer.fit() routine, everything works fine: the validation step runs after each epoch.
However, when I create a custom batch sampler (pulling an equal number of events from each class), only the training_step gets executed inside the trainer loop (the training behaviour itself seems to be as expected).
The validation step is then executed only during the initial validation check.
In the __iter__ method, I create an array of indices for each of the step_per_epoch steps (an int, defined by me), and each array is returned by yield array here (last line of the previous link; I can only put 2 links in my post).
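
To make the description above concrete, here is a minimal sketch of a class-balanced batch sampler. It is not the code from the linked repository: the class name `BalancedBatchSampler` and the parameters `labels`, `per_class`, `steps_per_epoch` and `seed` are made up for illustration. It yields one list of indices per step and stops after a fixed number of steps, and it also defines `__len__`, so that whatever consumes it can tell how many batches make up one epoch.

```python
import random
from collections import defaultdict


class BalancedBatchSampler:
    """Yield batches with the same number of indices from every class.

    Hypothetical sketch; all names and parameters are assumptions.
    """

    def __init__(self, labels, per_class, steps_per_epoch, seed=0):
        self.per_class = per_class
        self.steps_per_epoch = steps_per_epoch
        self.rng = random.Random(seed)
        # Group dataset indices by their class label.
        self.by_class = defaultdict(list)
        for idx, label in enumerate(labels):
            self.by_class[label].append(idx)

    def __iter__(self):
        # One batch per step; the iterator ends after steps_per_epoch batches.
        for _ in range(self.steps_per_epoch):
            batch = []
            for indices in self.by_class.values():
                # Sample with replacement so small classes never run out.
                batch.extend(self.rng.choices(indices, k=self.per_class))
            self.rng.shuffle(batch)
            yield batch

    def __len__(self):
        # Number of batches in one epoch.
        return self.steps_per_epoch


# Toy labels: class 0 has four events, class 1 has two, class 2 has one.
labels = [0, 0, 0, 0, 1, 1, 2]
sampler = BalancedBatchSampler(labels, per_class=2, steps_per_epoch=3)
batches = list(sampler)
```

An object like this can be passed to a PyTorch `DataLoader` through its `batch_sampler` argument, which accepts any iterable that yields lists of indices. Whether a missing `__len__` or a non-terminating `__iter__` plays a role in the problem described here is only a guess; the sketch just shows a sampler whose epoch has a well-defined end.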
The model is defined here:
(Same file as Batch Sampler, starting in line 49.)
The trainer gets called here:
I know that the code is nested and embedded in luigi, so it might be difficult to read at some points.
If you have any questions or need more information, I am happy to clarify.
From a quick look, I don't think you are using the BatchSampler for the validation dataloader.
We have moved the discussions to GitHub Discussions. You might want to check that out instead to get a quick response. The forums will be marked read-only after some time.