Memory leak after the first validation epoch

I’m facing a very annoying problem with PyTorch Lightning. I’ve been struggling with it for two days and can’t see a way out. I’ve read every post I could find but couldn’t locate a solution, so I’m hoping for your help.

The problem I’m facing is the following: using the `free -h` command, I see used memory increase by about 0.4 GB after the first validation epoch, right after the logging is printed to the console. This memory is not released even after the program terminates, and after many training attempts an out-of-memory error occurs.
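For completeness, here is a minimal sketch (not in my repo; it assumes `psutil` is installed) of a Lightning callback I could add to confirm the jump from inside the run, mirroring what `free -h` shows from outside:

```python
import psutil
from pytorch_lightning import Callback


class MemoryProbe(Callback):
    # Prints the resident set size (RSS) of this process after every
    # validation epoch, so the jump can be tied to a specific epoch.
    def on_validation_epoch_end(self, trainer, pl_module):
        rss_gb = psutil.Process().memory_info().rss / 1024 ** 3
        print(f"epoch {trainer.current_epoch}: RSS = {rss_gb:.2f} GB")
```

This would be passed to the trainer via `Trainer(callbacks=[MemoryProbe()])`.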

I have no idea how to fix this. I tried commenting out the logging and ran several memory trackers (such as tracemalloc), but got no clear picture of what is going on.
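For example, this is roughly the kind of tracemalloc comparison I ran between two validation epochs (a sketch, not the exact code from my repo):

```python
import tracemalloc

tracemalloc.start()

# ... run training until just after the first validation epoch ...
snap_before = tracemalloc.take_snapshot()

# ... run one more validation epoch ...
snap_after = tracemalloc.take_snapshot()

# Show the ten call sites whose Python-level allocations grew the most.
for stat in snap_after.compare_to(snap_before, "lineno")[:10]:
    print(stat)
```

As far as I understand, tracemalloc only sees allocations made through the Python allocator, so memory held by PyTorch’s C++ side would not show up here, which may be why it told me nothing useful.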

You can find the code here: GitHub - TunguskaMed/ML_EX2. Unfortunately I haven’t written a README yet, but from main.py you can launch a training run with the hyperparameter combinations set in utils.py. The inputs are 96x96 color images.

Thanks in advance to everyone who tries to help! This is important to me, as it is for a university project.