When training on GPU the loss does not decrease; on CPU it does

Thank you teddy, great