Am I not validating and testing my data correctly?

Are your testing dataset and validation dataset the same?

Yes: you have a bug in the code. Check your precision/accuracy/F1 implementation, or compare it against a reference implementation such as torchmetrics to make sure this part is correct.
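For example, a quick sanity check could compare a hand-rolled metric against torchmetrics. This is just a minimal sketch assuming a binary classification setup with probability outputs; the tensors below are made up for illustration:

```python
import torch
from torchmetrics.classification import BinaryAccuracy, BinaryF1Score

preds = torch.tensor([0.9, 0.2, 0.8, 0.4])  # model probabilities (toy values)
target = torch.tensor([1, 0, 1, 1])         # ground-truth labels (toy values)

# Hand-rolled accuracy -- the kind of code that is easy to get subtly wrong.
manual_acc = ((preds > 0.5).long() == target).float().mean()

# Reference values from torchmetrics.
ref_acc = BinaryAccuracy()(preds, target)
ref_f1 = BinaryF1Score()(preds, target)

print(f"manual acc: {manual_acc:.4f}, torchmetrics acc: {ref_acc:.4f}, f1: {ref_f1:.4f}")
assert torch.isclose(manual_acc, ref_acc), "custom accuracy disagrees with torchmetrics"
```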

No: even if your implementation is correct, this can still happen. Your validation set may be too close to the training data, so overfitting goes unnoticed during validation and shows up later as a high test loss / low test accuracy.
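If that is the case, it is worth double-checking that your splits are strictly disjoint and reproducible. A minimal sketch with torch.utils.data.random_split, using a toy TensorDataset as a stand-in for your real dataset:

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Toy dataset standing in for the real one (for illustration only).
full_dataset = TensorDataset(torch.randn(1000, 16), torch.randint(0, 2, (1000,)))

train_set, val_set, test_set = random_split(
    full_dataset,
    [800, 100, 100],                              # strictly disjoint 80/10/10 split
    generator=torch.Generator().manual_seed(42),  # fixed seed -> reproducible split
)
```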

I also noticed you are hard-coding the batch size to 64 when logging. This is error-prone: it is only correct if your DataLoader actually uses batch_size=64 and drop_last=True, otherwise the last, smaller batch gets weighted incorrectly. I suggest dropping the hardcoded value there so Lightning can infer the batch size directly.
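For reference, a minimal sketch of the two safe options (the module, model, and dataset here are hypothetical placeholders): either let Lightning infer the batch size from the batch, or pass the real size of the current batch instead of a constant.

```python
import torch
import pytorch_lightning as pl
import torchmetrics
from torch.utils.data import DataLoader, TensorDataset

class LitClassifier(pl.LightningModule):  # hypothetical names, for illustration only
    def __init__(self):
        super().__init__()
        self.model = torch.nn.Linear(16, 2)
        self.val_acc = torchmetrics.Accuracy(task="multiclass", num_classes=2)
        # Toy validation data standing in for the real dataset.
        self.val_set = TensorDataset(torch.randn(1000, 16), torch.randint(0, 2, (1000,)))

    def validation_step(self, batch, batch_idx):
        x, y = batch
        acc = self.val_acc(self.model(x), y)

        # Preferred: no batch_size argument; Lightning infers it from the batch.
        self.log("val_acc", acc)

        # Also fine: pass the real size of this batch instead of a constant 64.
        # self.log("val_acc", acc, batch_size=x.size(0))

    def val_dataloader(self):
        # Only if you keep a hard-coded batch_size=64 in self.log do you also need
        # drop_last=True here, otherwise the last (smaller) batch is mis-weighted.
        return DataLoader(self.val_set, batch_size=64, drop_last=True)
```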