I'm trying to understand the trainer.test function in PyTorch Lightning. Right now I can run:
trainer.test(model=model, datamodule=datamodule)
image_AUROC 1.0
image_BinaryAccuracy 0.923
...
For a batch of samples in datamodule, it computes the metric scores for all samples and returns the mean value. But I want to get the score of each sample separately. Is that possible out of the box with the API?
I am thinking of looping over the datamodule, calling model on each sample, and using model.metric to compute the evaluation. But I am not certain how model is set up to behave inside the trainer.test function. I went through the source code, but it is not easy to figure out. For example, should I call model.eval, or use the with torch.no_grad context?
model.eval()
with torch.no_grad():
    for batch in datamodule.test_dataloader():
        map2d, logit = model(batch['image'])  # [1, 1, 224, 224]
        eval_metrics = model.image_metrics(batch['gt'], logit)
        ....
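
To make the idea concrete, here is a minimal, self-contained sketch of the manual loop in plain PyTorch. It does not use the Lightning API, and it is not what trainer.test does internally. The names toy_model, toy_loader, and per_sample_scores are hypothetical stand-ins for the real model and datamodule.test_dataloader(), and the sigmoid with a 0.5 threshold is an assumption, not the behaviour of model.image_metrics.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-ins for the real model and datamodule.test_dataloader().
torch.manual_seed(0)
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(16, 1))
images = torch.randn(6, 1, 4, 4)
labels = torch.tensor([0, 1, 0, 1, 1, 0])
toy_loader = DataLoader(TensorDataset(images, labels), batch_size=2)


def per_sample_scores(model, loader):
    """Return one record per sample instead of a single averaged metric."""
    was_training = model.training
    model.eval()  # switch dropout / batch-norm layers to evaluation behaviour
    rows = []
    with torch.no_grad():  # do not build an autograd graph while evaluating
        for image_batch, label_batch in loader:
            logits = model(image_batch).squeeze(1)  # shape: [batch]
            probs = torch.sigmoid(logits)
            preds = (probs > 0.5).long()  # assumed threshold
            correct = preds == label_batch
            for p, y, c in zip(probs.tolist(), label_batch.tolist(), correct.tolist()):
                rows.append({"prob": p, "label": y, "correct": c})
    model.train(was_training)  # restore whatever mode the model was in
    return rows


rows = per_sample_scores(toy_model, toy_loader)
mean_accuracy = sum(r["correct"] for r in rows) / len(rows)
```

Each entry in rows holds the score for one sample, and averaging the "correct" field recovers a single accuracy number. A rank-based metric such as AUROC is defined over a set of samples containing both classes, so it has no meaningful per-sample value. Only threshold-style scores like correctness or the raw probability can be reported this way.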