Visual Information Fidelity (VIF)
Module Interface
- class torchmetrics.image.VisualInformationFidelity(sigma_n_sq=2.0, reduction='mean', **kwargs)
Compute Pixel-Based Visual Information Fidelity (VIF-P).
As input to forward and update the metric accepts the following input:
- preds (Tensor): Predictions from model of shape (N,C,H,W) with H,W ≥ 41
- target (Tensor): Ground truth values of shape (N,C,H,W) with H,W ≥ 41
As output of forward and compute the metric returns the following output:
- vif-p (Tensor): If reduction='mean' (default), returns a scalar tensor with the mean VIF score. If reduction='none', returns a tensor of shape (N,) with VIF values per sample.
- Parameters:
  sigma_n_sq (float) – Variance of the visual noise. Default: 2.0.
  reduction (Literal['mean', 'none']) – The reduction method for aggregating scores.
    'mean': return the average VIF across the batch.
    'none': return a VIF score for each sample in the batch.
  kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.
Example
>>> import torch
>>> from torch import randn
>>> from torchmetrics.image import VisualInformationFidelity
>>> preds = randn([32, 3, 41, 41], generator=torch.Generator().manual_seed(42))
>>> target = randn([32, 3, 41, 41], generator=torch.Generator().manual_seed(43))
>>> vif_mean = VisualInformationFidelity(reduction='mean')
>>> vif_mean(preds, target)
tensor(0.0032)
>>> vif_none = VisualInformationFidelity(reduction='none')
>>> vif_none(preds, target)
tensor([0.0040, 0.0049, 0.0017, 0.0039, 0.0041, 0.0043, 0.0030, 0.0028,
        0.0012, 0.0067, 0.0010, 0.0014, 0.0030, 0.0048, 0.0050, 0.0038,
        0.0037, 0.0025, 0.0041, 0.0019, 0.0007, 0.0034, 0.0037, 0.0016,
        0.0026, 0.0021, 0.0038, 0.0033, 0.0031, 0.0020, 0.0036, 0.0057])
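The relation between the two reduction modes can be checked by hand from the numbers printed in the example above. The sketch below assumes that reduction='mean' is the plain arithmetic mean of the per-sample scores, as the parameter description suggests. It uses only the standard library and the rounded values shown in the example, so it confirms agreement at display precision only.

```python
from statistics import fmean

# Per-sample scores copied from the reduction='none' output in the example above
# (already rounded to four decimals by the tensor printout).
per_sample = [
    0.0040, 0.0049, 0.0017, 0.0039, 0.0041, 0.0043, 0.0030, 0.0028,
    0.0012, 0.0067, 0.0010, 0.0014, 0.0030, 0.0048, 0.0050, 0.0038,
    0.0037, 0.0025, 0.0041, 0.0019, 0.0007, 0.0034, 0.0037, 0.0016,
    0.0026, 0.0021, 0.0038, 0.0033, 0.0031, 0.0020, 0.0036, 0.0057,
]

# Assumption: 'mean' averages the per-sample scores over the batch dimension.
batch_mean = fmean(per_sample)
print(f"{batch_mean:.4f}")  # 0.0032, matching the reduction='mean' output
```

Because the inputs are rounded, a small mismatch in the last digit would not by itself indicate a different reduction rule.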
Functional Interface
- torchmetrics.functional.image.visual_information_fidelity(preds, target, sigma_n_sq=2.0, reduction='mean')
Compute Pixel-Based Visual Information Fidelity (VIF-P).
VIF is a full-reference metric that measures the amount of visual information preserved in a distorted image compared to the reference image.
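For orientation, the pixel-domain VIF ratio is commonly written as below. This is the widely cited formulation, not something stated on this page, and whether the library follows it term for term is an assumption. Here x is the reference (target), y is the distorted image (preds), j indexes scales, i indexes local windows, and the noise variance corresponds to the sigma_n_sq argument.

```latex
\mathrm{VIF\text{-}P}
  = \frac{\displaystyle\sum_{j}\sum_{i}
      \log\!\left(1 + \frac{g_{j,i}^{2}\,\sigma_{x,j,i}^{2}}{\sigma_{v,j,i}^{2} + \sigma_{n}^{2}}\right)}
         {\displaystyle\sum_{j}\sum_{i}
      \log\!\left(1 + \frac{\sigma_{x,j,i}^{2}}{\sigma_{n}^{2}}\right)},
\qquad
g_{j,i} = \frac{\sigma_{xy,j,i}}{\sigma_{x,j,i}^{2}},
\qquad
\sigma_{v,j,i}^{2} = \sigma_{y,j,i}^{2} - g_{j,i}\,\sigma_{xy,j,i}
```

Under this reading, the numerator estimates the information the distorted image retains about the reference and the denominator the information in the reference itself, so identical inputs give a ratio of 1 and heavier distortion pushes it toward 0. Practical implementations typically add a small constant to the denominators and clamp negative variances.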
- Parameters:
  preds (Tensor) – Predicted images of shape (N, C, H, W). Height and width must be at least 41.
  target (Tensor) – Ground truth images of shape (N, C, H, W). Must match preds in shape.
  sigma_n_sq (float) – Variance of the visual noise. Default: 2.0.
  reduction (Literal['mean', 'none']) – Method for reducing the metric across the batch.
    "mean": Return the average over the batch as a scalar tensor.
    "none": Return a VIF score for each sample as a 1D tensor of shape (N,).
- Returns:
  VIF score(s). The shape depends on the reduction argument:
  If reduction="mean", returns a scalar tensor.
  If reduction="none", returns a tensor of shape (N,).
- Return type:
  Tensor
- Raises:
  ValueError – If input dimensions are smaller than 41x41.
  ValueError – If preds and target shapes don't match.
  ValueError – If reduction is not "mean" or "none".
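The three documented error conditions can be mirrored in a small pre-check that runs before any tensors reach the metric. The helper below is hypothetical: check_vif_inputs and MIN_SIDE are illustrative names, not part of torchmetrics, and the order of the checks and the error messages are assumptions. It works on shape tuples only, so it needs no tensor library.

```python
from typing import Sequence

MIN_SIDE = 41  # minimum height and width stated in the documentation above


def check_vif_inputs(
    preds_shape: Sequence[int],
    target_shape: Sequence[int],
    reduction: str = "mean",
) -> None:
    """Hypothetical pre-check mirroring the documented ValueError conditions."""
    preds_shape, target_shape = tuple(preds_shape), tuple(target_shape)
    if preds_shape != target_shape:
        raise ValueError(f"preds and target shapes differ: {preds_shape} vs {target_shape}")
    # Assumes (N, C, H, W) layout, so height and width are the last two dims.
    height, width = preds_shape[-2], preds_shape[-1]
    if height < MIN_SIDE or width < MIN_SIDE:
        raise ValueError(f"inputs must be at least {MIN_SIDE}x{MIN_SIDE}, got {height}x{width}")
    if reduction not in ("mean", "none"):
        raise ValueError(f"reduction must be 'mean' or 'none', got {reduction!r}")


# Usage: a valid call passes silently; an undersized input raises.
check_vif_inputs((4, 3, 41, 41), (4, 3, 41, 41), "none")
try:
    check_vif_inputs((4, 3, 40, 64), (4, 3, 40, 64))
except ValueError as err:
    print(err)
```

With a tensor at hand, passing tuple(preds.shape) and tuple(target.shape) would be one way to call it.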
Example
>>> import torch
>>> from torchmetrics.functional.image import visual_information_fidelity
>>> preds = torch.randn(4, 3, 41, 41, generator=torch.Generator().manual_seed(42))
>>> target = torch.randn(4, 3, 41, 41, generator=torch.Generator().manual_seed(43))
>>> visual_information_fidelity(preds, target, reduction="none")
tensor([0.0040, 0.0049, 0.0017, 0.0039])