# Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

**Note: we move fast, but we still preserve backward compatibility for 0.1 version (one feature release).**

## [UnReleased] - 2022-MM-DD

### [UnReleased] - Added

### [UnReleased] - Changed

Calculate text color of ConfusionMatrix plot based on luminance
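As an illustration of the luminance-based text color change, here is a minimal sketch in plain Python. It is not the library's implementation: the function names `relative_luminance` and `text_color_for`, the Rec. 709 channel weights and the 0.5 threshold are all assumptions chosen for this example.

```python
def relative_luminance(rgb):
    """Approximate perceived brightness of an (r, g, b) color with channels in [0, 1]."""
    r, g, b = rgb
    # Rec. 709 luma weights; gamma correction is deliberately skipped in this sketch.
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def text_color_for(background_rgb, threshold=0.5):
    """Pick black text for light cells and white text for dark cells."""
    return "black" if relative_luminance(background_rgb) > threshold else "white"


if __name__ == "__main__":
    for cell in [(1.0, 1.0, 0.0), (0.0, 0.0, 1.0)]:
        print(cell, "->", text_color_for(cell))
```

The idea is that a fixed text color becomes unreadable on part of a colormap, while a luminance check keeps the contrast acceptable on both light and dark cells.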

### [UnReleased] - Removed

### [UnReleased] - Fixed

Fixed bug in `MetricCollection` when using compute groups and `compute` is called more than once (#2571)

Fixed class order of `panoptic_quality(..., return_per_class=True)` output (#2548)

Fixed `BootstrapWrapper` not being reset correctly (#2574)

Fixed integration between `ClasswiseWrapper` and `MetricCollection` with custom `_filter_kwargs` method (#2575)

## [1.4.0] - 2024-05-03

### [1.4.0] - Added

Added `SensitivityAtSpecificity` metric to classification subpackage (#2217)

Added `QualityWithNoReference` metric to image subpackage (#2288)

Added a new segmentation metric:

Added support for calculating segmentation quality and recognition quality in `PanopticQuality` metric (#2381)

Added `pretty-errors` for improving error prints (#2431)

Added support for `torch.float` weighted networks for FID and KID calculations (#2483)

Added `zero_division` argument to selected classification metrics (#2198)
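To show what a `zero_division` argument typically controls, here is a small pure-Python sketch. The function `precision_with_zero_division` and its signature are hypothetical and only illustrate the idea; they are not the library's API.

```python
def precision_with_zero_division(true_positives, false_positives, zero_division=0.0):
    """Precision = TP / (TP + FP), with a configurable fallback for an empty denominator."""
    denominator = true_positives + false_positives
    if denominator == 0:
        # No positive predictions were made, so precision is undefined;
        # return the caller-chosen fallback instead of dividing by zero.
        return zero_division
    return true_positives / denominator


if __name__ == "__main__":
    print(precision_with_zero_division(3, 1))
    print(precision_with_zero_division(0, 0, zero_division=1.0))
```

Making the fallback explicit lets the caller decide whether an undefined score should count as the worst case, the best case, or something else.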

### [1.4.0] - Changed

Made `__getattr__` and `__setattr__` of `ClasswiseWrapper` more general (#2424)

### [1.4.0] - Fixed

Fixed getitem for metric collection when prefix/postfix is set (#2430)

Fixed axis names with Precision-Recall curve (#2462)

Fixed list synchronization with partly empty lists (#2468)

Fixed memory leak in metrics using list states (#2492)

Fixed bug in computation of `ERGAS` metric (#2498)

Fixed `BootStrapper` wrapper not working with `kwargs` provided argument (#2503)

Fixed warnings being suppressed in `MeanAveragePrecision` when requested (#2501)

Fixed corner-case in `binary_average_precision` when only negative samples are provided (#2507)

## [1.3.2] - 2024-03-18

### [1.3.2] - Fixed

Fixed negative variance estimates in certain image metrics (#2378)

Fixed dtype being changed by deepspeed for certain regression metrics (#2379)

Fixed plotting of metric collection when prefix/postfix is set (#2429)

Fixed bug when `top_k>1` and `average="macro"` for classification metrics (#2423)

Fixed case where label prediction tensors in classification metrics were not validated correctly (#2427)

Fixed how auc scores are calculated in `PrecisionRecallCurve.plot` methods (#2437)

## [1.3.1] - 2024-02-12

### [1.3.1] - Fixed

Fixed how backprop is handled in `LPIPS` metric (#2326)

Fixed `MultitaskWrapper` not being able to be logged in lightning when using metric collections (#2349)

Fixed high memory consumption in `Perplexity` metric (#2346)

Fixed cached network in `FeatureShare` not being moved to the correct device (#2348)

Fixed naming of statistics in `MeanAveragePrecision` with custom max det thresholds (#2367)

Fixed custom aggregation in retrieval metrics (#2364)

Fixed initialization of aggregation metrics with default floating type (#2366)

Fixed plotting of confusion matrices (#2358)

## [1.3.0] - 2024-01-10

### [1.3.0] - Added

Added more tokenizers for `SacreBLEU` metric (#2068)

Added support for logging `MultitaskWrapper` directly with lightning's `log_dict` method (#2213)

Added `FeatureShare` wrapper to share submodules containing feature extractors between metrics (#2120)

Added new metrics to image domain:

Added `average` argument to multiclass versions of `PrecisionRecallCurve` and `ROC` (#2084)

Added confidence scores when `extended_summary=True` in `MeanAveragePrecision` (#2212)

Added `RetrievalAUROC` metric (#2251)

Added `aggregate` argument to retrieval metrics (#2220)

Added utility functions in `segmentation.utils` for future segmentation metrics (#2105)

### [1.3.0] - Changed

### [1.3.0] - Deprecated

### [1.3.0] - Fixed

## [1.2.1] - 2023-11-30

### [1.2.1] - Added

### [1.2.1] - Changed

### [1.2.1] - Removed

Removed unused `lpips` third-party package as dependency of `LearnedPerceptualImagePatchSimilarity` metric (#2230)

### [1.2.1] - Fixed

Fixed numerical stability bug in `LearnedPerceptualImagePatchSimilarity` metric (#2144)

Fixed numerical stability issue in `UniversalImageQualityIndex` metric (#2222)

Fixed incompatibility for `MeanAveragePrecision` with `pycocotools` backend when too few `max_detection_thresholds` are provided (#2219)

Fixed support for half precision in Perplexity metric (#2235)

Fixed device and dtype for `LearnedPerceptualImagePatchSimilarity` functional metric (#2234)

Fixed bug in `Metric._reduce_states(...)` when using `dist_sync_fn="cat"` (#2226)

Fixed bug in `CosineSimilarity` where 2d is expected but 1d input was given (#2241)

Fixed bug in `MetricCollection` when using compute groups and `compute` is called more than once (#2211)

## [1.2.0] - 2023-09-22

### [1.2.0] - Added

Added metrics to cluster package: `MutualInformationScore` (#2008), `RandScore` (#2025), `NormalizedMutualInfoScore` (#2029), `AdjustedRandScore` (#2032), `CalinskiHarabaszScore` (#2036), `DunnIndex` (#2049), `HomogeneityScore` (#2053), `CompletenessScore` (#2053), `VMeasureScore` (#2053), `FowlkesMallowsIndex` (#2066), `AdjustedMutualInfoScore` (#2058), `DaviesBouldinScore` (#2071)

Added `backend` argument to `MeanAveragePrecision` (#2034)

## [1.1.2] - 2023-09-11

### [1.1.2] - Fixed

Fixed tie breaking in ndcg metric (#2031)

Fixed bug in `BootStrapper` when very few samples were evaluated that could lead to crash (#2052)

Fixed bug when creating multiple plots that led to not all plots being shown (#2060)

Fixed performance issues in `RecallAtFixedPrecision` for large batch sizes (#2042)

Fixed bug related to `MetricCollection` used with custom metrics that have `prefix`/`postfix` attributes (#2070)

## [1.1.1] - 2023-08-29

### [1.1.1] - Added

Added `average` argument to `MeanAveragePrecision` (#2018)

### [1.1.1] - Fixed

Fixed bug in `PearsonCorrCoef` when it is updated on single samples at a time (#2019)

Fixed support for pixel-wise MSE (#2017)

Fixed bug in `MetricCollection` when used with multiple metrics that return dicts with same keys (#2027)

Fixed bug in detection intersection metrics when `class_metrics=True` resulting in wrong values (#1924)

Fixed missing attributes `higher_is_better`, `is_differentiable` for some metrics (#2028)

## [1.1.0] - 2023-08-22

### [1.1.0] - Added

Added source aggregated signal-to-distortion ratio (SA-SDR) metric (#1882)

Added `VisualInformationFidelity` to image package (#1830)

Added `EditDistance` to text package (#1906)

Added `top_k` argument to `RetrievalMRR` in retrieval package (#1961)

Added support for evaluating `"segm"` and `"bbox"` detection in `MeanAveragePrecision` at the same time (#1928)

Added `PerceptualPathLength` to image package (#1939)

Added support for multioutput evaluation in `MeanSquaredError` (#1937)

Added argument `extended_summary` to `MeanAveragePrecision` such that precision, recall, iou can be easily returned (#1983)

Added warning to `CLIPScore` if long captions are detected and truncated (#2001)

Added `CLIPImageQualityAssessment` to multimodal package (#1931)

Added new property `metric_state` to all metrics for users to investigate currently stored tensors in memory (#2006)

## [1.0.3] - 2023-08-08

### [1.0.3] - Added

Added warning to `MeanAveragePrecision` if too many detections are observed (#1978)

### [1.0.3] - Fixed

## [1.0.2] - 2023-08-02

### [1.0.2] - Added

Added warning to `PearsonCorrCoef` if input has a very small variance for its given dtype (#1926)

### [1.0.2] - Changed

Changed all non-task specific classification metrics to be true subtypes of `Metric` (#1963)

### [1.0.2] - Fixed

Fixed bug in `CalibrationError` where calculations for double precision input were performed in float precision (#1919)

Fixed bug related to the `prefix/postfix` arguments in `MetricCollection` and `ClasswiseWrapper` being duplicated (#1918)

Fixed missing AUC score when plotting classification metrics that support the `score` argument (#1948)

## [1.0.1] - 2023-07-13

### [1.0.1] - Fixed

Fixed corner case when using `MetricCollection` together with aggregation metrics (#1896)

Fixed the use of `max_fpr` in `AUROC` metric when only one class is present (#1895)

Fixed bug related to empty predictions for `IntersectionOverUnion` metric (#1892)

Fixed bug related to `MeanMetric` and broadcasting of weights when NaNs are present (#1898)

Fixed bug related to expected input format of pycoco in `MeanAveragePrecision` (#1913)

## [1.0.0] - 2023-07-04

### [1.0.0] - Added

Added `prefix` and `postfix` arguments to `ClasswiseWrapper` (#1866)

Added speech-to-reverberation modulation energy ratio (SRMR) metric (#1792, #1872)

Added new global arg `compute_with_cache` to control caching behaviour after `compute` method (#1754)

Added `ComplexScaleInvariantSignalNoiseRatio` for audio package (#1785)

Added `Running` wrapper for calculating running statistics (#1752)

Added `RelativeAverageSpectralError` and `RootMeanSquaredErrorUsingSlidingWindow` to image package (#816)

Added support for `SpecificityAtSensitivity` Metric (#1432)

Added support for plotting of metrics through `.plot()` method (#1328, #1481, #1480, #1490, #1581, #1585, #1593, #1600, #1605, #1610, #1609, #1621, #1624, #1623, #1638, #1631, #1650, #1639, #1660, #1682, #1786)

Added support for plotting of audio metrics through `.plot()` method (#1434)

Added `classes` to output from `MAP` metric (#1419)

Added Binary group fairness metrics to classification package (#1404)

Added `MinkowskiDistance` to regression package (#1362)

Added `pairwise_minkowski_distance` to pairwise package (#1362)

Added `PSNRB` metric (#1421)

Added `ClassificationTask` Enum and use in metrics (#1479)

Added `ignore_index` option to `exact_match` metric (#1540)

Added parameter `top_k` to `RetrievalMAP` (#1501)

Added support for deterministic evaluation on GPU for metrics that use `torch.cumsum` operator (#1499)

Added support for plotting of aggregation metrics through `.plot()` method (#1485)

Added support for python 3.11 (#1612)

Added support for auto clamping of input for metrics that use the `data_range` argument ([#1606](https://github.com/Lightning-AI/metrics/pull/1606))

Added `ModifiedPanopticQuality` metric to detection package (#1627)

Added `PrecisionAtFixedRecall` metric to classification package (#1683)

Added multiple metrics to detection package (#1284): `IntersectionOverUnion`, `GeneralizedIntersectionOverUnion`, `CompleteIntersectionOverUnion`, `DistanceIntersectionOverUnion`

Added `MultitaskWrapper` to wrapper package (#1762)

Added `RelativeSquaredError` metric to regression package (#1765)

Added `MemorizationInformedFrechetInceptionDistance` metric to image package (#1580)
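For readers unfamiliar with the Minkowski distance mentioned in this release, the following is a minimal pure-Python sketch of the formula, `(sum(|x_i - y_i|^p))^(1/p)`. The function name `minkowski_distance` and its list-based signature are assumptions for illustration and do not reproduce the tensor-based library implementation.

```python
def minkowski_distance(preds, targets, p):
    """Minkowski distance of order p between two equally long sequences of numbers."""
    if p < 1:
        # Orders below 1 do not define a proper distance, so this sketch rejects them.
        raise ValueError(f"p must be >= 1, got {p}")
    if len(preds) != len(targets):
        raise ValueError("preds and targets must have the same length")
    total = sum(abs(a - b) ** p for a, b in zip(preds, targets))
    return total ** (1.0 / p)


if __name__ == "__main__":
    # p=1 is the Manhattan distance, p=2 the Euclidean distance.
    print(minkowski_distance([0.0, 0.0], [3.0, 4.0], p=1))
    print(minkowski_distance([0.0, 0.0], [3.0, 4.0], p=2))
```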

### [1.0.0] - Changed

Changed `permutation_invariant_training` to allow using a `'permutation-wise'` metric function (#1794)

Changed `update_count` and `update_called` from private to public methods (#1370)

Raise exception for invalid kwargs in Metric base class (#1427)

Extend `EnumStr` raising `ValueError` for invalid value (#1479)

Improve speed and memory consumption of binned `PrecisionRecallCurve` with large number of samples (#1493)

Changed `__iter__` method from raising `NotImplementedError` to `TypeError` by setting to `None` (#1538)

`FID` metric will now raise an error if too few samples are provided (#1655)

Allowed FID with `torch.float64` (#1628)

Changed `LPIPS` implementation to no longer rely on third-party package (#1575)

Changed FID matrix square root calculation from `scipy` to `torch` (#1708)

Changed calculation in `PearsonCorrCoef` to be more robust in certain cases (#1729)

Changed `MeanAveragePrecision` to `pycocotools` backend (#1832)

### [1.0.0] - Deprecated

### [1.0.0] - Removed

Support for python 3.7 (#1640)

### [1.0.0] - Fixed

Fixed support in `MetricTracker` for `MultioutputWrapper` and nested structures (#1608)

Fixed restrictive check in `PearsonCorrCoef` (#1649)

Fixed integration with `jsonargparse` and `LightningCLI` (#1651)

Fixed corner case in calibration error for zero confidence input (#1648)

Fixed precision-recall curve based computations for float target (#1642)

Fixed missing kwarg squeeze in `MultioutputWrapper` (#1675)

Fixed padding removal for 3d input in `MSSSIM` (#1674)

Fixed `max_det_threshold` in MAP detection (#1712)

Fixed states being saved in metrics that use `register_buffer` (#1728)

Fixed states not being correctly synced and device transferred in `MeanAveragePrecision` for `iou_type="segm"` (#1763)

Fixed use of `prefix` and `postfix` in nested `MetricCollection` (#1773)

Fixed `ax` plotting logging in `MetricCollection` (#1783)

Fixed lookup for punkt sources being downloaded in `RougeScore` (#1789)

Fixed integration with lightning for `CompositionalMetric` (#1761)

Fixed several bugs in `SpectralDistortionIndex` metric (#1808)

Fixed bug for corner cases in `MatthewsCorrCoef` (#1812, #1863)

Fixed support for half precision in `PearsonCorrCoef` (#1819)

Fixed a number of bugs related to `average="macro"` in classification metrics (#1821)

Fixed off-by-one issue when `ignore_index = num_classes + 1` in Multiclass-jaccard (#1860)

## [0.11.4] - 2023-03-10

### [0.11.4] - Fixed

Fixed evaluation of `R2Score` with near constant target (#1576)

Fixed dtype conversion when metric is submodule (#1583)

Fixed bug related to `top_k>1` and `ignore_index!=None` in `StatScores` based metrics (#1589)

Fixed corner case for `PearsonCorrCoef` when running in ddp mode but only on single device (#1587)

Fixed overflow error for specific cases in `MAP` when big areas are calculated (#1607)

## [0.11.3] - 2023-02-28

### [0.11.3] - Fixed

## [0.11.2] - 2023-02-21

### [0.11.2] - Fixed

## [0.11.1] - 2023-01-30

### [0.11.1] - Fixed

Fixed type checking on the `maximize` parameter at the initialization of `MetricTracker` (#1428)

Fixed mixed precision autocast for `SSIM` metric (#1454)

Fixed checking for `nltk.punkt` in `RougeScore` if a machine is not online (#1456)

Fixed wrongly reset method in `MultioutputWrapper` (#1460)

Fixed dtype checking in `PrecisionRecallCurve` for `target` tensor (#1457)

## [0.11.0] - 2022-11-30

### [0.11.0] - Added

Added `MulticlassExactMatch` to classification metrics (#1343)

Added `TotalVariation` to image package (#978)

Added `CLIPScore` to new multimodal package (#1314)

Added regression metrics:

Added new nominal metrics:

Added option to pass `distributed_available_fn` to metrics to allow checks for custom communication backend for making `dist_sync_fn` actually useful (#1301)

Added `normalize` argument to `Inception`, `FID`, `KID` metrics (#1246)

### [0.11.0] - Changed

### [0.11.0] - Removed

### [0.11.0] - Fixed

Fixed precision bug in `pairwise_euclidean_distance` (#1352)

## [0.10.3] - 2022-11-16

### [0.10.3] - Fixed

## [0.10.2] - 2022-10-31

### [0.10.2] - Changed

Changed in-place operation to out-of-place operation in `pairwise_cosine_similarity` (#1288)

### [0.10.2] - Fixed

Fixed high memory usage for certain classification metrics when `average='micro'` (#1286)

Fixed precision problems when `structural_similarity_index_measure` was used with autocast (#1291)

Fixed slow performance for confusion matrix based metrics (#1302)

Fixed restrictive dtype checking in `spearman_corrcoef` when used with autocast (#1303)

## [0.10.1] - 2022-10-21

### [0.10.1] - Fixed

## [0.10.0] - 2022-10-04

### [0.10.0] - Added

Added a new NLP metric `InfoLM` (#915)

Added `Perplexity` metric (#922)

Added `ConcordanceCorrCoef` metric to regression package (#1201)

Added argument `normalize` to `LPIPS` metric (#1216)

Added support for multiprocessing of batches in `PESQ` metric (#1227)

Added support for multioutput in `PearsonCorrCoef` and `SpearmanCorrCoef` (#1200)

### [0.10.0] - Changed

Classification refactor (#1054, #1143, #1145, #1151, #1159, #1163, #1167, #1175, #1189, #1197, #1215, #1195)

Changed update in `FID` metric to be done in online fashion to save memory (#1199)

Improved performance of retrieval metrics (#1242)

Changed `SSIM` and `MSSSIM` update to be online to reduce memory usage (#1231)

### [0.10.0] - Deprecated

Deprecated `BinnedAveragePrecision`, `BinnedPrecisionRecallCurve`, `BinnedRecallAtFixedPrecision` (#1163): `BinnedAveragePrecision` -> use `AveragePrecision` with `thresholds` arg; `BinnedPrecisionRecallCurve` -> use `PrecisionRecallCurve` with `thresholds` arg; `BinnedRecallAtFixedPrecision` -> use `RecallAtFixedPrecision` with `thresholds` arg

Renamed and refactored `LabelRankingAveragePrecision`, `LabelRankingLoss` and `CoverageError` (#1167): `LabelRankingAveragePrecision` -> `MultilabelRankingAveragePrecision`; `LabelRankingLoss` -> `MultilabelRankingLoss`; `CoverageError` -> `MultilabelCoverageError`

Deprecated `KLDivergence` and `AUC` from classification package (#1189): `KLDivergence` moved to `regression` package; instead of `AUC` use `torchmetrics.utilities.compute.auc`

### [0.10.0] - Fixed

## [0.9.3] - 2022-08-22

### [0.9.3] - Added

Added global option `sync_on_compute` to disable automatic synchronization when `compute` is called (#1107)

### [0.9.3] - Fixed

## [0.9.2] - 2022-06-29

### [0.9.2] - Fixed

Fixed mAP calculation for areas with 0 predictions (#1080)

Fixed bug where avg precision state and auroc state were not merged when using MetricCollections (#1086)

Skip box conversion if no boxes are present in `MeanAveragePrecision` (#1097)

Fixed inconsistency in docs and code when setting `average="none"` in `AveragePrecision` metric (#1116)

## [0.9.1] - 2022-06-08

### [0.9.1] - Added

### [0.9.1] - Fixed

## [0.9.0] - 2022-05-30

### [0.9.0] - Added

Added `RetrievalPrecisionRecallCurve` and `RetrievalRecallAtFixedPrecision` to retrieval package (#951)

Added class property `full_state_update` that determines whether `forward` should call `update` once or twice (#984, #1033)

Added support for nested metric collections (#1003)

Added `Dice` to classification package (#1021)

Added support to segmentation type `segm` as IOU for mean average precision (#822)

### [0.9.0] - Changed

Renamed `reduction` argument to `average` in Jaccard score and added additional options (#874)

### [0.9.0] - Removed

### [0.9.0] - Fixed

Fixed non-empty state dict for a few metrics (#1012)

Fixed bug when comparing states while finding compute groups (#1022)

Fixed `torch.double` support in stat score metrics (#1023)

Fixed `FID` calculation for non-equal size real and fake input (#1028)

Fixed case where `KLDivergence` could output `Nan` (#1030)

Fixed deterministic for PyTorch<1.8 (#1035)

Fixed default value for `mdmc_average` in `Accuracy` (#1036)

Fixed missing copy of property when using compute groups in `MetricCollection` (#1052)

## [0.8.2] - 2022-05-06

### [0.8.2] - Fixed

## [0.8.1] - 2022-04-27

### [0.8.1] - Changed

Reimplemented the `signal_distortion_ratio` metric, which removed the absolute requirement of `fast-bss-eval` (#964)

### [0.8.1] - Fixed

## [0.8.0] - 2022-04-14

### [0.8.0] - Added

Added `WeightedMeanAbsolutePercentageError` to regression package (#948)

Added new classification metrics:

Added new image metric:

Added support for `MetricCollection` in `MetricTracker` (#718)

Added support for 3D image and uniform kernel in `StructuralSimilarityIndexMeasure` (#818)

Added smart update of `MetricCollection` (#709)

Added `ClasswiseWrapper` for better logging of classification metrics with multiple output values (#832)

Added `**kwargs` argument for passing additional arguments to base class (#833)

Added negative `ignore_index` for the Accuracy metric (#362)

Added `adaptive_k` for the `RetrievalPrecision` metric (#910)

Added `reset_real_features` argument to image quality assessment metrics (#722)

Added new keyword argument `compute_on_cpu` to all metrics (#867)

### [0.8.0] - Changed

Made `num_classes` in `jaccard_index` a required argument (#853, #914)

Added normalizer, tokenizer to ROUGE metric (#838)

Improved shape checking of `permutation_invariant_training` (#864)

Allowed reduction `None` (#891)

`MetricTracker.best_metric` will now give a warning when computing on metrics that do not have a best (#913)

### [0.8.0] - Deprecated

### [0.8.0] - Removed

Removed support for versions of Pytorch-Lightning lower than v1.5 (#788)

Removed deprecated functions and warnings in Text (#773): `WER` and `functional.wer`

Removed deprecated functions and warnings in Image (#796): `SSIM` and `functional.ssim`; `PSNR` and `functional.psnr`

Removed deprecated functions and warnings in classification and regression (#806): `FBeta` and `functional.fbeta`; `F1` and `functional.f1`; `Hinge` and `functional.hinge`; `IoU` and `functional.iou`; `MatthewsCorrcoef`; `PearsonCorrcoef`; `SpearmanCorrcoef`

Removed deprecated functions and warnings in detection and pairwise (#804): `MAP` and `functional.pairwise.manhatten`

Removed deprecated functions and warnings in Audio (#805): `PESQ` and `functional.audio.pesq`; `PIT` and `functional.audio.pit`; `SDR`, `functional.audio.sdr` and `functional.audio.si_sdr`; `SNR`, `functional.audio.snr` and `functional.audio.si_snr`; `STOI` and `functional.audio.stoi`

Removed unused `get_num_classes` from `torchmetrics.utilities.data` (#914)

### [0.8.0] - Fixed

## [0.7.3] - 2022-03-23

### [0.7.3] - Fixed

Fixed unsafe log operation in `TweedieDevianceScore` for power=1 (#847)

Fixed bug in MAP metric related to either no ground truth or no predictions (#884)

Fixed `ConfusionMatrix`, `AUROC` and `AveragePrecision` on GPU when running in deterministic mode (#900)

Fixed NaN or Inf results returned by `signal_distortion_ratio` (#899)

Fixed memory leak when using `update` method with tensor where `requires_grad=True` (#902)

## [0.7.2] - 2022-02-10

### [0.7.2] - Fixed

Minor patches in JOSS paper.

## [0.7.1] - 2022-02-03

### [0.7.1] - Changed

### [0.7.1] - Fixed

## [0.7.0] - 2022-01-17

### [0.7.0] - Added

Added NLP metrics:

Added `MultiScaleSSIM` into image metrics (#679)

Added Signal to Distortion Ratio (`SDR`) to audio package (#565)

Added `MinMaxMetric` to wrappers (#556)

Added `ignore_index` to retrieval metrics (#676)

Added support for multi references in `ROUGEScore` (#680)

Added a default VSCode devcontainer configuration (#621)

### [0.7.0] - Changed

Scalar metrics will now consistently have additional dimensions squeezed (#622)

Metrics having third party dependencies removed from global import (#463)

`BLEUScore` now takes untokenized input to stay consistent with all the other text metrics (#640)

Reordered arguments for `TER`, `BLEUScore`, `SacreBLEUScore`, `CHRFScore`, which now expect predictions first and target second (#696)

Changed dtype of metric state from `torch.float` to `torch.long` in `ConfusionMatrix` to accommodate larger values (#715)

Unified `preds`, `target` input argument naming across all text metrics (#723, #727): `bert`, `bleu`, `chrf`, `sacre_bleu`, `wip`, `wil`, `cer`, `ter`, `wer`, `mer`, `rouge`, `squad`

### [0.7.0] - Deprecated

Renamed IoU -> Jaccard Index (#662)

Renamed text WER metric (#714): `functional.wer` -> `functional.word_error_rate`; `WER` -> `WordErrorRate`

Renamed correlation coefficient classes (#710): `MatthewsCorrcoef` -> `MatthewsCorrCoef`; `PearsonCorrcoef` -> `PearsonCorrCoef`; `SpearmanCorrcoef` -> `SpearmanCorrCoef`

Renamed audio STOI metric (#753, #758): `audio.STOI` -> `audio.ShortTimeObjectiveIntelligibility`; `functional.audio.stoi` -> `functional.audio.short_time_objective_intelligibility`

Renamed audio PESQ metrics (#751): `functional.audio.pesq` -> `functional.audio.perceptual_evaluation_speech_quality`; `audio.PESQ` -> `audio.PerceptualEvaluationSpeechQuality`

Renamed audio SDR metrics (#711): `functional.sdr` -> `functional.signal_distortion_ratio`; `functional.si_sdr` -> `functional.scale_invariant_signal_distortion_ratio`; `SDR` -> `SignalDistortionRatio`; `SI_SDR` -> `ScaleInvariantSignalDistortionRatio`

Renamed audio SNR metrics (#712): `functional.snr` -> `functional.signal_noise_ratio`; `functional.si_snr` -> `functional.scale_invariant_signal_noise_ratio`; `SNR` -> `SignalNoiseRatio`; `SI_SNR` -> `ScaleInvariantSignalNoiseRatio`

Renamed F-score metrics (#731, #740): `functional.f1` -> `functional.f1_score`; `F1` -> `F1Score`; `functional.fbeta` -> `functional.fbeta_score`; `FBeta` -> `FBetaScore`

Renamed Hinge metric (#734): `functional.hinge` -> `functional.hinge_loss`; `Hinge` -> `HingeLoss`

Renamed image PSNR metrics (#732): `functional.psnr` -> `functional.peak_signal_noise_ratio`; `PSNR` -> `PeakSignalNoiseRatio`

Renamed audio PIT metric (#737): `functional.pit` -> `functional.permutation_invariant_training`; `PIT` -> `PermutationInvariantTraining`

Renamed image SSIM metric (#747): `functional.ssim` -> `functional.structural_similarity_index_measure`; `SSIM` -> `StructuralSimilarityIndexMeasure`

Renamed detection `MAP` to `MeanAveragePrecision` metric (#754)

Renamed Fidelity & LPIPS image metric (#752): `image.FID` -> `image.FrechetInceptionDistance`; `image.KID` -> `image.KernelInceptionDistance`; `image.LPIPS` -> `image.LearnedPerceptualImagePatchSimilarity`

### [0.7.0] - Removed

### [0.7.0] - Fixed

Fixed MetricCollection kwargs filtering when no `kwargs` are present in update signature (#707)

## [0.6.2] - 2021-12-15

### [0.6.2] - Fixed

## [0.6.1] - 2021-12-06

### [0.6.1] - Changed

### [0.6.1] - Fixed

## [0.6.0] - 2021-10-28

### [0.6.0] - Added

Added audio metrics:

Added Information retrieval metrics:

Added NLP metrics:

Added other metrics:

Added `MAP` (mean average precision) metric to new detection package (#467)

Added support for float targets in `nDCG` metric (#437)

Added `average` argument to `AveragePrecision` metric for reducing multi-label and multi-class problems (#477)

Added `MultioutputWrapper` (#510)

Added metric sweeping:

Added simple aggregation metrics: `SumMetric`, `MeanMetric`, `CatMetric`, `MinMetric`, `MaxMetric` (#506)

Added pairwise submodule with metrics (#553): `pairwise_cosine_similarity`, `pairwise_euclidean_distance`, `pairwise_linear_similarity`, `pairwise_manhatten_distance`

### [0.6.0] - Changed

`AveragePrecision` will now output the `macro` average by default for multilabel and multiclass problems (#477)

`half`, `double`, `float` will no longer change the dtype of the metric states. Use `metric.set_dtype` instead (#493)

Renamed `AverageMeter` to `MeanMetric` (#506)

Changed `is_differentiable` from property to a constant attribute (#551)

`ROC` and `AUROC` will no longer throw an error when either the positive or negative class is missing. Instead, they return a score of 0 and give a warning

### [0.6.0] - Deprecated

Deprecated `functional.self_supervised.embedding_similarity` in favour of new pairwise submodule

### [0.6.0] - Removed

Removed `dtype` property (#493)

### [0.6.0] - Fixed

Fixed bug in `F1` with `average='macro'` and `ignore_index!=None` (#495)

Fixed bug in `pit` by using the returned first result to initialize device and type (#533)

Fixed `SSIM` metric using too much memory (#539)

Fixed bug where `device` property was not properly updated when metric was a child of a module (#542)

## [0.5.1] - 2021-08-30

### [0.5.1] - Added

### [0.5.1] - Changed

Added support for float targets in `nDCG` metric (#437)

### [0.5.1] - Removed

### [0.5.1] - Fixed

Fixed ranking of samples in `SpearmanCorrCoef` metric (#448)

Fixed bug where compositional metrics were unable to sync because of type mismatch (#454)

Fixed metric hashing (#478)

Fixed `BootStrapper` metrics not working on GPU (#462)

Fixed the semantic ordering of kernel height and width in `SSIM` metric (#474)

## [0.5.0] - 2021-08-09

### [0.5.0] - Added

Added **Text-related (NLP) metrics**:

Added `MetricTracker` wrapper metric for keeping track of the same metric over multiple epochs (#238)

Added other metrics:

Added support in `nDCG` metric for target with values larger than 1 (#349)

Added support for negative targets in `nDCG` metric (#378)

Added `None` as reduction option in `CosineSimilarity` metric (#400)

Allowed passing labels in (n_samples, n_classes) to `AveragePrecision` (#386)
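The idea behind a tracker that follows the same metric over multiple epochs can be sketched in a few lines of plain Python. The class `EpochTracker` below is hypothetical and far simpler than the `MetricTracker` wrapper; it only records one value per epoch and reports the best one, with a `maximize` flag deciding whether higher or lower is better.

```python
class EpochTracker:
    """Record one metric value per epoch and report the best value seen so far."""

    def __init__(self, maximize=True):
        self.maximize = maximize
        self.history = []

    def log(self, value):
        """Store the metric value computed at the end of an epoch."""
        self.history.append(value)

    def best(self):
        """Return (best_value, epoch_index); the earliest epoch wins on ties."""
        if not self.history:
            raise ValueError("no values have been logged yet")
        pick = max if self.maximize else min
        value = pick(self.history)
        return value, self.history.index(value)


if __name__ == "__main__":
    tracker = EpochTracker(maximize=True)
    for accuracy in [0.61, 0.74, 0.70]:
        tracker.log(accuracy)
    print(tracker.best())
```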

### [0.5.0] - Changed

Moved `psnr` and `ssim` from `functional.regression.*` to `functional.image.*` (#382)

Moved `image_gradient` from `functional.image_gradients` to `functional.image.gradients` (#381)

Moved `R2Score` from `regression.r2score` to `regression.r2` (#371)

Pearson metric now only stores 6 statistics instead of all predictions and targets (#380)

Use `torch.argmax` instead of `torch.topk` when `k=1` for better performance (#419)

Moved check for number of samples in R2 score to support single sample updating (#426)
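The Pearson change above replaces stored predictions and targets with a fixed number of running statistics. The sketch below shows the general technique in plain Python using six raw sums. This choice of statistics is an assumption made for clarity: the library may track a different, more numerically stable set, and the class name `StreamingPearson` is hypothetical.

```python
import math


class StreamingPearson:
    """Pearson correlation from six running sums instead of the full data."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xx = 0.0
        self.sum_yy = 0.0
        self.sum_xy = 0.0

    def update(self, xs, ys):
        """Fold a batch into the running sums; memory use stays constant."""
        for x, y in zip(xs, ys):
            self.n += 1
            self.sum_x += x
            self.sum_y += y
            self.sum_xx += x * x
            self.sum_yy += y * y
            self.sum_xy += x * y

    def compute(self):
        """Return the correlation, or NaN when either input has zero variance."""
        cov = self.n * self.sum_xy - self.sum_x * self.sum_y
        var_x = self.n * self.sum_xx - self.sum_x ** 2
        var_y = self.n * self.sum_yy - self.sum_y ** 2
        if var_x <= 0 or var_y <= 0:
            return float("nan")
        return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    metric = StreamingPearson()
    metric.update([1.0, 2.0], [2.0, 4.0])
    metric.update([3.0], [6.0])
    print(metric.compute())
```

Raw sums are easy to read but can lose precision for large values, which is one reason a production implementation would likely prefer running means and variances.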

### [0.5.0] - Deprecated

### [0.5.0] - Removed

Removed restriction that `threshold` has to be in (0,1) range to support logit input (#351, #401)

Removed restriction that `preds` could not be bigger than `num_classes` to support logit input (#357)

Removed modules `regression.psnr` and `regression.ssim` (#382)

Removed (#379): function `functional.mean_relative_error`; `num_thresholds` argument in `BinnedPrecisionRecallCurve`

### [0.5.0] - Fixed

Fixed bug where classification metrics with `average='macro'` would lead to wrong result if a class was missing (#303)

Fixed `weighted`, `multi-class` AUROC computation to allow for 0 observations of some class, as contribution to final AUROC is 0 (#376)

Fixed that `_forward_cache` and `_computed` attributes are also moved to the correct device if metric is moved (#413)

Fixed calculation in `IoU` metric when using `ignore_index` argument (#328)

## [0.4.1] - 2021-07-05

### [0.4.1] - Changed

### [0.4.1] - Fixed

Fixed DDP by adding `is_sync` logic to `Metric` (#339)

## [0.4.0] - 2021-06-29

### [0.4.0] - Added

Added **Image-related metrics**:

Added **Audio metrics**: SNR, SI_SDR, SI_SNR (#292)

Added other metrics:

Added `add_metrics` method to `MetricCollection` for adding additional metrics after initialization (#221)

Added pre-gather reduction in the case of `dist_reduce_fx="cat"` to reduce communication cost (#217)

Added better error message for `AUROC` when `num_classes` is not provided for multiclass input (#244)

Added support for unnormalized scores (e.g. logits) in `Accuracy`, `Precision`, `Recall`, `FBeta`, `F1`, `StatScore`, `Hamming`, `ConfusionMatrix` metrics (#200)

Added `squared` argument to `MeanSquaredError` for computing `RMSE` (#249)

Added `is_differentiable` property to `ConfusionMatrix`, `F1`, `FBeta`, `Hamming`, `Hinge`, `IOU`, `MatthewsCorrcoef`, `Precision`, `Recall`, `PrecisionRecallCurve`, `ROC`, `StatScores` (#253)

Added `sync` and `sync_context` methods for manually controlling when metric states are synced (#302)
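The effect of a `squared` flag on a mean squared error metric can be shown with a short pure-Python sketch. The function `mean_squared_error` below is a hypothetical list-based stand-in, not the library's class: with `squared=True` it returns the MSE, and with `squared=False` it returns the root of that value (RMSE).

```python
import math


def mean_squared_error(preds, targets, squared=True):
    """Return MSE when squared=True, otherwise its square root (RMSE)."""
    if len(preds) != len(targets) or not preds:
        raise ValueError("preds and targets must be non-empty and equally long")
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
    return mse if squared else math.sqrt(mse)


if __name__ == "__main__":
    preds, targets = [3.0, 3.0], [0.0, 0.0]
    print(mean_squared_error(preds, targets))
    print(mean_squared_error(preds, targets, squared=False))
```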

### [0.4.0] - Changed

Forward cache is reset when `reset` method is called (#260)

Improved per-class metric handling for imbalanced datasets for `precision`, `recall`, `precision_recall`, `fbeta`, `f1`, `accuracy`, and `specificity` (#204)

Decorated `MetricCollection` forward with `torch.jit.unused` (#307)

Renamed `thresholds` argument to binned metrics for manually controlling the thresholds (#322)

### [0.4.0] - Deprecated

### [0.4.0] - Removed

Removed argument `is_multiclass` (#319)

### [0.4.0] - Fixed

## [0.3.2] - 2021-05-10

### [0.3.2] - Added

### [0.3.2] - Changed

### [0.3.2] - Removed

Removed `numpy` as direct dependency (#212)

### [0.3.2] - Fixed

Fixed auc calculation and added tests (#197)

Fixed loading persisted metric states using `load_state_dict()` (#202)

Fixed `PSNR` not working with `DDP` (#214)

Fixed metric calculation with unequal batch sizes (#220)

Fixed metric concatenation for list states for zero-dim input (#229)

Fixed numerical instability in `AUROC` metric for large input (#230)

## [0.3.1] - 2021-04-21

## [0.3.0] - 2021-04-20

### [0.3.0] - Added

Added `BootStrapper` to easily calculate confidence intervals for metrics (#101)

Added Binned metrics (#128)

Added metrics for Information Retrieval (PL^5032):

Added other metrics:

Added `average='micro'` as an option in AUROC for multilabel problems (#110)

Added multilabel support to `ROC` metric (#114)

Added `AverageMeter` for ad-hoc averages of values (#138)

Added `prefix` argument to `MetricCollection` (#70)

Added `__getitem__` as metric arithmetic operation (#142)

Added property `is_differentiable` to metrics and test for differentiability (#154)

Added support for `average`, `ignore_index` and `mdmc_average` in `Accuracy` metric (#166)

Added `postfix` arg to `MetricCollection` (#188)
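Bootstrapping, the technique behind the `BootStrapper` entry in this release, estimates the variability of a metric by recomputing it on resampled copies of the data. The following pure-Python sketch shows the idea under stated assumptions: the helper names `accuracy` and `bootstrap_metric`, the default of 10 resamples and the fixed seed are all illustrative and not taken from the library.

```python
import random
import statistics


def accuracy(preds, targets):
    """Fraction of positions where prediction and target agree."""
    return sum(p == t for p, t in zip(preds, targets)) / len(targets)


def bootstrap_metric(metric_fn, preds, targets, num_bootstraps=10, seed=0):
    """Return (mean, std) of metric_fn over resamples drawn with replacement."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(preds)
    scores = []
    for _ in range(num_bootstraps):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(metric_fn([preds[i] for i in idx], [targets[i] for i in idx]))
    return statistics.mean(scores), statistics.pstdev(scores)


if __name__ == "__main__":
    preds = [1, 0, 1, 1, 0, 1, 0, 0]
    targets = [1, 0, 0, 1, 0, 1, 1, 0]
    mean, std = bootstrap_metric(accuracy, preds, targets, num_bootstraps=100)
    print(f"accuracy = {mean:.3f} +/- {std:.3f}")
```

The spread of the resampled scores gives a rough confidence band around the metric without any distributional assumptions.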

### [0.3.0] - Changed

Changed `ExplainedVariance` from storing all preds/targets to tracking 5 statistics (#68)

Changed behaviour of `confusionmatrix` for multilabel data to better match `multilabel_confusion_matrix` from sklearn (#134)

Updated FBeta arguments (#111)

Changed `reset` method to use `detach.clone()` instead of `deepcopy` when resetting to default (#163)

Metrics passed as dict to `MetricCollection` will now always be in deterministic order (#173)

Allowed `MetricCollection` to take metrics as arguments (#176)

### [0.3.0] - Deprecated

Rename argument `is_multiclass` -> `multiclass` (#162)

### [0.3.0] - Removed

Prune remaining deprecated (#92)

### [0.3.0] - Fixed

## [0.2.0] - 2021-03-12

### [0.2.0] - Changed

### [0.2.0] - Removed

## [0.1.0] - 2021-02-22

`Accuracy` metric now generalizes to Top-k accuracy for (multi-dimensional) multi-class inputs using the `top_k` parameter (PL^4838)

`Accuracy` metric now enables the computation of subset accuracy for multi-label or multi-dimensional multi-class inputs with the `subset_accuracy` parameter (PL^4838)

Added `HammingDistance` metric to compute the hamming distance (loss) (PL^4838)

Added `StatScores` metric to compute the number of true positives, false positives, true negatives and false negatives (PL^4839)

Added `R2Score` metric (PL^5241)

Added `MetricCollection` (PL^4318)

Added `.clone()` method to metrics (PL^4318)

Added `IoU` class interface (PL^4704)

The `Recall` and `Precision` metrics (and their functional counterparts `recall` and `precision`) can now be generalized to Recall@K and Precision@K with the use of `top_k` parameter (PL^4842)

Added compositional metrics (PL^5464)

Added AUC/AUROC class interface (PL^5479)

Added `QuantizationAwareTraining` callback (PL^5706)

Added `ConfusionMatrix` class interface (PL^4348)

Added multiclass AUROC metric (PL^4236)

Added `PrecisionRecallCurve`, `ROC`, `AveragePrecision` class metric (PL^4549)

Classification metrics overhaul (PL^4837)

Added `F1` class metric (PL^4656)

Added metrics aggregation in Horovod and fixed early stopping (PL^3775)

Added `persistent(mode)` method to metrics, to enable and disable metric states being added to `state_dict` (PL^4482)

Added unification of regression metrics (PL^4166)

Added persistent flag to `Metric.add_state` (PL^4195)

Added classification metrics (PL^4043)

Added EMB similarity (PL^3349)

Added SSIM metrics (PL^2671)

Added BLEU metrics (PL^2535)
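Several entries in this release refer to a `top_k` parameter. As a rough illustration of what Top-k accuracy means, here is a minimal pure-Python sketch. The function `top_k_accuracy` and its list-of-lists input format are assumptions for this example and do not mirror the tensor-based library interface.

```python
def top_k_accuracy(scores, targets, top_k=1):
    """Fraction of samples whose target class is among the top_k highest scores."""
    correct = 0
    for row, target in zip(scores, targets):
        # Rank class indices from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if target in ranked[:top_k]:
            correct += 1
    return correct / len(targets)


if __name__ == "__main__":
    scores = [[0.1, 0.6, 0.3], [0.5, 0.2, 0.3], [0.2, 0.3, 0.5]]
    targets = [2, 0, 1]
    print(top_k_accuracy(scores, targets, top_k=1))
    print(top_k_accuracy(scores, targets, top_k=2))
```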