Retrieval Precision Recall Curve

Module Interface

class torchmetrics.retrieval.RetrievalPrecisionRecallCurve(max_k=None, adaptive_k=False, empty_target_action='neg', ignore_index=None, aggregation='mean', **kwargs)

Compute precision-recall pairs for different k (from 1 to max_k).

In a ranked retrieval context, appropriate sets of retrieved documents are naturally given by the top k retrieved documents. Recall is the fraction of relevant documents retrieved among all the relevant documents. Precision is the fraction of relevant documents among all the retrieved documents. For each such set, precision and recall values can be plotted to give a recall-precision curve.
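The definitions above can be illustrated with a minimal pure-Python sketch for a single query. It is not the library's implementation; the helper name precision_recall_at_k and its arguments are hypothetical and chosen only for this illustration.

```python
def precision_recall_at_k(scores, relevant, max_k=None):
    """Return (precisions, recalls, ks) for k = 1..max_k over one ranked list."""
    # Rank documents by descending score; ties keep their input order.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    ranked = [bool(relevant[i]) for i in order]
    total_relevant = sum(ranked)
    max_k = len(ranked) if max_k is None else max_k

    precisions, recalls, ks = [], [], []
    hits = 0
    for k in range(1, max_k + 1):
        # Positions past the end of the list contribute no hits.
        if k <= len(ranked) and ranked[k - 1]:
            hits += 1
        precisions.append(hits / k)  # relevant among the k retrieved
        recalls.append(hits / total_relevant if total_relevant else 0.0)
        ks.append(k)
    return precisions, recalls, ks


# Hypothetical single query with four documents, two of them relevant.
precisions, recalls, ks = precision_recall_at_k(
    scores=[0.4, 0.01, 0.5, 0.6],
    relevant=[True, False, False, True],
)
```

Each (recall, precision) pair at the same position corresponds to one point on the curve for that value of k.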

As input to forward and update the metric accepts the following input:

preds (Tensor): A float tensor of shape (N, ...)
target (Tensor): A long or bool tensor of shape (N, ...)
indexes (Tensor): A long tensor of shape (N, ...) that indicates which query each prediction belongs to

As output to forward and compute the metric returns the following output:

precisions (Tensor): A tensor with the fraction of relevant documents among all the retrieved documents.
recalls (Tensor): A tensor with the fraction of relevant documents retrieved among all the relevant documents
top_k (Tensor): A tensor with k from 1 to max_k

indexes, preds and target must all have the same shape. They are flattened at the start, so that, for example, a tensor of shape (N, M) is treated as (N * M, ). Predictions are first grouped by indexes; the metric is then computed for each query and aggregated over the queries (the mean by default).
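The grouping and averaging step can be sketched in plain Python as follows. This is an illustrative approximation under the description above, not the library's code; curve_for_query and mean_curve are hypothetical helper names. The inputs are the same as in the class example in this section.

```python
from collections import defaultdict


def curve_for_query(scores, relevant, max_k):
    """Precision and recall at k = 1..max_k for one query's documents."""
    pairs = sorted(zip(scores, relevant), key=lambda p: p[0], reverse=True)
    ranked = [rel for _, rel in pairs]
    total = sum(ranked)
    precisions, recalls, hits = [], [], 0
    for k in range(1, max_k + 1):
        hits += int(k <= len(ranked) and ranked[k - 1])
        precisions.append(hits / k)
        recalls.append(hits / total if total else 0.0)
    return precisions, recalls


def mean_curve(preds, target, indexes, max_k):
    # Group the flattened (pred, target) pairs by their query id.
    groups = defaultdict(lambda: ([], []))
    for p, t, q in zip(preds, target, indexes):
        groups[q][0].append(p)
        groups[q][1].append(t)
    curves = [curve_for_query(s, r, max_k) for s, r in groups.values()]
    # Average position-wise over the queries.
    n = len(curves)
    mean_p = [sum(c[0][i] for c in curves) / n for i in range(max_k)]
    mean_r = [sum(c[1][i] for c in curves) / n for i in range(max_k)]
    return mean_p, mean_r


mean_p, mean_r = mean_curve(
    preds=[0.4, 0.01, 0.5, 0.6, 0.2, 0.3, 0.5],
    target=[True, False, False, True, True, False, True],
    indexes=[0, 0, 0, 0, 1, 1, 1],
    max_k=4,
)
```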

Parameters:

max_k (Optional[int]) – Calculate recall and precision for all possible top k from 1 to max_k (default: None, which considers all possible top k)
adaptive_k (bool) – adjust k to min(k, number of documents) for each query
empty_target_action (str) –
Specify what to do with queries that do not have at least one positive target. Choose from:
- 'neg': those queries count as 0.0 (default)
- 'pos': those queries count as 1.0
- 'skip': skip those queries; if all queries are skipped, 0.0 is returned
- 'error': raise a ValueError
ignore_index (Optional[int]) – Ignore predictions where the target is equal to this number.
aggregation (Union[Literal['mean', 'median', 'min', 'max'], Callable]) –
Specify how to aggregate over indexes. Can be either a custom callable that takes a single tensor and returns a scalar value, or one of the following strings:
- 'mean': average value is returned
- 'median': median value is returned
- 'max': max value is returned
- 'min': min value is returned
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.
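How empty_target_action and aggregation interact can be sketched for a single value of k. The following is a simplified pure-Python illustration under the parameter descriptions above, not the library's implementation; aggregate_queries, per_query_values and has_positive are hypothetical names, and the handling of medians over an even number of queries may differ from the library.

```python
from statistics import mean, median

AGGREGATORS = {"mean": mean, "median": median, "min": min, "max": max}


def aggregate_queries(per_query_values, has_positive,
                      empty_target_action="neg", aggregation="mean"):
    """Combine one value per query (e.g. precision at a fixed k) into a scalar."""
    kept = []
    for value, positive in zip(per_query_values, has_positive):
        if positive:
            kept.append(value)
        elif empty_target_action == "neg":
            kept.append(0.0)  # query without positives counts as 0.0
        elif empty_target_action == "pos":
            kept.append(1.0)  # query without positives counts as 1.0
        elif empty_target_action == "skip":
            continue          # query is left out of the aggregate
        elif empty_target_action == "error":
            raise ValueError("a query has no positive target")
        else:
            raise ValueError(f"unknown empty_target_action: {empty_target_action!r}")
    if not kept:
        return 0.0  # every query was skipped
    reduce_fn = AGGREGATORS[aggregation] if isinstance(aggregation, str) else aggregation
    return reduce_fn(kept)


# Hypothetical per-query values; the third query has no positive target.
values = [1.0, 0.5, 0.25]
flags = [True, True, False]
neg_mean = aggregate_queries(values, flags)
skip_mean = aggregate_queries(values, flags, empty_target_action="skip")
pos_max = aggregate_queries(values, flags, empty_target_action="pos", aggregation="max")
```

A custom callable can be passed as aggregation in the same way, provided it reduces a sequence of per-query values to a single number.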

Raises:

ValueError – If empty_target_action is not one of error, skip, neg or pos.
ValueError – If ignore_index is neither None nor an integer.
ValueError – If max_k is neither None nor an integer larger than 0.

Example

>>> from torch import tensor
>>> from torchmetrics.retrieval import RetrievalPrecisionRecallCurve
>>> indexes = tensor([0, 0, 0, 0, 1, 1, 1])
>>> preds = tensor([0.4, 0.01, 0.5, 0.6, 0.2, 0.3, 0.5])
>>> target = tensor([True, False, False, True, True, False, True])
>>> r = RetrievalPrecisionRecallCurve(max_k=4)
>>> precisions, recalls, top_k = r(preds, target, indexes=indexes)
>>> precisions
tensor([1.0000, 0.5000, 0.6667, 0.5000])
>>> recalls
tensor([0.5000, 0.5000, 1.0000, 1.0000])
>>> top_k
tensor([1, 2, 3, 4])

plot(curve=None, ax=None)

Plot a single or multiple values from the metric.

Parameters:

curve (Optional[tuple[Tensor, Tensor, Tensor]]) – the output of either metric.compute or metric.forward. If no value is provided, will automatically call metric.compute and plot that result.
ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot is added to that axis.

Return type:

tuple[Figure, Union[Axes, ndarray]]

Returns:

Figure and Axes object

Raises:

ModuleNotFoundError – If matplotlib is not installed

>>> import torch
>>> from torchmetrics.retrieval import RetrievalPrecisionRecallCurve
>>> # Example plotting a single value
>>> metric = RetrievalPrecisionRecallCurve()
>>> metric.update(torch.rand(10,), torch.randint(2, (10,)), indexes=torch.randint(2,(10,)))
>>> fig_, ax_ = metric.plot()


Functional Interface

torchmetrics.functional.retrieval.retrieval_precision_recall_curve(preds, target, max_k=None, adaptive_k=False)

Compute precision-recall pairs for different k (from 1 to max_k).

In a ranked retrieval context, appropriate sets of retrieved documents are naturally given by the top k retrieved documents.

Recall is the fraction of relevant documents retrieved among all the relevant documents. Precision is the fraction of relevant documents among all the retrieved documents.

For each such set, precision and recall values can be plotted to give a recall-precision curve.

preds and target should have the same shape and live on the same device. If no target is True, 0 is returned. target must be a bool or integer tensor and preds must be a float tensor; otherwise an error is raised.

Parameters:

preds (Tensor) – estimated probabilities of each document to be relevant.
target (Tensor) – ground truth about each document being relevant or not.
max_k (Optional[int]) – Calculate recall and precision for all possible top k from 1 to max_k (default: None, which considers all possible top k)
adaptive_k (bool) – adjust max_k to min(max_k, number of documents) for each query
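The effect of adaptive_k is easiest to see when max_k exceeds the number of documents. The following pure-Python sketch illustrates the idea under the description above; it is not the library's implementation, and precision_at_k is a hypothetical helper that expects relevance flags already sorted by descending score.

```python
def precision_at_k(ranked_relevance, max_k, adaptive_k=False):
    """Precision for k = 1..max_k over a list already sorted by descending score."""
    n_docs = len(ranked_relevance)
    precisions, hits = [], 0
    for k in range(1, max_k + 1):
        if k <= n_docs and ranked_relevance[k - 1]:
            hits += 1
        # With adaptive_k the denominator never exceeds the number of documents.
        denom = min(k, n_docs) if adaptive_k else k
        precisions.append(hits / denom)
    return precisions


ranked = [True, False, True]  # three documents, two of them relevant
fixed = precision_at_k(ranked, max_k=5)
adaptive = precision_at_k(ranked, max_k=5, adaptive_k=True)
```

In this sketch, precision keeps falling for k beyond the three available documents when adaptive_k is False, whereas with adaptive_k=True it stays at its value for k = 3.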

Return type:

tuple[Tensor, Tensor, Tensor]

Returns:

Tensor with the precision values for each k (at top_k) from 1 to max_k
Tensor with the recall values for each k (at top_k) from 1 to max_k
Tensor with all possible k

Raises:

ValueError – If max_k is neither None nor an integer larger than 0.
ValueError – If adaptive_k is not boolean.

Example

>>> from torch import tensor
>>> from torchmetrics.functional import retrieval_precision_recall_curve
>>> preds = tensor([0.2, 0.3, 0.5])
>>> target = tensor([True, False, True])
>>> precisions, recalls, top_k = retrieval_precision_recall_curve(preds, target, max_k=2)
>>> precisions
tensor([1.0000, 0.5000])
>>> recalls
tensor([0.5000, 0.5000])
>>> top_k
tensor([1, 2])