Pearson’s Contingency Coefficient
Module Interface
- class torchmetrics.nominal.PearsonsContingencyCoefficient(num_classes, nan_strategy='replace', nan_replace_value=0.0, **kwargs)
Compute Pearson’s Contingency Coefficient statistic.
This metric measures the association between two categorical (nominal) data series.
\[Pearson = \sqrt{\frac{\chi^2 / n}{1 + \chi^2 / n}}\]
where
\[\chi^2 = \sum_{i,j} \frac{\left(n_{ij} - \frac{n_{i.} n_{.j}}{n}\right)^2}{\frac{n_{i.} n_{.j}}{n}}\]
Here \(n_{ij}\) denotes the number of times the pair of values \((A_i, B_j)\) is observed, where \(A_i\) and \(B_j\) are values taken by preds and target, respectively. \(n_{i.}\) and \(n_{.j}\) are the corresponding row and column totals, and \(n\) is the total number of observations.
Pearson’s Contingency Coefficient is a symmetric coefficient, i.e. \(Pearson(preds, target) = Pearson(target, preds)\), so the order of the input arguments does not matter. The output value lies in [0, 1], with values closer to 1 indicating stronger association.
As input to forward and update the metric accepts the following input:
- preds (Tensor): Either a 1D or 2D tensor of categorical (nominal) data from the first data series, with shape (batch_size,) or (batch_size, num_classes), respectively.
- target (Tensor): Either a 1D or 2D tensor of categorical (nominal) data from the second data series, with shape (batch_size,) or (batch_size, num_classes), respectively.
As output of forward and compute the metric returns the following output:
- pearsons_cc (Tensor): Scalar tensor containing the Pearson’s Contingency Coefficient statistic.
- Parameters:
num_classes (int) – Integer specifying the number of classes
nan_strategy (Literal['replace', 'drop']) – Indication of whether to replace or drop NaN values
nan_replace_value (Optional[float]) – Value to replace NaNs with when nan_strategy = 'replace'
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.
- Raises:
ValueError – If nan_strategy is not one of ‘replace’ and ‘drop’
ValueError – If nan_strategy is equal to ‘replace’ and nan_replace_value is not an int or float
Example:
>>> from torch import randint, randn
>>> from torchmetrics.nominal import PearsonsContingencyCoefficient
>>> preds = randint(0, 4, (100,))
>>> target = (preds + randn(100)).round().clamp(0, 4)
>>> pearsons_contingency_coefficient = PearsonsContingencyCoefficient(num_classes=5)
>>> pearsons_contingency_coefficient(preds, target)
tensor(0.6948)
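The two formulas above can be traced by hand on a small input. The following is a minimal sketch using only the Python standard library; `pearsons_cc_sketch` is a hypothetical helper written for illustration and is not the library's implementation, which may differ in details such as NaN handling and tensor support.

```python
import math
from collections import Counter


def pearsons_cc_sketch(preds, target):
    """Pearson's contingency coefficient for two equal-length label sequences.

    Illustrative helper (not part of torchmetrics): a direct transcription
    of the chi-squared and Pearson formulas.
    """
    if len(preds) != len(target):
        raise ValueError("preds and target must have the same length")
    if len(preds) == 0:
        raise ValueError("at least one observation is required")

    n = len(preds)
    joint = Counter(zip(preds, target))  # n_ij: count of each observed pair
    row = Counter(preds)                 # n_i.: marginal counts for preds
    col = Counter(target)                # n_.j: marginal counts for target

    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n  # n_i. * n_.j / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected

    phi2 = chi2 / n
    return math.sqrt(phi2 / (1.0 + phi2))


# Identical two-category series: chi2 = 4, n = 4, so the result is sqrt(1/2).
perfect = pearsons_cc_sketch([0, 0, 1, 1], [0, 0, 1, 1])
# Every pair occurs exactly as often as expected: chi2 = 0.
unrelated = pearsons_cc_sketch([0, 0, 1, 1], [0, 1, 0, 1])
```

Only categories that actually occur enter the sums, so the expected count in the denominator is never zero. Note that for two identical two-category series the formula gives \(\sqrt{1/2}\) rather than 1: the largest attainable value depends on the number of categories.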
- plot(val=None, ax=None)
Plot a single or multiple values from the metric.
- Parameters:
val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute or a list of these results. If no value is provided, will automatically call metric.compute and plot that result.
ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot will be added to that axis.
- Return type:
- Returns:
Figure and Axes object
- Raises:
ModuleNotFoundError – If matplotlib is not installed
>>> # Example plotting a single value
>>> import torch
>>> from torchmetrics.nominal import PearsonsContingencyCoefficient
>>> metric = PearsonsContingencyCoefficient(num_classes=5)
>>> metric.update(torch.randint(0, 4, (100,)), torch.randint(0, 4, (100,)))
>>> fig_, ax_ = metric.plot()

>>> # Example plotting multiple values
>>> import torch
>>> from torchmetrics.nominal import PearsonsContingencyCoefficient
>>> metric = PearsonsContingencyCoefficient(num_classes=5)
>>> values = []
>>> for _ in range(10):
...     values.append(metric(torch.randint(0, 4, (100,)), torch.randint(0, 4, (100,))))
>>> fig_, ax_ = metric.plot(values)
Functional Interface
- torchmetrics.functional.nominal.pearsons_contingency_coefficient(preds, target, nan_strategy='replace', nan_replace_value=0.0)
Compute Pearson’s Contingency Coefficient for measuring the association between two categorical data series.
\[Pearson = \sqrt{\frac{\chi^2 / n}{1 + \chi^2 / n}}\]
where
\[\chi^2 = \sum_{i,j} \frac{\left(n_{ij} - \frac{n_{i.} n_{.j}}{n}\right)^2}{\frac{n_{i.} n_{.j}}{n}}\]
Here \(n_{ij}\) denotes the number of times the pair of values \((A_i, B_j)\) is observed, where \(A_i\) and \(B_j\) are values taken by preds and target, respectively. \(n_{i.}\) and \(n_{.j}\) are the corresponding row and column totals, and \(n\) is the total number of observations.
Pearson’s Contingency Coefficient is a symmetric coefficient, i.e. \(Pearson(preds, target) = Pearson(target, preds)\).
The output value lies in [0, 1], with values closer to 1 indicating stronger association.
- Parameters:
preds (Tensor) – 1D or 2D tensor of categorical (nominal) data:
1D shape: (batch_size,)
2D shape: (batch_size, num_classes)
target (Tensor) – 1D or 2D tensor of categorical (nominal) data:
1D shape: (batch_size,)
2D shape: (batch_size, num_classes)
nan_strategy (Literal['replace', 'drop']) – Indication of whether to replace or drop NaN values
nan_replace_value (Optional[float]) – Value to replace NaNs with when nan_strategy = 'replace'
- Return type:
- Returns:
Pearson’s Contingency Coefficient
Example
>>> from torch import randint, randn, round
>>> from torchmetrics.functional.nominal import pearsons_contingency_coefficient
>>> preds = randint(0, 4, (100,))
>>> target = round(preds + randn(100)).clamp(0, 4)
>>> pearsons_contingency_coefficient(preds, target)
tensor(0.6948)
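The nan_strategy and nan_replace_value parameters can be pictured with plain Python lists. This is only a sketch of the behavior described in the parameter list: `handle_nan_sketch` is a hypothetical helper, and the pairwise reading of 'drop' (an observation is removed from both series when either value is NaN) is an assumption, not something the text above states.

```python
import math


def handle_nan_sketch(preds, target, nan_strategy="replace", nan_replace_value=0.0):
    """Illustrative NaN handling for two aligned lists (not the library's code)."""
    if nan_strategy not in ("replace", "drop"):
        raise ValueError("nan_strategy must be 'replace' or 'drop'")

    def is_nan(x):
        return isinstance(x, float) and math.isnan(x)

    if nan_strategy == "replace":
        if not isinstance(nan_replace_value, (int, float)):
            raise ValueError("nan_replace_value must be an int or float")
        # Substitute the replacement value wherever a NaN appears.
        new_preds = [nan_replace_value if is_nan(p) else p for p in preds]
        new_target = [nan_replace_value if is_nan(t) else t for t in target]
        return new_preds, new_target

    # 'drop' (assumed semantics): keep only positions where neither value is NaN,
    # so the two series stay aligned.
    kept = [(p, t) for p, t in zip(preds, target) if not (is_nan(p) or is_nan(t))]
    return [p for p, _ in kept], [t for _, t in kept]


nan = float("nan")
replaced = handle_nan_sketch([1.0, nan, 2.0], [0.0, 1.0, nan], "replace", 0.0)
dropped = handle_nan_sketch([1.0, nan, 2.0], [0.0, 1.0, nan], "drop")
```

With 'replace', NaN becomes one more category value, so the choice of nan_replace_value can affect the statistic if that value already occurs in the data.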
pearsons_contingency_coefficient_matrix
- torchmetrics.functional.nominal.pearsons_contingency_coefficient_matrix(matrix, nan_strategy='replace', nan_replace_value=0.0)
Compute Pearson’s Contingency Coefficient statistic between a set of multiple variables.
This can serve as a convenient tool to compute Pearson’s Contingency Coefficient for analyses of correlation between categorical variables in your dataset.
- Parameters:
matrix (Tensor) – A tensor of categorical (nominal) data, where:
rows represent a number of data points
columns represent a number of categorical (nominal) features
nan_strategy (Literal['replace', 'drop']) – Indication of whether to replace or drop NaN values
nan_replace_value (Optional[float]) – Value to replace NaNs with when nan_strategy = 'replace'
- Return type:
- Returns:
Pearson’s Contingency Coefficient statistic for a dataset of categorical variables
Example
>>> from torch import randint
>>> from torchmetrics.functional.nominal import pearsons_contingency_coefficient_matrix
>>> matrix = randint(0, 4, (200, 5))
>>> pearsons_contingency_coefficient_matrix(matrix)
tensor([[1.0000, 0.2326, 0.1959, 0.2262, 0.2989],
        [0.2326, 1.0000, 0.1386, 0.1895, 0.1329],
        [0.1959, 0.1386, 1.0000, 0.1840, 0.2335],
        [0.2262, 0.1895, 0.1840, 1.0000, 0.2737],
        [0.2989, 0.1329, 0.2335, 0.2737, 1.0000]])
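The matrix variant can be read as the pairwise coefficient applied to every pair of columns. The sketch below illustrates that reading with the standard library only; `pairwise_cc` and `pearsons_cc_matrix_sketch` are hypothetical names, and setting the diagonal to 1.0 is an assumption made to mirror the example output above rather than a consequence of the formula.

```python
import math
from collections import Counter
from itertools import combinations


def pairwise_cc(x, y):
    """Pearson's contingency coefficient for two equal-length label sequences."""
    n = len(x)
    joint, row, col = Counter(zip(x, y)), Counter(x), Counter(y)
    chi2 = 0.0
    for a in row:
        for b in col:
            expected = row[a] * col[b] / n
            chi2 += (joint[(a, b)] - expected) ** 2 / expected
    phi2 = chi2 / n
    return math.sqrt(phi2 / (1.0 + phi2))


def pearsons_cc_matrix_sketch(rows):
    """rows: list of data points, each a tuple with one label per feature."""
    columns = list(zip(*rows))               # one sequence per feature
    k = len(columns)
    out = [[1.0] * k for _ in range(k)]      # diagonal fixed at 1.0 (assumed convention)
    for i, j in combinations(range(k), 2):   # each unordered pair of features once
        out[i][j] = out[j][i] = pairwise_cc(columns[i], columns[j])
    return out


# Features 0 and 1 are identical; feature 2 is unrelated to both.
data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)]
result = pearsons_cc_matrix_sketch(data)
```

Because the coefficient is symmetric, each pair is computed once and mirrored across the diagonal, which is why the result is a symmetric matrix.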