# Theil’s U

## Module Interface

class torchmetrics.nominal.TheilsU(num_classes, nan_strategy='replace', nan_replace_value=0.0, **kwargs)

Compute Theil’s U statistic measuring the association between two categorical (nominal) data series.

$U(X|Y) = \frac{H(X) - H(X|Y)}{H(X)}$

where $H(X)$ is the entropy of variable $X$ and $H(X|Y)$ is the conditional entropy of $X$ given $Y$. It is also known as the uncertainty coefficient. Theil’s U is an asymmetric coefficient, i.e. in general $\text{TheilsU}(\text{preds}, \text{target}) \neq \text{TheilsU}(\text{target}, \text{preds})$, so the order of the inputs matters. The output value lies in $[0, 1]$, where 0 means that $Y$ carries no information about $X$ and 1 means that $Y$ carries complete information about $X$.
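As a rough illustration of the definition above, the following pure-Python sketch estimates both entropies from empirical frequencies. It is not the torchmetrics implementation: the helper names (`entropy`, `conditional_entropy`, `theils_u_reference`) are hypothetical, and returning 0 when the entropy of X is zero is an assumption made for this example.

```python
import math
from collections import Counter


def entropy(xs):
    """Empirical Shannon entropy H(X), in nats."""
    n = len(xs)
    return -sum((c / n) * math.log(c / n) for c in Counter(xs).values())


def conditional_entropy(xs, ys):
    """Empirical conditional entropy H(X|Y) = -sum p(x, y) * log(p(x, y) / p(y))."""
    n = len(xs)
    joint_counts = Counter(zip(xs, ys))
    y_counts = Counter(ys)
    return -sum(
        (c / n) * math.log(c / y_counts[y]) for (_, y), c in joint_counts.items()
    )


def theils_u_reference(xs, ys):
    """U(X|Y) = (H(X) - H(X|Y)) / H(X)."""
    h_x = entropy(xs)
    if h_x == 0.0:
        # Assumption for this sketch: a constant X has nothing to explain.
        return 0.0
    return (h_x - conditional_entropy(xs, ys)) / h_x


x = [0, 0, 1, 1, 2, 2]
y = [0, 0, 1, 1, 1, 1]  # y merges categories 1 and 2 of x

u_x_given_y = theils_u_reference(x, y)  # y only partly determines x
u_y_given_x = theils_u_reference(y, x)  # x fully determines y
print(u_x_given_y, u_y_given_x)
```

In this toy data, knowing x pins down y exactly, so the second value is 1, while knowing y leaves x ambiguous between two categories, so the first value is roughly 0.58. Swapping the arguments therefore changes the result.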

As input to forward and update the metric accepts the following input:

• preds (Tensor): Either 1D or 2D tensor of categorical (nominal) data from the first data series (called X in the above definition) with shape (batch_size,) or (batch_size, num_classes), respectively.

• target (Tensor): Either 1D or 2D tensor of categorical (nominal) data from the second data series (called Y in the above definition) with shape (batch_size,) or (batch_size, num_classes), respectively.

As output of forward and compute the metric returns the following output:

• theils_u (Tensor): Scalar tensor containing the Theil’s U statistic.

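Parameters:
• num_classes (int) – Number of classes in the categorical data

• nan_strategy (Literal['replace', 'drop']) – Indication of whether to replace or drop NaN values

• nan_replace_value (Optional[float]) – Value to replace NaNs when nan_strategy = 'replace'

• kwargs (Any) – Additional keyword arguments passed to the base metric class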

Example:

>>> import torch
>>> from torchmetrics.nominal import TheilsU
>>> _ = torch.manual_seed(42)
>>> preds = torch.randint(10, (10,))
>>> target = torch.randint(10, (10,))
>>> metric = TheilsU(num_classes=10)
>>> metric(preds, target)
tensor(0.8530)

plot(val=None, ax=None)

Plot a single or multiple values from the metric.

Returns:

Figure and Axes object

Raises:

ModuleNotFoundError – If matplotlib is not installed

>>> # Example plotting a single value
>>> import torch
>>> from torchmetrics.nominal import TheilsU
>>> metric = TheilsU(num_classes=10)
>>> metric.update(torch.randint(10, (10,)), torch.randint(10, (10,)))
>>> fig_, ax_ = metric.plot()

>>> # Example plotting multiple values
>>> import torch
>>> from torchmetrics.nominal import TheilsU
>>> metric = TheilsU(num_classes=10)
>>> values = []
>>> for _ in range(10):
...     values.append(metric(torch.randint(10, (10,)), torch.randint(10, (10,))))
>>> fig_, ax_ = metric.plot(values)


## Functional Interface

torchmetrics.functional.nominal.theils_u(preds, target, nan_strategy='replace', nan_replace_value=0.0)

Compute Theil’s U (the uncertainty coefficient), a statistic measuring the association between two nominal data series.

$U(X|Y) = \frac{H(X) - H(X|Y)}{H(X)}$

where $H(X)$ is the entropy of variable $X$ and $H(X|Y)$ is the conditional entropy of $X$ given $Y$.

Theil’s U is an asymmetric coefficient, i.e. in general $\text{TheilsU}(\text{preds}, \text{target}) \neq \text{TheilsU}(\text{target}, \text{preds})$, so the order of the inputs matters.

The output value lies in $[0, 1]$, where 0 means that $Y$ carries no information about $X$ and 1 means that $Y$ carries complete information about $X$.

Parameters:
• preds (Tensor) – 1D or 2D tensor of categorical (nominal) data from the first data series, with shape (batch_size,) or (batch_size, num_classes), respectively

• target (Tensor) – 1D or 2D tensor of categorical (nominal) data from the second data series, with shape (batch_size,) or (batch_size, num_classes), respectively

• nan_strategy (Literal['replace', 'drop']) – Indication of whether to replace or drop NaN values

• nan_replace_value (Optional[float]) – Value to replace NaNs when nan_strategy = 'replace'
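To make the two nan_strategy options concrete, here is a minimal pure-Python sketch of what 'replace' and 'drop' could mean for a pair of series. It is an illustration under stated assumptions, not the library's code: the helper name `handle_nans` is hypothetical, and treating 'drop' as removing every pair in which either value is NaN is an assumption.

```python
import math


def handle_nans(preds, target, nan_strategy="replace", nan_replace_value=0.0):
    """Hypothetical helper mirroring the two documented strategies on plain lists."""
    if nan_strategy == "replace":
        def fix(v):
            return nan_replace_value if math.isnan(v) else v
        return [fix(p) for p in preds], [fix(t) for t in target]
    if nan_strategy == "drop":
        # Assumption: a pair is dropped when either of its values is NaN,
        # so that both series keep the same length.
        kept = [
            (p, t) for p, t in zip(preds, target)
            if not (math.isnan(p) or math.isnan(t))
        ]
        return [p for p, _ in kept], [t for _, t in kept]
    raise ValueError(f"Unknown nan_strategy: {nan_strategy!r}")


preds = [1.0, float("nan"), 2.0, 0.0]
target = [1.0, 2.0, float("nan"), 0.0]

replaced = handle_nans(preds, target, "replace", 0.0)
dropped = handle_nans(preds, target, "drop")
print(replaced)  # ([1.0, 0.0, 2.0, 0.0], [1.0, 2.0, 0.0, 0.0])
print(dropped)   # ([1.0, 0.0], [1.0, 0.0])
```

Note that with 'replace' the missing entries end up in the same category as any genuine value equal to nan_replace_value, so the replacement value may be worth choosing outside the range of real categories.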

Return type:

Tensor

Returns:

Tensor containing Theil’s U statistic

Example

>>> import torch
>>> from torchmetrics.functional.nominal import theils_u
>>> _ = torch.manual_seed(42)
>>> preds = torch.randint(10, (10,))
>>> target = torch.randint(10, (10,))
>>> theils_u(preds, target)
tensor(0.8530)


### theils_u_matrix

torchmetrics.functional.nominal.theils_u_matrix(matrix, nan_strategy='replace', nan_replace_value=0.0)

Compute Theil’s U statistic between each pair of variables in a dataset.

This is a convenient way to analyse the association between the categorical variables in your dataset.

Parameters:
• matrix (Tensor) – A 2D tensor of categorical (nominal) data, where rows represent data points and columns represent categorical (nominal) features

• nan_strategy (Literal['replace', 'drop']) – Indication of whether to replace or drop NaN values

• nan_replace_value (Optional[float]) – Value to replace NaNs when nan_strategy = 'replace'

Return type:

Tensor

Returns:

Tensor of pairwise Theil’s U statistics between the categorical variables (columns) of the dataset
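The idea of a pairwise matrix can be sketched in pure Python as below. This is a simplified illustration rather than the library's implementation: the helper names are hypothetical, and which index plays the role of X and which the role of Y in entry [i][j] is an assumption of this sketch.

```python
import math
from collections import Counter


def theils_u_pair(xs, ys):
    """U(X|Y) from empirical frequencies; returns 0.0 when H(X) is zero (assumption)."""
    n = len(xs)
    h_x = -sum((c / n) * math.log(c / n) for c in Counter(xs).values())
    if h_x == 0.0:
        return 0.0
    y_counts = Counter(ys)
    h_x_given_y = -sum(
        (c / n) * math.log(c / y_counts[y])
        for (_, y), c in Counter(zip(xs, ys)).items()
    )
    return (h_x - h_x_given_y) / h_x


def pairwise_theils_u(rows):
    """Entry [i][j] holds U(column_i | column_j); this orientation is an assumption."""
    columns = list(zip(*rows))
    k = len(columns)
    return [[theils_u_pair(columns[i], columns[j]) for j in range(k)] for i in range(k)]


# Six data points (rows) with three categorical features (columns)
rows = [
    (0, 0, 0),
    (0, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
    (2, 1, 0),
    (2, 1, 1),
]
result = pairwise_theils_u(rows)
for row in result:
    print([round(v, 4) for v in row])
```

Each variable fully explains itself, so the diagonal is 1. The off-diagonal entries are not symmetric in general, which is also visible in the example output below, where mirrored entries such as 0.0142 and 0.0143 differ slightly.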

Example

>>> import torch
>>> from torchmetrics.functional.nominal import theils_u_matrix
>>> _ = torch.manual_seed(42)
>>> matrix = torch.randint(0, 4, (200, 5))
>>> theils_u_matrix(matrix)
tensor([[1.0000, 0.0202, 0.0142, 0.0196, 0.0353],
        [0.0202, 1.0000, 0.0070, 0.0136, 0.0065],
        [0.0143, 0.0070, 1.0000, 0.0125, 0.0206],
        [0.0198, 0.0137, 0.0125, 1.0000, 0.0312],
        [0.0352, 0.0065, 0.0204, 0.0308, 1.0000]])