Lip Vertex Error

Module Interface

class torchmetrics.multimodal.lve.LipVertexError(mouth_map, validate_args=True, **kwargs)

Implements Lip Vertex Error (LVE) metric for 3D talking head evaluation.

The Lip Vertex Error (LVE) metric evaluates the quality of lip synchronization in 3D facial animations by measuring, for each frame, the maximum squared Euclidean distance (squared L2 error) between corresponding lip vertices of the generated and ground truth meshes. The metric is defined as:

\[\text{LVE} = \frac{1}{N} \sum_{i=1}^{N} \max_{v \in \text{lip}} \|x_{i,v} - \hat{x}_{i,v}\|_2^2\]

where \(N\) is the number of frames, \(x_{i,v}\) represents the 3D coordinates of vertex \(v\) in the lip region of the ground truth frame \(i\), and \(\hat{x}_{i,v}\) represents the corresponding vertex in the predicted frame. The metric computes the maximum squared L2 distance between corresponding lip vertices for each frame and averages across all frames. A lower LVE value indicates better lip synchronization quality.
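The formula above can be traced by hand on a tiny input. The following is a minimal sketch in plain PyTorch, assuming both tensors have the same number of frames; `lve_reference` is a hypothetical helper written for illustration and is not the library's implementation.

```python
import torch


def lve_reference(vertices_pred, vertices_gt, mouth_map):
    """Mean over frames of the maximum squared L2 distance among lip vertices.

    Hypothetical illustration of the formula; assumes equal frame counts.
    """
    # Keep only the lip vertices: shape (T, len(mouth_map), 3)
    diff = vertices_gt[:, mouth_map, :] - vertices_pred[:, mouth_map, :]
    # Squared L2 distance per lip vertex: shape (T, len(mouth_map))
    sq_dist = diff.pow(2).sum(dim=-1)
    # Worst lip vertex in each frame: shape (T,)
    per_frame_max = sq_dist.max(dim=1).values
    # Average over frames
    return per_frame_max.mean()


# Two frames, three vertices; only vertices 0 and 1 count as lip vertices.
gt = torch.zeros(2, 3, 3)
pred = torch.zeros(2, 3, 3)
pred[0, 0] = torch.tensor([1.0, 0.0, 0.0])   # squared distance 1
pred[0, 1] = torch.tensor([0.0, 2.0, 0.0])   # squared distance 4, frame 0 maximum
pred[1, 1] = torch.tensor([0.0, 0.0, 3.0])   # squared distance 9, frame 1 maximum
pred[1, 2] = torch.tensor([10.0, 0.0, 0.0])  # not a lip vertex, so ignored

score = lve_reference(pred, gt, mouth_map=[0, 1])
print(score)  # tensor(6.5000), i.e. (4 + 9) / 2
```

Note that the large error on vertex 2 does not affect the result, because only indices listed in `mouth_map` enter the maximum.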

As input to forward and update, the metric accepts the following inputs:

vertices_pred (Tensor): Predicted vertices tensor of shape (T, V, 3), where T is the number of frames, V is the number of vertices, and 3 represents the XYZ coordinates
vertices_gt (Tensor): Ground truth vertices tensor of shape (T’, V, 3), where T’ can be different from T

As output of forward and compute, the metric returns the following output:

lve_score (Tensor): A scalar tensor containing the mean Lip Vertex Error value across all frames.

Parameters:

mouth_map (List[int]) – List of vertex indices corresponding to the mouth region
validate_args (bool) – Whether input arguments and tensors should be validated for correctness. Set to False for faster computation.
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.

Raises:

ValueError – If the number of dimensions of vertices_pred or vertices_gt is not 3; if the vertex dimension (V) or the coordinate dimension (3) of the two tensors does not match; or if mouth_map is empty or contains invalid indices.
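The conditions listed above can be read as a sequence of shape and index checks. The sketch below is one way to express them; `check_lve_inputs` and its error messages are hypothetical, and treating negative indices as invalid is an assumption rather than documented behavior.

```python
import torch


def check_lve_inputs(vertices_pred, vertices_gt, mouth_map):
    """Hypothetical helper mirroring the documented ValueError conditions."""
    # Both inputs must be 3-dimensional: (frames, vertices, coordinates)
    if vertices_pred.ndim != 3 or vertices_gt.ndim != 3:
        raise ValueError("Expected 3-dimensional tensors of shape (T, V, 3).")
    # Frame counts may differ, but V and the coordinate dimension must agree
    if vertices_pred.shape[1:] != vertices_gt.shape[1:]:
        raise ValueError("Vertex (V) and coordinate dimensions must match.")
    if len(mouth_map) == 0:
        raise ValueError("mouth_map must not be empty.")
    # Assumption: valid indices lie in [0, V)
    num_vertices = vertices_pred.shape[1]
    if min(mouth_map) < 0 or max(mouth_map) >= num_vertices:
        raise ValueError("mouth_map contains indices outside [0, V).")


pred = torch.zeros(10, 100, 3)
gt = torch.zeros(12, 100, 3)             # a different frame count is accepted
check_lve_inputs(pred, gt, [0, 1, 2])    # passes silently

errors = []
for bad_map in ([], [100]):              # empty map, then an out-of-range index
    try:
        check_lve_inputs(pred, gt, bad_map)
    except ValueError as err:
        errors.append(str(err))
print(len(errors))  # 2
```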

Example

>>> import torch
>>> from torchmetrics.functional.multimodal import lip_vertex_error
>>> vertices_pred = torch.randn(10, 100, 3, generator=torch.manual_seed(42))
>>> vertices_gt = torch.randn(10, 100, 3, generator=torch.manual_seed(43))
>>> mouth_map = [0, 1, 2, 3, 4]
>>> lip_vertex_error(vertices_pred, vertices_gt, mouth_map)
tensor(12.7688)

compute()

Compute the Lip Vertex Error over all accumulated states.

Returns:

A scalar tensor with the mean LVE value

Return type:

torch.Tensor
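To illustrate what computing "over all accumulated states" can mean, here is a minimal sketch of an accumulator that keeps a running sum of per-frame maxima and a frame count. `RunningLVE` is a hypothetical class for illustration only; the library's internal state may be organized differently, and the sketch assumes each update receives tensors with equal frame counts.

```python
import torch


class RunningLVE:
    """Hypothetical accumulator; not the library's LipVertexError class."""

    def __init__(self, mouth_map):
        self.mouth_map = list(mouth_map)
        self.sum_of_maxima = torch.tensor(0.0)
        self.num_frames = 0

    def update(self, vertices_pred, vertices_gt):
        diff = vertices_gt[:, self.mouth_map, :] - vertices_pred[:, self.mouth_map, :]
        # Maximum squared L2 distance over lip vertices, one value per frame
        per_frame_max = diff.pow(2).sum(dim=-1).max(dim=1).values
        self.sum_of_maxima = self.sum_of_maxima + per_frame_max.sum()
        self.num_frames += per_frame_max.numel()

    def compute(self):
        # Mean over every frame seen so far, regardless of batch boundaries
        return self.sum_of_maxima / self.num_frames


gen = torch.Generator().manual_seed(0)
pred = torch.randn(10, 100, 3, generator=gen)
gt = torch.randn(10, 100, 3, generator=gen)

# Feed the sequence in two unequal chunks
chunked = RunningLVE(mouth_map=[0, 1, 2, 3, 4])
chunked.update(pred[:4], gt[:4])
chunked.update(pred[4:], gt[4:])

# Feed the same sequence in one call
one_shot = RunningLVE(mouth_map=[0, 1, 2, 3, 4])
one_shot.update(pred, gt)

print(torch.isclose(chunked.compute(), one_shot.compute()))  # tensor(True)
```

Accumulating a sum and a count, rather than averaging per-batch means, keeps the result independent of how the frames are split across update calls.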

plot(val=None, ax=None)

Plot a single or multiple values from the metric.

Parameters:

val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute, or a list of these results. If no value is provided, metric.compute is called automatically and its result is plotted.
ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot is added to that axis.

Return type:

tuple[Figure, Union[Axes, ndarray]]

Returns:

Figure and Axes object

Raises:

ModuleNotFoundError – If matplotlib is not installed

>>> # Example plotting a single value
>>> import torch
>>> from torchmetrics.multimodal.lve import LipVertexError
>>> metric = LipVertexError(mouth_map=[0, 1, 2, 3, 4])
>>> vertices_pred = torch.randn(10, 100, 3, generator=torch.manual_seed(42))
>>> vertices_gt = torch.randn(10, 100, 3, generator=torch.manual_seed(43))
>>> metric.update(vertices_pred, vertices_gt)
>>> fig_, ax_ = metric.plot()

>>> # Example plotting multiple values
>>> import torch
>>> from torchmetrics.multimodal.lve import LipVertexError
>>> metric = LipVertexError(mouth_map=[0, 1, 2, 3, 4])
>>> values = []
>>> for i in range(10):
...     vertices_pred = torch.randn(10, 100, 3, generator=torch.manual_seed(42 + i))
...     vertices_gt = torch.randn(10, 100, 3, generator=torch.manual_seed(43 + i))
...     values.append(metric(vertices_pred, vertices_gt))
>>> fig_, ax_ = metric.plot(values)

update(vertices_pred, vertices_gt)

Update metric states with predictions and targets.

Parameters:

vertices_pred (Tensor) – Predicted vertices tensor of shape (T, V, 3), where T is the number of frames, V is the number of vertices, and 3 represents the XYZ coordinates
vertices_gt (Tensor) – Ground truth vertices tensor of shape (T’, V, 3), where T’ can be different from T

Return type:

None

Functional Interface

torchmetrics.functional.multimodal.lve.lip_vertex_error(vertices_pred, vertices_gt, mouth_map, validate_args=True)

Compute Lip Vertex Error (LVE) for 3D talking head evaluation.

The Lip Vertex Error (LVE) metric evaluates the quality of lip synchronization in 3D facial animations by measuring, for each frame, the maximum squared Euclidean distance (squared L2 error) between corresponding lip vertices of the generated and ground truth meshes. The metric is defined as:

\[\text{LVE} = \frac{1}{N} \sum_{i=1}^{N} \max_{v \in \text{lip}} \|x_{i,v} - \hat{x}_{i,v}\|_2^2\]

where \(N\) is the number of frames, \(x_{i,v}\) represents the 3D coordinates of vertex \(v\) in the lip region of the ground truth frame \(i\), and \(\hat{x}_{i,v}\) represents the corresponding vertex in the predicted frame. The metric computes the maximum squared L2 distance between corresponding lip vertices for each frame and averages across all frames. A lower LVE value indicates better lip synchronization quality.

Parameters:

vertices_pred (Tensor) – Predicted vertices tensor of shape (T, V, 3), where T is the number of frames, V is the number of vertices, and 3 represents the XYZ coordinates
vertices_gt (Tensor) – Ground truth vertices tensor of shape (T’, V, 3), where T’ can be different from T
mouth_map (List[int]) – List of vertex indices corresponding to the mouth region
validate_args (bool) – Whether input arguments and tensors should be validated for correctness. Set to False for faster computation.
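The parameter descriptions allow the two sequences to have different frame counts (T and T’) but do not say how the frames are paired. One plausible approach, shown here purely as an assumption, is to truncate both sequences to the shorter length before comparing them frame by frame; `align_frames` is a hypothetical helper, not a library function.

```python
import torch


def align_frames(vertices_pred, vertices_gt):
    """Truncate both sequences to the shorter frame count.

    Assumption for illustration: the documentation does not specify
    how sequences of different lengths are aligned.
    """
    num_frames = min(vertices_pred.shape[0], vertices_gt.shape[0])
    return vertices_pred[:num_frames], vertices_gt[:num_frames]


pred = torch.zeros(10, 100, 3)   # T = 10 predicted frames
gt = torch.ones(12, 100, 3)      # T' = 12 ground truth frames

pred_aligned, gt_aligned = align_frames(pred, gt)
print(pred_aligned.shape, gt_aligned.shape)
# torch.Size([10, 100, 3]) torch.Size([10, 100, 3])
```

If the sequences need temporal resampling instead of truncation, that would have to happen before the tensors are passed to the metric.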

Returns:

Scalar tensor containing the mean LVE value across all frames

Return type:

torch.Tensor

Raises:

ValueError – If the number of dimensions of vertices_pred or vertices_gt is not 3; if the vertex dimension (V) or the coordinate dimension (3) of the two tensors does not match; or if mouth_map is empty or contains invalid indices.

Example

>>> import torch
>>> from torchmetrics.functional.multimodal import lip_vertex_error
>>> vertices_pred = torch.randn(10, 100, 3, generator=torch.manual_seed(42))
>>> vertices_gt = torch.randn(10, 100, 3, generator=torch.manual_seed(43))
>>> mouth_map = [0, 1, 2, 3, 4]
>>> lip_vertex_error(vertices_pred, vertices_gt, mouth_map)
tensor(12.7688)