Perceptual Evaluation of Speech Quality (PESQ)
Module Interface
- class torchmetrics.audio.pesq.PerceptualEvaluationSpeechQuality(fs, mode, n_processes=1, **kwargs)
Calculate Perceptual Evaluation of Speech Quality (PESQ).
It’s a recognized industry standard for audio quality that takes into consideration characteristics such as audio sharpness, call volume, background noise, clipping, and audio interference. PESQ returns a score between -0.5 and 4.5, with higher scores indicating better quality.
This metric is a wrapper for the pesq package. Note that input will be moved to cpu to perform the metric calculation.
As input to forward and update the metric accepts the following input
preds (Tensor): float tensor with shape (...,time)
target (Tensor): float tensor with shape (...,time)
As output of forward and compute the metric returns the following output
pesq (Tensor): float tensor of PESQ value reduced across the batch
Hint
Using this metric requires you to have pesq installed. Either install as pip install torchmetrics[audio] or pip install pesq. Note that pesq will compile with your currently installed version of numpy, meaning that if you upgrade numpy at some point in the future you will most likely have to reinstall pesq.
Caution
The forward and compute methods in this class return a single (reduced) PESQ value for a batch. To obtain a PESQ value for each sample, you may use the functional counterpart in perceptual_evaluation_speech_quality().
- Parameters:
fs (int) – sampling frequency, should be 16000 or 8000 (Hz)
mode – "wb" (wide-band) or "nb" (narrow-band)
keep_same_device – whether to move the pesq value to the device of preds
n_processes (int) – integer specifying the number of processes to run in parallel for the metric calculation. Only applies to batches of data and if the multiprocessing package is installed.
kwargs (Any) – Additional keyword arguments, see Advanced metric settings for more info.
- Raises:
ModuleNotFoundError – If pesq package is not installed
ValueError – If fs is not either 8000 or 16000
ValueError – If mode is not either "wb" or "nb"
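The two ValueError conditions listed above can be checked up front, before any audio is loaded. The following is a minimal sketch in plain Python; the helper name check_pesq_args and its error messages are hypothetical and are not part of torchmetrics or pesq.

```python
# Hypothetical pre-flight check mirroring the documented ValueError conditions.
# Neither the function name nor the messages come from torchmetrics.
VALID_FS = (8000, 16000)
VALID_MODES = ("wb", "nb")


def check_pesq_args(fs: int, mode: str) -> None:
    """Raise ValueError if fs or mode falls outside the documented values."""
    if fs not in VALID_FS:
        raise ValueError(f"fs must be 8000 or 16000, got {fs}")
    if mode not in VALID_MODES:
        raise ValueError(f"mode must be 'wb' or 'nb', got {mode!r}")


# A valid combination passes silently.
check_pesq_args(16000, "wb")

# Invalid combinations are reported instead of reaching the metric.
for fs, mode in [(44100, "wb"), (8000, "narrow")]:
    try:
        check_pesq_args(fs, mode)
    except ValueError as err:
        print(err)
```

A check like this is only a convenience for failing early; the metric itself performs its own validation as described in the Raises list.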
Example
>>> from torch import randn
>>> from torchmetrics.audio import PerceptualEvaluationSpeechQuality
>>> preds = randn(8000)
>>> target = randn(8000)
>>> pesq = PerceptualEvaluationSpeechQuality(8000, 'nb')
>>> pesq(preds, target)
tensor(2.2885)
>>> wb_pesq = PerceptualEvaluationSpeechQuality(16000, 'wb')
>>> wb_pesq(preds, target)
tensor(1.6805)
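The Caution above distinguishes the reduced value returned by the class from the per-sample values returned by the functional counterpart. The sketch below contrasts the two on a small batch of random signals. The batch size of 4 and the seed are arbitrary assumptions, and the import guard is an addition for illustration so the snippet degrades gracefully when the optional packages are missing.

```python
import importlib.util

# Packages the Hint above says are needed; the guard itself is illustrative.
REQUIRED = ("torch", "torchmetrics", "pesq")
missing = [name for name in REQUIRED if importlib.util.find_spec(name) is None]

if missing:
    print(f"Skipping PESQ example, missing packages: {missing}")
else:
    import torch
    from torchmetrics.audio import PerceptualEvaluationSpeechQuality
    from torchmetrics.functional.audio.pesq import (
        perceptual_evaluation_speech_quality,
    )

    torch.manual_seed(0)
    # A batch of 4 one-second signals at 8 kHz, shape (batch, time).
    preds = torch.randn(4, 8000)
    target = torch.randn(4, 8000)

    # Functional interface: one PESQ value per sample.
    per_sample = perceptual_evaluation_speech_quality(preds, target, 8000, "nb")

    # Module interface: a single value reduced across the batch.
    metric = PerceptualEvaluationSpeechQuality(8000, "nb")
    reduced = metric(preds, target)

    print(per_sample.shape)  # one entry per batch element
    print(reduced.shape)     # scalar tensor
```

Random noise is used here only because the existing example does the same; scores on real speech will differ.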
- plot(val=None, ax=None)
Plot a single or multiple values from the metric.
- Parameters:
val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute or a list of these results. If no value is provided, will automatically call metric.compute and plot that result.
ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot will be added to that axis.
- Return type:
- Returns:
Figure and Axes object
- Raises:
ModuleNotFoundError – If matplotlib is not installed
>>> # Example plotting a single value
>>> import torch
>>> from torchmetrics.audio import PerceptualEvaluationSpeechQuality
>>> metric = PerceptualEvaluationSpeechQuality(8000, 'nb')
>>> metric.update(torch.rand(8000), torch.rand(8000))
>>> fig_, ax_ = metric.plot()

>>> # Example plotting multiple values
>>> import torch
>>> from torchmetrics.audio import PerceptualEvaluationSpeechQuality
>>> metric = PerceptualEvaluationSpeechQuality(8000, 'nb')
>>> values = []
>>> for _ in range(10):
...     values.append(metric(torch.rand(8000), torch.rand(8000)))
>>> fig_, ax_ = metric.plot(values)
Functional Interface
- torchmetrics.functional.audio.pesq.perceptual_evaluation_speech_quality(preds, target, fs, mode, keep_same_device=False, n_processes=1)
Calculate Perceptual Evaluation of Speech Quality (PESQ).
It’s a recognized industry standard for audio quality that takes into consideration characteristics such as audio sharpness, call volume, background noise, clipping, and audio interference. PESQ returns a score between -0.5 and 4.5, with higher scores indicating better quality.
This metric is a wrapper for the pesq package. Note that input will be moved to cpu to perform the metric calculation.
Hint
Using this metric requires you to have pesq installed. Either install as pip install torchmetrics[audio] or pip install pesq. Note that pesq will compile with your currently installed version of numpy, meaning that if you upgrade numpy at some point in the future you will most likely have to reinstall pesq.
- Parameters:
preds (Tensor) – float tensor with shape (...,time)
target (Tensor) – float tensor with shape (...,time)
fs (int) – sampling frequency, should be 16000 or 8000 (Hz)
mode – "wb" (wide-band) or "nb" (narrow-band)
keep_same_device (bool) – whether to move the pesq value to the device of preds
n_processes (int) – integer specifying the number of processes to run in parallel for the metric calculation. Only applies to batches of data and if the multiprocessing package is installed.
- Return type:
- Returns:
Float tensor with shape (...,) of PESQ values per sample
- Raises:
ModuleNotFoundError – If pesq package is not installed
ValueError – If fs is not either 8000 or 16000
ValueError – If mode is not either "wb" or "nb"
RuntimeError – If preds and target do not have the same shape
Example
>>> from torch import randn
>>> from torchmetrics.functional.audio.pesq import perceptual_evaluation_speech_quality
>>> preds = randn(8000)
>>> target = randn(8000)
>>> perceptual_evaluation_speech_quality(preds, target, 8000, 'nb')
tensor(2.2885)
>>> perceptual_evaluation_speech_quality(preds, target, 16000, 'wb')
tensor(1.6805)
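Because the input is moved to cpu for the calculation, keep_same_device controls whether the result is moved back to the device of preds. The sketch below illustrates this on a batch of two signals. The device selection logic, the batch size, and the import guard are assumptions added for illustration; on a CPU-only machine the flag has no visible effect.

```python
import importlib.util

# Packages the Hint above says are needed; the guard itself is illustrative.
REQUIRED = ("torch", "torchmetrics", "pesq")
missing = [name for name in REQUIRED if importlib.util.find_spec(name) is None]

if missing:
    print(f"Skipping PESQ example, missing packages: {missing}")
else:
    import torch
    from torchmetrics.functional.audio.pesq import (
        perceptual_evaluation_speech_quality,
    )

    # Use a GPU when one is present; otherwise everything stays on the CPU.
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Two one-second signals at 16 kHz, shape (batch, time).
    preds = torch.randn(2, 16000, device=device)
    target = torch.randn(2, 16000, device=device)

    # With keep_same_device=True the scores are returned on preds.device.
    scores = perceptual_evaluation_speech_quality(
        preds, target, 16000, "wb", keep_same_device=True
    )
    print(scores.device, scores.shape)
```

Keeping the result on the original device avoids an extra transfer when the scores feed into further GPU-side computation.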