Non-Intrusive Speech Quality Assessment (NISQA v2.0)¶
Module Interface¶
- class torchmetrics.audio.nisqa.NonIntrusiveSpeechQualityAssessment(fs, **kwargs)[source]¶
Non-Intrusive Speech Quality Assessment (NISQA v2.0) [1], [2].
As input to
forward
andupdate
the metric accepts the following inputpreds
(Tensor
): float tensor with shape(...,time)
As output of
forward
andcompute
the metric returns the following outputnisqa
(Tensor
): float tensor reduced across the batch with shape(5,)
corresponding to overall MOS, noisiness, discontinuity, coloration and loudness in that order
Hint
Using this metric requires you to have
librosa
andrequests
installed. Install aspip install librosa requests
.Caution
The
forward
andcompute
methods in this class return values reduced across the batch. To obtain values for each sample, you may use the functional counterpartnon_intrusive_speech_quality_assessment()
.- Parameters:
- Raises:
ModuleNotFoundError – If
librosa
orrequests
are not installed
Example
>>> import torch >>> from torchmetrics.audio import NonIntrusiveSpeechQualityAssessment >>> _ = torch.manual_seed(42) >>> preds = torch.randn(16000) >>> nisqa = NonIntrusiveSpeechQualityAssessment(16000) >>> nisqa(preds) tensor([1.0433, 1.9545, 2.6087, 1.3460, 1.7117])
References
[1] G. Mittag and S. Möller, “Non-intrusive speech quality assessment for super-wideband speech communication networks”, in Proc. ICASSP, 2019.
[2] G. Mittag, B. Naderi, A. Chehadi and S. Möller, “NISQA: A deep CNN-self-attention model for multidimensional speech quality prediction with crowdsourced datasets”, in Proc. INTERSPEECH, 2021.
- plot(val=None, ax=None)[source]¶
Plot a single or multiple values from the metric.
- Parameters:
val¶ (
Union
[Tensor
,Sequence
[Tensor
],None
]) – Either a single result from callingmetric.forward
ormetric.compute
or a list of these results. If no value is provided, will automatically callmetric.compute
and plot that result.ax¶ (
Optional
[Axes
]) – A matplotlib axis object. If provided will add plot to that axis
- Return type:
- Returns:
Figure and Axes object
- Raises:
ModuleNotFoundError – If
matplotlib
is not installed
>>> # Example plotting a single value >>> import torch >>> from torchmetrics.audio import NonIntrusiveSpeechQualityAssessment >>> metric = NonIntrusiveSpeechQualityAssessment(16000) >>> metric.update(torch.randn(16000)) >>> fig_, ax_ = metric.plot()
>>> # Example plotting multiple values >>> import torch >>> from torchmetrics.audio import NonIntrusiveSpeechQualityAssessment >>> metric = NonIntrusiveSpeechQualityAssessment(16000) >>> values = [] >>> for _ in range(10): ... values.append(metric(torch.randn(16000))) >>> fig_, ax_ = metric.plot(values)
Functional Interface¶
- torchmetrics.functional.audio.nisqa.non_intrusive_speech_quality_assessment(preds, fs)[source]¶
Non-Intrusive Speech Quality Assessment (NISQA v2.0) [1], [2].
Hint
Usingsing this metric requires you to have
librosa
andrequests
installed. Install aspip install librosa requests
.- Parameters:
- Return type:
- Returns:
Float tensor with shape
(...,5)
corresponding to overall MOS, noisiness, discontinuity, coloration and loudness in that order- Raises:
ModuleNotFoundError – If
librosa
orrequests
are not installedRuntimeError – If the input is too short, causing the number of mel spectrogram windows to be zero
RuntimeError – If the input is too long, causing the number of mel spectrogram windows to exceed the maximum allowed
Example
>>> import torch >>> from torchmetrics.functional.audio.nisqa import non_intrusive_speech_quality_assessment >>> _ = torch.manual_seed(42) >>> preds = torch.randn(16000) >>> non_intrusive_speech_quality_assessment(preds, 16000) tensor([1.0433, 1.9545, 2.6087, 1.3460, 1.7117])
References
[1] G. Mittag and S. Möller, “Non-intrusive speech quality assessment for super-wideband speech communication networks”, in Proc. ICASSP, 2019.
[2] G. Mittag, B. Naderi, A. Chehadi and S. Möller, “NISQA: A deep CNN-self-attention model for multidimensional speech quality prediction with crowdsourced datasets”, in Proc. INTERSPEECH, 2021.