Non-Intrusive Speech Quality Assessment (NISQA v2.0)

Module Interface

class NonIntrusiveSpeechQualityAssessment(fs, **kwargs)

Non-Intrusive Speech Quality Assessment (NISQA v2.0) [1], [2].

As input to forward and update, the metric accepts the following input:

  • preds (Tensor): float tensor with shape (...,time)

As output of forward and compute, the metric returns the following output:

  • nisqa (Tensor): float tensor reduced across the batch with shape (5,) corresponding to overall MOS, noisiness, discontinuity, coloration and loudness in that order
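Because the five scores come back as a bare vector, it can help to attach names to them before logging or comparing. The following is a minimal sketch and not part of the library: `NISQA_DIMENSIONS` and `label_nisqa_scores` are hypothetical names introduced here for illustration, and the only thing taken from the description above is the order of the five dimensions.

```python
# Order of the five outputs, as stated in the output description above.
NISQA_DIMENSIONS = ("mos", "noisiness", "discontinuity", "coloration", "loudness")


def label_nisqa_scores(scores):
    """Map a length-5 score vector to a dict keyed by dimension name.

    Hypothetical helper. `scores` may be a 1-D tensor (anything that
    offers `.tolist()`) or a plain sequence of five numbers.
    """
    values = scores.tolist() if hasattr(scores, "tolist") else list(scores)
    if len(values) != len(NISQA_DIMENSIONS):
        raise ValueError(
            f"expected {len(NISQA_DIMENSIONS)} scores, got {len(values)}"
        )
    return {name: float(value) for name, value in zip(NISQA_DIMENSIONS, values)}
```

With a metric result in hand, `label_nisqa_scores(result)["mos"]` would then read the overall MOS without relying on a positional index.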


Using this metric requires you to have librosa and requests installed. Install as pip install librosa requests.


The forward and compute methods in this class return values reduced across the batch. To obtain values for each sample, you may use the functional counterpart non_intrusive_speech_quality_assessment().
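To make the difference between the two interfaces concrete, the sketch below collapses per-sample score vectors of shape (batch, 5) into one vector of shape (5,). It assumes the batch reduction is an arithmetic mean, which this page does not state explicitly; `reduce_across_batch` is a hypothetical helper, written with plain lists so it runs without any audio dependencies.

```python
def reduce_across_batch(per_sample_scores):
    """Average per-sample score vectors into a single score vector.

    Assumption: the reduction across the batch is an arithmetic mean.
    `per_sample_scores` is a list of equal-length lists, for example a
    (batch, 5) tensor converted with `.tolist()`.
    """
    if not per_sample_scores:
        raise ValueError("need at least one sample")
    n_samples = len(per_sample_scores)
    # zip(*rows) iterates column by column, i.e. one quality dimension at a time.
    return [sum(column) / n_samples for column in zip(*per_sample_scores)]
```

Under the same assumption, the tensor equivalent would be `scores.reshape(-1, 5).mean(dim=0)`.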


Parameters:

fs (int) – sampling frequency of input


Raises:

ModuleNotFoundError – If librosa or requests are not installed


>>> import torch
>>> from import NonIntrusiveSpeechQualityAssessment
>>> _ = torch.manual_seed(42)
>>> preds = torch.randn(16000)
>>> nisqa = NonIntrusiveSpeechQualityAssessment(16000)
>>> nisqa(preds)
tensor([1.0433, 1.9545, 2.6087, 1.3460, 1.7117])


  • [1] G. Mittag and S. Möller, “Non-intrusive speech quality assessment for super-wideband speech communication networks”, in Proc. ICASSP, 2019.

  • [2] G. Mittag, B. Naderi, A. Chehadi and S. Möller, “NISQA: A deep CNN-self-attention model for multidimensional speech quality prediction with crowdsourced datasets”, in Proc. INTERSPEECH, 2021.

plot(val=None, ax=None)

Plot a single or multiple values from the metric.

Parameters:

  • val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute or a list of these results. If no value is provided, will automatically call metric.compute and plot that result.

  • ax (Optional[Axes]) – A matplotlib axis object. If provided will add plot to that axis

Return type:

tuple[Figure, Union[Axes, ndarray]]


Returns:

Figure and Axes object


Raises:

ModuleNotFoundError – If matplotlib is not installed

>>> # Example plotting a single value
>>> import torch
>>> from import NonIntrusiveSpeechQualityAssessment
>>> metric = NonIntrusiveSpeechQualityAssessment(16000)
>>> metric.update(torch.randn(16000))
>>> fig_, ax_ = metric.plot()
>>> # Example plotting multiple values
>>> import torch
>>> from import NonIntrusiveSpeechQualityAssessment
>>> metric = NonIntrusiveSpeechQualityAssessment(16000)
>>> values = []
>>> for _ in range(10):
...     values.append(metric(torch.randn(16000)))
>>> fig_, ax_ = metric.plot(values)

Functional Interface

non_intrusive_speech_quality_assessment(preds, fs)

Non-Intrusive Speech Quality Assessment (NISQA v2.0) [1], [2].


Using this metric requires you to have librosa and requests installed. Install as pip install librosa requests.

Parameters:

  • preds (Tensor) – float tensor with shape (...,time)

  • fs (int) – sampling frequency of input

Return type:

Tensor

Returns:

Float tensor with shape (...,5) corresponding to overall MOS, noisiness, discontinuity, coloration and loudness in that order

Raises:

  • ModuleNotFoundError – If librosa or requests are not installed

  • RuntimeError – If the input is too short, causing the number of mel spectrogram windows to be zero

  • RuntimeError – If the input is too long, causing the number of mel spectrogram windows to exceed the maximum allowed
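When scoring a collection of clips of varying length, the two RuntimeError cases listed above may be easier to handle per clip than to let them abort the whole run. The sketch below shows one way to do that under stated assumptions: `score_or_none` is a hypothetical wrapper, and `score_fn` stands in for any callable with the `(preds, fs)` signature, such as the functional metric documented here. Note that it catches every RuntimeError, not only the two length-related ones, so unrelated failures would be skipped as well.

```python
def score_or_none(score_fn, preds, fs):
    """Return score_fn(preds, fs), or None if it raises RuntimeError.

    Hypothetical wrapper. `score_fn` is any callable taking (preds, fs);
    the caller decides what to do with clips that come back as None.
    """
    try:
        return score_fn(preds, fs)
    except RuntimeError as err:
        # Too-short and too-long inputs are reported as RuntimeError.
        print(f"skipping clip: {err}")
        return None
```

A caller could then filter a list of clips with `[s for s in (score_or_none(fn, clip, fs) for clip in clips) if s is not None]`.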


>>> import torch
>>> from import non_intrusive_speech_quality_assessment
>>> _ = torch.manual_seed(42)
>>> preds = torch.randn(16000)
>>> non_intrusive_speech_quality_assessment(preds, 16000)
tensor([1.0433, 1.9545, 2.6087, 1.3460, 1.7117])

