Speech-to-Reverberation Modulation Energy Ratio (SRMR)
Module Interface
- class torchmetrics.audio.srmr.SpeechReverberationModulationEnergyRatio(fs, n_cochlear_filters=23, low_freq=125, min_cf=4, max_cf=None, norm=False, fast=False, **kwargs)
Calculate Speech-to-Reverberation Modulation Energy Ratio (SRMR).
SRMR is a non-intrusive metric for speech quality and intelligibility based on a modulation spectral representation of the speech signal. This code is translated from SRMRToolbox and SRMRpy.
As input to forward and update, the metric accepts the following input:
preds (Tensor): float tensor with shape (..., time)
As output of forward and compute, the metric returns the following output:
srmr (Tensor): float scalar tensor
Note
Using this metric requires you to have gammatone and torchaudio installed. Either install as pip install torchmetrics[audio], or run pip install torchaudio and pip install git+https://github.com/detly/gammatone.
Note
This implementation is experimental and might not be consistent with the MATLAB implementation SRMRToolbox, especially the fast implementation. The slow versions, a) fast=False, norm=False, max_cf=128 and b) fast=False, norm=True, max_cf=30, have a relatively small inconsistency.
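Since the metric relies on two optional packages, it can be convenient to check for them before constructing it. The following is a minimal sketch using only the standard library; the helper name `missing_srmr_dependencies` is hypothetical and not part of torchmetrics.

```python
from importlib.util import find_spec


def missing_srmr_dependencies(required=("gammatone", "torchaudio")):
    """Return the names of packages in `required` that cannot be located.

    Hypothetical helper for illustration; it is not a torchmetrics API.
    """
    missing = []
    for name in required:
        try:
            found = find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec may raise for broken or partially initialised packages
            found = False
        if not found:
            missing.append(name)
    return missing


missing = missing_srmr_dependencies()
if missing:
    print("Install before using SRMR:", ", ".join(missing))
else:
    print("gammatone and torchaudio were found.")
```

The check only locates the packages without importing them, so it stays cheap even when torchaudio is installed.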
- Parameters:
n_cochlear_filters (int) – Number of filters in the acoustic filterbank.
low_freq (float) – Determines the frequency cutoff for the corresponding gammatone filterbank.
min_cf (float) – Center frequency in Hz of the first modulation filter.
max_cf (Optional[float]) – Center frequency in Hz of the last modulation filter. If None is given, then 30 Hz will be used for norm==False, otherwise 128 Hz will be used.
fast (bool) – Use the faster version based on the gammatonegram. Note: this argument is inherited from SRMRpy. As the translated code is based on PyTorch, setting fast=True may slow down the calculation of this metric on GPU.
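To give a feel for what min_cf and max_cf control, the sketch below spaces modulation-filter center frequencies geometrically between the two bounds. The channel count of 8 and the geometric spacing are assumptions borrowed from the SRMRpy reference code, and `modulation_center_freqs` is a hypothetical helper; the torchmetrics internals may differ.

```python
def modulation_center_freqs(min_cf=4.0, max_cf=128.0, n_filters=8):
    """Return n_filters center frequencies (Hz), geometrically spaced
    from min_cf to max_cf inclusive.

    Hypothetical helper; n_filters=8 is an assumption, not a documented
    torchmetrics parameter.
    """
    if n_filters < 2:
        raise ValueError("n_filters must be at least 2")
    if not 0 < min_cf < max_cf:
        raise ValueError("expected 0 < min_cf < max_cf")
    # Constant ratio between neighbouring center frequencies
    ratio = (max_cf / min_cf) ** (1.0 / (n_filters - 1))
    return [min_cf * ratio**k for k in range(n_filters)]


cfs = modulation_center_freqs(4.0, 128.0)
print([round(cf, 2) for cf in cfs])
```

Under these assumptions, lowering max_cf from 128 to 30 keeps the first filter at min_cf and packs the remaining filters into a narrower modulation range.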
- Raises:
ModuleNotFoundError – If the gammatone or torchaudio package is not installed
Example
>>> import torch
>>> from torchmetrics.audio import SpeechReverberationModulationEnergyRatio
>>> g = torch.manual_seed(1)
>>> preds = torch.randn(8000)
>>> srmr = SpeechReverberationModulationEnergyRatio(8000)
>>> srmr(preds)
tensor(0.3354)
- plot(val=None, ax=None)
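The module interface can also accumulate state over several update calls before compute is called. The sketch below assumes `torchmetrics[audio]` is installed and returns `None` otherwise; `accumulated_srmr` is a hypothetical wrapper, and the random tensors are only placeholders for real speech.

```python
def accumulated_srmr(n_batches=3, batch_size=2, fs=8000):
    """Feed several random batches to the metric and return compute(),
    or None when torch, torchmetrics, gammatone, or torchaudio is missing.

    Hypothetical wrapper for illustration only.
    """
    try:
        import torch
        from torchmetrics.audio import SpeechReverberationModulationEnergyRatio

        metric = SpeechReverberationModulationEnergyRatio(fs)
        for _ in range(n_batches):
            # Each batch has shape (batch, time), matching (..., time)
            metric.update(torch.randn(batch_size, fs))
        return metric.compute()
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError
        return None


result = accumulated_srmr()
if result is None:
    print("SRMR dependencies are not installed; skipping.")
else:
    print(float(result))
```

Per the output description above, compute is expected to return a single scalar tensor regardless of how many batches were passed to update.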
Plot a single or multiple values from the metric.
- Parameters:
val (Union[Tensor, Sequence[Tensor], None]) – Either a single result from calling metric.forward or metric.compute, or a list of these results. If no value is provided, metric.compute is called automatically and that result is plotted.
ax (Optional[Axes]) – A matplotlib axis object. If provided, the plot is added to that axis.
- Return type:
- Returns:
Figure and Axes object
- Raises:
ModuleNotFoundError – If matplotlib is not installed
>>> # Example plotting a single value
>>> import torch
>>> from torchmetrics.audio import SpeechReverberationModulationEnergyRatio
>>> metric = SpeechReverberationModulationEnergyRatio(8000)
>>> metric.update(torch.rand(8000))
>>> fig_, ax_ = metric.plot()

>>> # Example plotting multiple values
>>> import torch
>>> from torchmetrics.audio import SpeechReverberationModulationEnergyRatio
>>> metric = SpeechReverberationModulationEnergyRatio(8000)
>>> values = []
>>> for _ in range(10):
...     values.append(metric(torch.rand(8000)))
>>> fig_, ax_ = metric.plot(values)
Functional Interface
- torchmetrics.functional.audio.srmr.speech_reverberation_modulation_energy_ratio(preds, fs, n_cochlear_filters=23, low_freq=125, min_cf=4, max_cf=None, norm=False, fast=False)
Calculate Speech-to-Reverberation Modulation Energy Ratio (SRMR).
SRMR is a non-intrusive metric for speech quality and intelligibility based on a modulation spectral representation of the speech signal. This code is translated from SRMRToolbox and SRMRpy.
- Parameters:
n_cochlear_filters (int) – Number of filters in the acoustic filterbank.
low_freq (float) – Determines the frequency cutoff for the corresponding gammatone filterbank.
min_cf (float) – Center frequency in Hz of the first modulation filter.
max_cf (Optional[float]) – Center frequency in Hz of the last modulation filter. If None is given, then 30 Hz will be used for norm==False, otherwise 128 Hz will be used.
fast (bool) – Use the faster version based on the gammatonegram. Note: this argument is inherited from SRMRpy. As the translated code is based on PyTorch, setting fast=True may slow down the calculation of this metric on GPU.
Note
Using this metric requires you to have gammatone and torchaudio installed. Either install as pip install torchmetrics[audio], or run pip install torchaudio and pip install git+https://github.com/detly/gammatone.
Note
This implementation is experimental and might not be consistent with the MATLAB implementation SRMRToolbox, especially the fast implementation. The slow versions, a) fast=False, norm=False, max_cf=128 and b) fast=False, norm=True, max_cf=30, have a relatively small inconsistency.
- Returns:
srmr value, shape (...)
- Return type:
Tensor
- Raises:
ModuleNotFoundError – If the gammatone or torchaudio package is not installed
Example
>>> import torch
>>> from torchmetrics.functional.audio import speech_reverberation_modulation_energy_ratio
>>> g = torch.manual_seed(1)
>>> preds = torch.randn(8000)
>>> speech_reverberation_modulation_energy_ratio(preds, 8000)
tensor([0.3354], dtype=torch.float64)
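The functional form is documented as returning a value of shape (...), so a batch of signals should yield one score per signal. The sketch below assumes `torchmetrics[audio]` is installed and returns `None` otherwise; `batched_srmr` is a hypothetical wrapper, and the random input is only a placeholder for real speech.

```python
def batched_srmr(batch_size=2, fs=8000):
    """Score a batch of random one-second signals, or return None when
    torch, torchmetrics, gammatone, or torchaudio cannot be imported.

    Hypothetical wrapper for illustration only.
    """
    try:
        import torch
        from torchmetrics.functional.audio import (
            speech_reverberation_modulation_energy_ratio,
        )

        preds = torch.randn(batch_size, fs)  # shape (batch, time)
        return speech_reverberation_modulation_energy_ratio(preds, fs)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError
        return None


scores = batched_srmr()
if scores is None:
    print("SRMR dependencies are not installed; skipping.")
else:
    print(tuple(scores.shape))
```

Following the documented return shape, a (2, 8000) input should produce a result of shape (2,), with the time dimension reduced away.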