.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "gallery/audio/pesq.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end <sphx_glr_download_gallery_audio_pesq.py>`
        to download the full example code.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_gallery_audio_pesq.py:


Evaluating Speech Quality with the PESQ Metric
================================================

This notebook walks through computing the Perceptual Evaluation of Speech Quality (PESQ) score,
a metric commonly used to assess how well noise reduction and enhancement techniques improve speech quality.
PESQ is widely adopted in telecommunications, VoIP, and audio processing,
where it provides an objective estimate of speech quality as perceived by a human listener.

Imagine trying to hold a phone call on a noisy street. Behind the scenes, the technology aims
to clean up your voice so that it sounds clearer on the other end. But how do engineers measure that improvement?
This is where PESQ comes in. In this notebook, we simulate a similar scenario: we apply a simple noise reduction
technique and use the PESQ score to evaluate how the speech quality changes.

.. GENERATED FROM PYTHON SOURCE LINES 17-18

Import necessary libraries

.. GENERATED FROM PYTHON SOURCE LINES 18-25

.. code-block:: Python
   :lineno-start: 18

    import matplotlib.pyplot as plt
    import numpy as np
    import torch
    import torchaudio

    from torchmetrics.audio import PerceptualEvaluationSpeechQuality








.. GENERATED FROM PYTHON SOURCE LINES 26-28

Generate Synthetic Clean and Noisy Audio Signals

We'll generate a clean sine wave (standing in for a clean speech signal) and add white noise to simulate the noisy version.

.. GENERATED FROM PYTHON SOURCE LINES 28-54

.. code-block:: Python
   :lineno-start: 30



    def generate_sine_wave(frequency, duration, sample_rate, amplitude: float = 0.5):
        """Generate a clean sine wave at a given frequency."""
        t = torch.linspace(0, duration, int(sample_rate * duration))
        return amplitude * torch.sin(2 * np.pi * frequency * t)


    def add_noise(waveform: torch.Tensor, noise_factor: float = 0.05) -> torch.Tensor:
        """Add white noise to a waveform."""
        noise = noise_factor * torch.randn(waveform.size())
        return waveform + noise


    # Parameters for the synthetic audio
    sample_rate = 16000  # 16 kHz typical for speech
    duration = 3  # 3 seconds of audio
    frequency = 440  # A4 note, can represent a simple speech-like tone

    # Generate the clean sine wave
    clean_waveform = generate_sine_wave(frequency, duration, sample_rate)

    # Generate the noisy waveform by adding white noise
    noisy_waveform = add_noise(clean_waveform)
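
As a rough sanity check on the noise level, the signal-to-noise ratio implied by the defaults above can be worked out by hand.
The sketch below is not part of the original example: it uses only the standard library, and ``snr_db`` is a hypothetical helper name.
It assumes that a sine of amplitude ``A`` has average power ``A**2 / 2`` and that white Gaussian noise scaled by ``noise_factor`` has power ``noise_factor**2``.

.. code-block:: Python

    import math


    def snr_db(amplitude: float, noise_factor: float) -> float:
        """Theoretical SNR in dB for a sine of the given amplitude plus white Gaussian noise."""
        signal_power = amplitude**2 / 2  # mean power of a sine wave
        noise_power = noise_factor**2  # variance of noise_factor * N(0, 1)
        return 10 * math.log10(signal_power / noise_power)


    # Defaults used above: amplitude=0.5, noise_factor=0.05
    print(f"Theoretical SNR: {snr_db(0.5, 0.05):.2f} dB")

With these defaults the power ratio is 0.125 / 0.0025 = 50, which is roughly 17 dB, so the synthetic noise is fairly mild.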









.. GENERATED FROM PYTHON SOURCE LINES 55-58

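Apply a Basic Noise Reduction Technique

In this step, we apply a simple spectral gating method for noise reduction: torchaudio's
``Spectrogram`` transform computes the spectrogram, values below a threshold are zeroed out,
and ``GriffinLim`` converts the result back to a waveform. This simulates the enhancement of noisy speech.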

.. GENERATED FROM PYTHON SOURCE LINES 58-75

.. code-block:: Python
   :lineno-start: 60



    def reduce_noise(noisy_signal: torch.Tensor, threshold: float = 0.2) -> torch.Tensor:
        """Basic noise reduction using spectral gating."""
        # Compute the spectrogram
        spec = torchaudio.transforms.Spectrogram()(noisy_signal)

        # Apply threshold-based gating: values below the threshold will be zeroed out
        spec_denoised = spec * (spec > threshold)

        # Convert back to the waveform
        return torchaudio.transforms.GriffinLim()(spec_denoised)


    # Apply noise reduction to the noisy waveform
    enhanced_waveform = reduce_noise(noisy_waveform)
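
To illustrate the gating idea in isolation, here is a minimal, dependency-free sketch on a single frame.
It is not the torchaudio pipeline used above: it uses a naive DFT, gates bins by their power, and inverts
using the original phase rather than Griffin-Lim. The helper names (``dft``, ``idft``, ``gate_frame``),
the toy frame, and the threshold value are illustrative assumptions.

.. code-block:: Python

    import cmath
    import math


    def dft(frame):
        """Naive discrete Fourier transform of a real-valued frame."""
        n = len(frame)
        return [
            sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(frame))
            for k in range(n)
        ]


    def idft(spectrum):
        """Inverse of ``dft``; returns the real part of each sample."""
        n = len(spectrum)
        return [
            sum(c * cmath.exp(2j * math.pi * k * i / n) for k, c in enumerate(spectrum)).real / n
            for i in range(n)
        ]


    def gate_frame(frame, threshold):
        """Zero every bin whose power is not above ``threshold``; keep the rest, phase included."""
        spectrum = dft(frame)
        gated = [c if abs(c) ** 2 > threshold else 0j for c in spectrum]
        return idft(gated)


    # A 64-sample frame: a strong tone in bin 4 plus a much weaker tone in bin 10
    n = 64
    frame = [
        0.5 * math.sin(2 * math.pi * 4 * i / n) + 0.01 * math.sin(2 * math.pi * 10 * i / n)
        for i in range(n)
    ]
    cleaned = gate_frame(frame, threshold=1.0)

In this toy frame the weak tone falls below the threshold and is removed, while the strong tone passes through unchanged.
Real noise is spread across all bins, so a fixed threshold typically removes only part of it.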








.. GENERATED FROM PYTHON SOURCE LINES 76-79

Initialize the PESQ Metric

PESQ can be computed in two modes: ``'wb'`` (wideband) or ``'nb'`` (narrowband).
Here, we use ``'wb'`` mode for wideband speech quality evaluation at the 16 kHz sample rate defined above.

.. GENERATED FROM PYTHON SOURCE LINES 79-81

.. code-block:: Python
   :lineno-start: 79

    pesq_metric = PerceptualEvaluationSpeechQuality(fs=sample_rate, mode="wb")
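
When the sample rate or mode comes from user input, it can help to check the combination before constructing the metric.
The sketch below is a hypothetical pre-check, not part of torchmetrics. It assumes that PESQ accepts only 8 kHz or 16 kHz audio,
that the mode must be ``'wb'`` or ``'nb'``, and that wideband mode requires 16 kHz; verify these constraints against the
documentation of the version you have installed.

.. code-block:: Python

    def validate_pesq_args(fs: int, mode: str) -> None:
        """Hypothetical pre-check mirroring the constraints assumed in the text above."""
        if fs not in (8000, 16000):
            raise ValueError(f"Expected fs to be 8000 or 16000, got {fs}")
        if mode not in ("wb", "nb"):
            raise ValueError(f"Expected mode to be 'wb' or 'nb', got {mode!r}")
        if mode == "wb" and fs != 16000:
            raise ValueError("Wideband mode assumes 16 kHz audio")


    validate_pesq_args(16000, "wb")  # passes silently for the settings used above
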








.. GENERATED FROM PYTHON SOURCE LINES 82-86

Compute PESQ Scores

We calculate the PESQ scores for both the noisy and the enhanced versions against the clean signal.
Each score is a numerical estimate of how close the evaluated signal is to the clean reference
in perceived quality. Higher scores indicate better quality.

.. GENERATED FROM PYTHON SOURCE LINES 86-93

.. code-block:: Python
   :lineno-start: 87


    pesq_noisy = pesq_metric(clean_waveform, noisy_waveform)
    pesq_enhanced = pesq_metric(clean_waveform, enhanced_waveform)

    print(f"PESQ Score for Noisy Audio: {pesq_noisy.item():.4f}")
    print(f"PESQ Score for Enhanced Audio: {pesq_enhanced.item():.4f}")





.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    PESQ Score for Noisy Audio: 3.0675
    PESQ Score for Enhanced Audio: 1.2623




.. GENERATED FROM PYTHON SOURCE LINES 94-96

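Visualize the Waveforms

We can plot the waveforms of the clean, noisy, and enhanced audio to see the differences.
In the run shown here, the enhanced signal scores lower than the noisy one, so this basic gating
followed by Griffin-Lim reconstruction did not improve the measured quality.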

.. GENERATED FROM PYTHON SOURCE LINES 96-119

.. code-block:: Python
   :lineno-start: 96

    fig, axs = plt.subplots(3, 1, figsize=(12, 9))

    # Plot clean waveform
    axs[0].plot(clean_waveform.numpy())
    axs[0].set_title("Clean Audio Waveform (Sine Wave)")
    axs[0].set_xlabel("Time")
    axs[0].set_ylabel("Amplitude")

    # Plot noisy waveform
    axs[1].plot(noisy_waveform.numpy(), color="orange")
    axs[1].set_title(f"Noisy Audio Waveform (PESQ: {pesq_noisy.item():.4f})")
    axs[1].set_xlabel("Time")
    axs[1].set_ylabel("Amplitude")

    # Plot enhanced waveform
    axs[2].plot(enhanced_waveform.numpy(), color="green")
    axs[2].set_title(f"Enhanced Audio Waveform (PESQ: {pesq_enhanced.item():.4f})")
    axs[2].set_xlabel("Time")
    axs[2].set_ylabel("Amplitude")

    # Adjust layout for better visualization
    fig.tight_layout()
    plt.show()



.. image-sg:: /gallery/audio/images/sphx_glr_pesq_001.png
   :alt: Clean Audio Waveform (Sine Wave), Noisy Audio Waveform (PESQ: 3.0675), Enhanced Audio Waveform (PESQ: 1.2623)
   :srcset: /gallery/audio/images/sphx_glr_pesq_001.png
   :class: sphx-glr-single-img






.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 0.620 seconds)


.. _sphx_glr_download_gallery_audio_pesq.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: pesq.ipynb <pesq.ipynb>`

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: pesq.py <pesq.py>`

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: pesq.zip <pesq.zip>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_