Audio Declipping

A Survey and an Extensive Evaluation of Popular Audio Declipping Methods

Pavel Záviška, Pavel Rajmic, Alexey Ozerov, Lucas Rencker

Audio Declipping Performance Enhancement via Crossfading

Pavel Záviška, Pavel Rajmic, Ondřej Mokrý

Audio Declipping with (Weighted) Analysis Social Sparsity

Pavel Záviška, Pavel Rajmic

About

This is a website accompanying three articles about audio declipping (see the abstracts below). Here you can find supplementary material: a detailed description of the audio dataset used, additional plots with results for each individual audio excerpt, and a link to the repository with the MATLAB source code. You can also listen to the restored audio excerpts.

A Survey and an Extensive Evaluation of Popular Audio Declipping Methods

Abstract: Dynamic range limitations in signal processing often lead to clipping, or saturation, in signals. Audio declipping is the task of estimating the original audio signal given its clipped measurements and has attracted a lot of interest in recent years. Audio declipping algorithms often make assumptions about the underlying signal, such as sparsity or low-rankness, as well as the measurement system. In this paper, we provide an extensive review of audio declipping algorithms proposed in the literature. For each algorithm, we present the assumptions being made about the audio signal, the modeling domain, as well as the optimization algorithm. Furthermore, we provide an extensive numerical evaluation of popular declipping algorithms, on real audio data. We evaluate each algorithm in terms of the Signal-to-Distortion Ratio, as well as using perceptual metrics of sound quality. The article is accompanied with the repository containing the evaluated methods.

Full-text: IEEE Xplore, arXiv postprint.
Citations: Plain Text, BibTeX.

Audio declipping performance enhancement via crossfading

Abstract: Some audio declipping methods produce waveforms that do not fully respect the physical process of clipping, which is why we refer to them as inconsistent. This letter reports what effect on perception it has if the solution by inconsistent methods is forced consistent by postprocessing. We first propose a simple sample replacement method, then we identify its main weaknesses and propose an improved variant. The experiments show that the vast majority of inconsistent declipping methods significantly benefit from the proposed approach in terms of objective perceptual metrics. In particular, we show that the SS PEW method based on social sparsity combined with the proposed method performs comparable to top methods from the consistent class, but at a computational cost of one order of magnitude lower.

Full-text: ScienceDirect, arXiv preprint.
Citations: Plain Text, BibTeX.

Audio Declipping with (Weighted) Analysis Social Sparsity

Abstract: We develop the analysis (cosparse) variant of the popular audio declipping algorithm of Siedenburg et al. (2014). Furthermore, we extend both the old and the new variants by the possibility of weighting the time-frequency coefficients. We examine the audio reconstruction performance of several combinations of weights and shrinkage operators. The weights are shown to improve the reconstruction quality in some cases; however, the best scores achieved by the non-weighted methods are not surpassed with the help of weights. Yet, the analysis Empirical Wiener (EW) shrinkage was able to reach the quality of a computationally more expensive competitor, the Persistent Empirical Wiener (PEW). Moreover, the proposed analysis variant incorporating PEW slightly outperforms the synthesis counterpart in terms of an auditorily motivated metric.

Full-text: IEEE Xplore, arXiv postprint.
Citations: Plain Text, BibTeX.

Reproducible Research

Following the idea of reproducible research, we make all the implementations freely available at the GitHub repository.
Please note that the LTFAT toolbox (version >= 2.4.0, available here) must be installed and loaded in order to run the scripts and reproduce the results.

Algorithms

The following table contains abbreviations and full names of the algorithms used in the evaluation.
Algorithms inconsistent in the reliable part of the clipped signal are marked with an asterisk *.
Note that the analysis variant of the Social Sparsity declipper (ASS) was introduced later, in the above-mentioned article Audio Declipping with (Weighted) Analysis Social Sparsity, and thus it is part of neither the Declipping Survey nor the Crossfading article.

Abbreviation	Full name
C-OMP*	Constrained Orthogonal Matching Pursuit
A-SPADE	Analysis SParse Audio DEclipper
S-SPADE	Synthesis SParse Audio DEclipper
ℓ₁ CP	ℓ₁-minimization using Chambolle–Pock
ℓ₁ DR	ℓ₁-minimization using Douglas–Rachford
Rℓ₁CC CP	Reweighted ℓ₁-minimization with Clipping Constraints using Chambolle–Pock (analysis)
Rℓ₁CC DR	Reweighted ℓ₁-minimization with Clipping Constraints using Douglas–Rachford (synthesis)
SS EW*	Social Sparsity with Empirical Wiener
SS PEW*	Social Sparsity with Persistent Empirical Wiener
CSL1*	Compressed Sensing method minimizing ℓ₁-norm
PCSL1*	Perceptual Compressed Sensing method minimizing ℓ₁-norm
PWCSL1*	Parabola-Weighted Compressed Sensing method minimizing ℓ₁-norm
PWℓ₁ CP	Parabola-Weighted ℓ₁-minimization using Chambolle–Pock (analysis)
PWℓ₁ DR	Parabola-Weighted ℓ₁-minimization using Douglas–Rachford (synthesis)
DL*	Dictionary Learning approach
NMF	Nonnegative Matrix Factorization
Janssen	Janssen method for inpainting
ASS EW*	Analysis Social Sparsity with Empirical Wiener
ASS PEW*	Analysis Social Sparsity with Persistent Empirical Wiener
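
The asterisk in the table above refers to consistency with the clipping process. As a rough illustration, the sketch below assumes a symmetric hard-clipping model with threshold `theta`. A restored signal is consistent in the reliable part if it equals the observation wherever the observation was not clipped, and fully consistent if it additionally stays at or beyond the threshold on the clipped samples. All function names and the tolerance `tol` are hypothetical and are not taken from the MATLAB repository.

```python
def hard_clip(x, theta):
    """Hard-clip every sample of x to the interval [-theta, theta]."""
    return [max(-theta, min(theta, s)) for s in x]


def masks(y, theta):
    """Return boolean masks: reliable, clipped from above, clipped from below."""
    reliable = [abs(s) < theta for s in y]
    high = [s >= theta for s in y]
    low = [s <= -theta for s in y]
    return reliable, high, low


def is_consistent_in_reliable_part(restored, y, theta, tol=1e-12):
    """True if the restored signal matches the observation on reliable samples."""
    reliable, _, _ = masks(y, theta)
    return all(abs(r - s) <= tol
               for r, s, rel in zip(restored, y, reliable) if rel)


def is_fully_consistent(restored, y, theta, tol=1e-12):
    """Reliable samples must match; clipped samples must lie at or beyond theta."""
    reliable, high, low = masks(y, theta)
    for r, s, rel, hi, lo in zip(restored, y, reliable, high, low):
        if rel and abs(r - s) > tol:
            return False
        if hi and r < theta - tol:
            return False
        if lo and r > -theta + tol:
            return False
    return True


# Toy example (made-up numbers, not taken from the dataset).
x = [0.1, 0.6, 0.9, 0.4, -0.7, -0.2]      # original signal
theta = 0.5                                # clipping threshold
y = hard_clip(x, theta)                    # observed, clipped signal

# A restoration that alters a reliable sample (index 0) is inconsistent there.
altered = [0.12, 0.6, 0.9, 0.4, -0.7, -0.2]
print(is_fully_consistent(x, y, theta))
print(is_consistent_in_reliable_part(altered, y, theta))
```

In this toy case the original signal passes both checks, while the restoration that changes a reliable sample fails the reliable-part check. That failure corresponds to the kind of method marked with * above.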

Audio Excerpts

The audio database used for the evaluation consists of 10 musical excerpts in mono, sampled at 44.1 kHz, each approximately 7 seconds long. They were extracted from the EBU SQAM database. The excerpts were carefully selected to cover a wide range of audio signal characteristics. Since a significant number of the methods are based on signal sparsity, the selection took care to include signals with different levels of sparsity (w.r.t. the Gabor transform).

01. violin

02. clarinet

03. bassoon

04. harp

05. glockenspiel

06. celesta

07. accordion

08. guitar

09. piano

10. wind ensemble

The table below contains listenable excerpts from all three articles. It is possible to select the initial level of degradation (input SDR) and the displayed evaluation metric. The postprocessing switch is relevant only for the algorithms inconsistent in the reliable part, which are marked with *. To listen to results related only to the Audio Declipping Survey, leave the option “Inconsistent restoration” switched on. The other two options relate to methods presented in the article Audio declipping performance enhancement via crossfading.

Playback can be started by clicking on one of the table cells (the cells turn light blue when the cursor hovers over them). Your browser must support the HTML5 audio player. Alternatively, the file path is shown below the player, and the file can be downloaded using Save Link As ...

Select input SDR: 1 dB 3 dB 5 dB 7 dB 10 dB 15 dB 20 dB
Select table values: None ∆SDR_c PEAQ ODG PEMO-Q ODG Rnonlin
Select postprocessing of reliable samples: Inconsistent restoration Replace Reliable Crossfaded Replace

	01	02	03	04	05	06	07	08	09	10
Original	X	X	X	X	X	X	X	X	X	X
Clipped	X	X	X	X	X	X	X	X	X	X
C-OMP*	X	X	X	X	X	X	X	X	X	X
A-SPADE	X	X	X	X	X	X	X	X	X	X
S-SPADE	X	X	X	X	X	X	X	X	X	X
ℓ₁ CP	X	X	X	X	X	X	X	X	X	X
ℓ₁ DR	X	X	X	X	X	X	X	X	X	X
Rℓ₁CC CP	X	X	X	X	X	X	X	X	X	X
Rℓ₁CC DR	X	X	X	X	X	X	X	X	X	X
SS EW*	X	X	X	X	X	X	X	X	X	X
SS PEW*	X	X	X	X	X	X	X	X	X	X
CSL1*	X	X	X	X	X	X	X	X	X	X
PCSL1*	X	X	X	X	X	X	X	X	X	X
PWCSL1*	X	X	X	X	X	X	X	X	X	X
PWℓ₁ CP	X	X	X	X	X	X	X	X	X	X
PWℓ₁ DR	X	X	X	X	X	X	X	X	X	X
DL*	X	X	X	X	X	X	X	X	X	X
NMF	X	X	X	X	X	X	X	X	X	X
Janssen	X	X	X	X	X	X	X	X	X	X
ASS EW*	X	X	X	X	X	X	X	X	X	X
ASS PEW*	X	X	X	X	X	X	X	X	X	X

Individual results

The following figure supplements the Audio Declipping Survey with the results of the declipping algorithms for each audio excerpt individually. The respective audio excerpt and objective metric can be selected using the buttons below. In the plots that follow, algorithms coming from the same family share the same color. If a method was examined in both the analysis and the synthesis variant, the analysis variant is graphically distinguished via square markers. Other variants (e.g., multiple shrinkage operators in the SS algorithms or different weights within the CSL family) use diamond or triangle markers.

Select audio excerpt: 01 02 03 04 05 06 07 08 09 10
Select objective metric: ∆SDR_c PEAQ PEMO-Q Rnonlin

Average ∆SDR_c results, i.e., SDR improvement computed on the clipped samples only.

Overall results of the survey

In the following figures, the performance of the declipping algorithms is presented. The comparison is done in terms of four objective metrics — ∆SDR_c (SDR improvement computed on the clipped samples only), PEAQ, PEMO-Q, and Rnonlin. In the bar graphs that follow, algorithms coming from the same family share the same color. If a method was examined in both the analysis and the synthesis variant, the analysis variant is graphically distinguished via hatching. Other variants (e.g., multiple shrinkage operators in the SS algorithms or different weights within the CSL family) use gray stippling.
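
To make the ∆SDR_c metric concrete, here is a minimal sketch. It assumes the usual definition SDR(u, v) = 20·log10(‖u‖ / ‖u − v‖) and takes ∆SDR_c as the SDR of the restored signal minus the SDR of the clipped signal, both evaluated on the clipped samples only. The function names are hypothetical and the numbers are made up for illustration. The code does not handle a perfect restoration, where the error norm is zero.

```python
import math


def sdr(reference, estimate):
    """Signal-to-distortion ratio in dB: 20*log10(||ref|| / ||ref - est||)."""
    signal_norm = math.sqrt(sum(r * r for r in reference))
    error_norm = math.sqrt(sum((r - e) ** 2
                               for r, e in zip(reference, estimate)))
    return 20.0 * math.log10(signal_norm / error_norm)


def delta_sdr_c(original, clipped, restored, theta):
    """SDR improvement evaluated only where the observation is clipped."""
    idx = [i for i, s in enumerate(clipped) if abs(s) >= theta]

    def pick(signal):
        return [signal[i] for i in idx]

    return (sdr(pick(original), pick(restored))
            - sdr(pick(original), pick(clipped)))


# Toy example (made-up numbers): the restoration halves the error
# on every clipped sample, so the improvement is 20*log10(2), about 6.02 dB.
original = [0.1, 0.6, 0.9, 0.4, -0.7, -0.2]
clipped = [0.1, 0.5, 0.5, 0.4, -0.5, -0.2]
restored = [0.1, 0.55, 0.7, 0.4, -0.6, -0.2]
improvement = delta_sdr_c(original, clipped, restored, theta=0.5)
print(f"delta SDR_c = {improvement:.2f} dB")
```

Restricting the computation to the clipped samples keeps the metric from being dominated by the reliable samples, which consistent methods reproduce exactly.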

Select order of the bar graphs: Algorithm-sorted Value-sorted

Average ∆SDR_c results, i.e., SDR improvement computed on the clipped samples only.

Average PEAQ ODG results. The PEAQ ODG of the clipped signal is depicted in gray.

Average PEMO-Q ODG results.

Average Rnonlin results.

Results of the replacing approach

The figures below present average PEAQ ODG and PEMO-Q ODG values of the replacing approach proposed in the article Audio Declipping Performance Enhancement via Crossfading. The individual declipping algorithms are distinguished using different bar colors. Within a single bar, the lightest shade represents the quality of the originally declipped signal, i.e., inconsistent in the reliable part. The respective medium shade marks the results of the Replace Reliable (RR) strategy, and finally, the darkest shade corresponds to Crossfaded Replace (CR). In addition, the black dotted lines represent the average ODG value of the clipped signals, and each black dashed line indicates the best result obtained in the Audio Declipping Survey.
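
The two postprocessing strategies compared above can be sketched as follows. Replace Reliable (RR) overwrites the reliable samples of the restored signal with the observed ones. For Crossfaded Replace (CR), the sketch blends the observed and restored signals over a short transition inside each reliable segment next to a clipped one. The squared-sine weight, the transition length `fade_len`, and all function names are assumptions made for illustration only. The article and the MATLAB repository define the actual procedure.

```python
import math


def replace_reliable(restored, clipped, theta):
    """RR: take the observation on reliable samples, the restoration elsewhere."""
    return [c if abs(c) < theta else r for r, c in zip(restored, clipped)]


def crossfaded_replace(restored, clipped, theta, fade_len=4):
    """CR sketch: fade from the restoration to the observation near clipping."""
    n = len(clipped)
    far = n + fade_len  # larger than any distance that triggers a fade
    # Distance (in samples) from each sample to the nearest clipped sample.
    dist = [far if abs(c) < theta else 0 for c in clipped]
    for i in range(1, n):
        dist[i] = min(dist[i], dist[i - 1] + 1)
    for i in range(n - 2, -1, -1):
        dist[i] = min(dist[i], dist[i + 1] + 1)

    out = []
    for r, c, d in zip(restored, clipped, dist):
        if d == 0:
            w = 0.0                       # clipped sample: keep restoration
        elif d > fade_len:
            w = 1.0                       # far from clipping: keep observation
        else:                             # transition: assumed sin^2 weight
            w = math.sin(0.5 * math.pi * d / (fade_len + 1)) ** 2
        out.append(w * c + (1.0 - w) * r)
    return out


# Toy example (made-up numbers).
clipped = [0.1, 0.5, 0.5, 0.4, -0.5, -0.2]
restored = [0.12, 0.6, 0.9, 0.43, -0.7, -0.25]
rr = replace_reliable(restored, clipped, theta=0.5)
cr = crossfaded_replace(restored, clipped, theta=0.5, fade_len=1)
print(rr)
print(cr)
```

RR makes the output exactly consistent in the reliable part but can leave discontinuities at the segment borders. The crossfade in this sketch trades exact consistency inside the transition zones for smoother borders.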

Average PEAQ results.

Average PEMO-Q results.

Average Rnonlin results.
