Image Denoising Benchmark

Initial release: March 25, 2010

Last updated: March 25, 2010

This benchmark is the result of thorough feedback received from multiple reviewers while developing the stochastic denoising algorithm. It supersedes the results presented in our BMVC paper. The benchmark differs from that presented in BMVC in three ways:

* Wider and denser sampling of algorithm parameters
* Separate training and testing sets at each noise level
* Tested on a wider range of input noise values

Below, you will find the scripts and data sets needed to replicate the results presented here. These will also allow you to benchmark and compare other denoising methods.


For testing, we use images from the Berkeley Segmentation Database (BSD). Please note that we do not distribute the original BSD images; you will need to download them directly from the BSD web site. For each of the BSD images, we have generated a series of noisy versions with different amounts of Gaussian noise. The noise standard deviations used in this benchmark are 5, 10, 15, 25, and 35 gray levels.
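As a rough illustration of how such noisy versions can be produced, the sketch below adds zero-mean Gaussian noise to a grayscale image at each of the five standard deviations. It is not the benchmark's own script: the function name `add_gaussian_noise`, the list-of-rows image layout, the fixed seeds, and the rounding and clipping to [0, 255] are all assumptions made for this example.

```python
import random

# Noise standard deviations in gray levels, as listed in the text.
NOISE_SIGMAS = [5, 10, 15, 25, 35]


def add_gaussian_noise(image, sigma, seed=0):
    """Return a noisy copy of `image` (a list of rows of 0-255 gray values).

    Assumption: noisy values are rounded and clipped to [0, 255]. How the
    benchmark handles out-of-range values is not stated in the text.
    """
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    noisy = []
    for row in image:
        noisy.append(
            [min(255, max(0, round(p + rng.gauss(0.0, sigma)))) for p in row]
        )
    return noisy


# Hypothetical flat mid-gray test image standing in for a BSD image.
clean = [[128] * 64 for _ in range(64)]
noisy_versions = {s: add_gaussian_noise(clean, s, seed=s) for s in NOISE_SIGMAS}
```

Fixing the seed per noise level keeps the noisy inputs identical across runs, which matters when several denoising methods are compared on the same data.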

We evaluated the following denoising methods:

* Stochastic denoising (SD)
* Block matching (BM)
* Non-local means (NL)
* Bilateral filter (BF)
* Total variation (TV)


We use 20 BSD images (uniformly sampled from the entire collection) for training. Training consists of determining the parameters that yield the highest PSNR or SSIM values over the training set.
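The split into training and testing images can be sketched as follows. This is only one reading of "uniformly sampled": it takes evenly spaced images from an ordered collection, whereas the benchmark may instead have drawn them uniformly at random. The numeric image ids and the function name `split_train_test` are hypothetical, and the collection size of 300 is inferred from 20 training plus 280 testing images.

```python
def split_train_test(image_ids, n_train=20):
    """Pick `n_train` evenly spaced ids for training; the rest are for testing."""
    step = len(image_ids) / n_train
    train = [image_ids[int(i * step)] for i in range(n_train)]
    train_set = set(train)
    test = [i for i in image_ids if i not in train_set]
    return train, test


# Hypothetical numeric ids for a 300-image collection.
image_ids = list(range(1, 301))
train_ids, test_ids = split_train_test(image_ids)
```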

Parameters tested for each of the algorithms are as follows (references for each method are at the bottom of this page):

* SD : noise sigma, and random walk stopping threshold
* BM : noise sigma
* NL : T parameter and local window size
* BF : Spatial sigma and brightness sigma
* TV : Lambda parameter for the total variation functional

Training curves for each of the algorithms are shown below. We show training curves for optimal PSNR and SSIM separately. The curve for each algorithm and noise level shows the median value of the corresponding measure on the training image set as a function of the algorithm parameters. These curves characterize the performance of each method over the range of input parameters tested.
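The parameter selection described above can be sketched as a small search over tested settings. The example assumes that the optimal setting is the one with the highest median score on the training set, which matches how the training curves are plotted but is not confirmed as the exact selection rule. The function name `select_best_parameter` and the scores are made up for illustration and are not benchmark data.

```python
from statistics import median


def select_best_parameter(scores_by_param):
    """Return the parameter setting with the highest median score.

    `scores_by_param` maps each tested parameter setting to the list of
    per-image scores (PSNR or SSIM) it obtained on the training set.
    """
    # One point of the training curve per parameter setting.
    curve = {p: median(s) for p, s in scores_by_param.items()}
    best = max(curve, key=curve.get)
    return best, curve


# Made-up PSNR values for three settings of a single hypothetical parameter.
scores = {
    0.5: [28.1, 29.0, 27.5],
    1.0: [30.2, 31.1, 29.8],
    2.0: [29.0, 30.5, 28.7],
}
best_param, curve = select_best_parameter(scores)
```

For methods with two parameters, the dictionary keys can be tuples such as `(spatial_sigma, brightness_sigma)`, and the same function applies unchanged.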

Parameter tuning results for optimal PSNR

[Figure: training curves for PSNR, one plot per method (SD, BM, NL, BF, TV) at each noise level (sig = 5, 10, 15, 25, 35).]


Parameter tuning results for optimal SSIM

[Figure: training curves for SSIM, one plot per method (SD, BM, NL, BF, TV) at each noise level (sig = 5, 10, 15, 25, 35).]

Performance on the Training Set

The curves below show algorithm performance on the training set for each of the methods as a function of noise level. Algorithm parameters were set to the optimal value taken from the performance curves above. BM shows the best performance across all levels, followed by SD and NL. While SD shows better performance at noise levels of up to 15 gray levels, NL provides better PSNR and SSIM at higher noise levels. BF and TV serve as a baseline and represent earlier denoising algorithms.

[Figure: results on the training set, PSNR and SSIM as a function of noise level.]


Testing Results

The remaining 280 images from the Berkeley Segmentation Database were used to test the performance of the denoising algorithms. For each method and each noise level, we use the optimal parameters obtained from the training curves shown above. The images below show testing results for each method as a function of noise level.

[Figure: results on the testing set, PSNR and SSIM as a function of noise level.]

The results show that BM obtains the best performance across all noise levels. Once more, SD and NL follow, with SD achieving better performance at lower noise levels and NL providing better results at higher noise. Clearly, the use of image patches by both BM and NL is beneficial at higher noise levels. The pixel-based SD method is not able to preserve texture and structure as well as BM or NL at higher noise levels.

Sample Denoising Results

The images below show typical denoising results for each method at noise levels 10 and 35. These provide a visual indication of the relative quality of the denoised images produced by each of the algorithms.

Noise at 10 gray levels

[Figure: image panels labeled Input, Ref, SD, BM, NL, BF, and TV.]

Noise at 35 gray levels

[Figure: image panels labeled Input, Ref, SD, BM, NL, BF, and TV.]


Scripts, code, and data

The compressed archive below contains all the scripts and data needed to replicate this benchmark. The scripts should allow for easily testing other denoising methods. The data files produced by these scripts for each of the algorithms tested are attached. In this manner, any algorithm or subset of algorithms can be tested without the need for running the entire benchmark on all methods to produce the required data for plotting training curves and performance results.

Please note that we do not distribute the original BSD images; these must be downloaded from the Berkeley Database and Benchmark. A script is provided to convert these files to .tif with a corresponding numeric filename for use with all the scripts from the benchmark. Additionally, in order to regenerate all the results shown above from scratch, each of the denoising algorithms must be installed and added to the Matlab path. Finally, the SSIM measure and a suitable function to compute PSNR must be available within the Matlab search path.
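For readers who need a PSNR function, the sketch below shows the standard definition, PSNR = 10 log10(peak^2 / MSE), for 8-bit grayscale images. It is written in Python for illustration only; the benchmark scripts themselves expect a Matlab function, and the name `psnr`, the list-of-rows image layout, and the default peak of 255 are assumptions of this example.

```python
import math


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Images are lists of rows of gray values. Returns infinity when the
    images are identical (zero mean squared error).
    """
    total = 0.0
    count = 0
    for ref_row, est_row in zip(reference, estimate):
        for r, e in zip(ref_row, est_row):
            total += (r - e) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 2x2 images differing by one gray level everywhere.
ref = [[100, 100], [100, 100]]
est = [[101, 101], [101, 101]]
value = psnr(ref, est)
```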

The scripts below are, needless to say, freely provided for research purposes only.

Benchmark scripts, images, and data (.tgz, 748 Mb)

Stochastic Denoising executables (multi-platform)


BM distribution

NL distribution

BF distribution

TV distribution

SSIM distribution



- A. Buades, B. Coll, and J. Morel. A non-local algorithm for image denoising. In CVPR, 2005.

- A. Chambolle. An algorithm for total variation minimization and applications. Journal of Mathematical Imaging and Vision, 2004.

- K. Dabov, A. Foi, V. Katkovnik, and K. Egiazarian. Image denoising by sparse 3D transform-domain collaborative filtering. IEEE Transactions on Image Processing, 2007.

- F. Estrada, D. Fleet, and A. Jepson. Stochastic image denoising. In BMVC, 2009.

- C. Tomasi and R. Manduchi. Bilateral filtering for gray and color images. In ICCV, 1998.


Comments and feedback

Please feel free to send back comments and feedback to the author. Also, I will be happy to add other denoising algorithms to the benchmark. To allow for full reproducibility, please send me the data files produced by the benchmark scripts on additional denoising methods, as well as links to the distributions used for testing. I will then update the benchmark shown here. I will try to evaluate other methods and update this page as time allows.