Mean-Shift Distillation for Diffusion Mode Seeking

Vikas Thamizharasan1,2 Nikitas Chatzis3 Iliyan Georgiev2 Matthew Fisher2
Difan Liu2 Nanxuan Zhao2 Evangelos Kalogerakis1,4 Michal Lukáč2
1 University of Massachusetts Amherst 2 Adobe Research 3 NTUA 4 TU Crete

Paper

Code

Illustrating the shortcomings of SDS and how we fix them.

Score distillation sampling (SDS) (Poole et al., 2022; Wang et al., 2022) has emerged as a useful technique for leveraging the priors learned by large-scale image models beyond 2D raster images. SDS provides an optimization procedure to estimate the parameters of a differentiable image generator, such that the rendered image is pushed towards a higher-probability region of a pre-trained prompt-conditioned image diffusion model.
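
For reference, the SDS gradient as it is commonly written following Poole et al. (2022) is sketched below. The symbols \(g\) (the differentiable generator), \(\phi\) (its parameters), \(x = g(\phi)\) (the rendered image), \(\alpha_t, \sigma_t\) (the noise schedule), and \(w(t)\) (a timestep weighting) are notation assumed here for illustration; they do not appear elsewhere on this page.

\[
\nabla_{\phi} \mathcal{L}_{\mathrm{SDS}}
= \mathbb{E}_{t,\epsilon}\!\left[\, w(t)\,\bigl(\epsilon_{\theta}(z_t; c, t) - \epsilon\bigr)\,\frac{\partial x}{\partial \phi} \right],
\qquad z_t = \alpha_t x + \sigma_t \epsilon,
\quad \epsilon \sim \mathcal{N}(0, I).
\]

In practice the expectation is estimated with one noise level \(t\) and one noise sample \(\epsilon\) per optimization step.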

SDS suffers from both significant bias and high variance, which yield inaccurate gradients. This manifests as over-smoothed results when optimizing with text-to-image diffusion models.

To illustrate the pitfalls of SDS, we simulate it in 2D using a small denoising diffusion network.
[Watch blog as video - 3 mins (MP4)]

GIF

We begin by training a simple score model: \(\epsilon_{\theta} \approx -\sigma_t \nabla_{z_t} \log p(z_t|c) \). Then, at inference time, we draw samples via DDIM (Song et al., 2021a), a popular first-order sampling algorithm, both without and with classifier-free guidance (CFG; Ho & Salimans, 2021).
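
A minimal sketch of this sampling loop, under loudly stated assumptions: instead of a trained network, it uses the exact noise prediction of a 2D Gaussian mixture; the cosine schedule, the guidance convention \(\epsilon_{u} + w(\epsilon_{c} - \epsilon_{u})\), and the names `gmm_eps` and `ddim_sample` are all hypothetical choices for illustration, not the code behind the animations.

```python
import numpy as np

def gmm_eps(z, alpha, sigma, means, std):
    """Exact noise prediction -sigma * grad log p_t(z) for an equal-weight
    isotropic Gaussian mixture (assumed stand-in for a trained score model)."""
    var = (alpha * std) ** 2 + sigma ** 2               # variance of each noised component
    diff = alpha * means[None, :, :] - z[:, None, :]    # (N, K, 2) offsets to noised means
    logits = -0.5 * (diff ** 2).sum(-1) / var
    logits -= logits.max(axis=1, keepdims=True)         # numerically stable softmax
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)             # component responsibilities
    score = (resp[:, :, None] * diff).sum(axis=1) / var
    return -sigma * score

def ddim_sample(rng, means, cond, n=256, steps=50, guidance=1.0, std=0.1):
    """Deterministic DDIM sampling; guidance=1.0 is plain conditional sampling."""
    ts = np.linspace(0.99, 0.0, steps + 1)              # assumed time grid, high noise to none
    z = rng.standard_normal((n, 2))
    for t, t_next in zip(ts[:-1], ts[1:]):
        a_t, s_t = np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)
        a_n, s_n = np.cos(0.5 * np.pi * t_next), np.sin(0.5 * np.pi * t_next)
        e_unc = gmm_eps(z, a_t, s_t, means, std)                   # all modes ("no prompt")
        e_cond = gmm_eps(z, a_t, s_t, means[cond:cond + 1], std)   # one mode ("prompt")
        e = e_unc + guidance * (e_cond - e_unc)                    # classifier-free guidance
        x0 = (z - s_t * e) / a_t                        # predicted clean sample
        z = a_n * x0 + s_n * e                          # DDIM update to the next noise level
    return z

rng = np.random.default_rng(0)
means = np.array([[-1.0, 0.0], [1.0, 0.0]])
plain = ddim_sample(rng, means, cond=1, guidance=1.0)
guided = ddim_sample(rng, means, cond=1, guidance=3.0)
```

Here the "condition" simply selects one mixture component, which is a deliberately crude proxy for a text prompt.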

GIF GIF

What does this look like with SDS?

GIF

Let's simulate this with many points densely initialized on a grid across the canvas. After several optimization steps, we observe that the samples optimized with SDS fail to fit the distribution.

GIF
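
A rough sketch of such a grid experiment, assuming an identity "generator" (each optimized parameter is the 2D point itself), unit weighting \(w(t) = 1\), a cosine noise schedule, and an exact Gaussian-mixture noise predictor in place of the trained network. The names `gmm_eps` and `sds_step`, the step size, and the mixture itself are hypothetical.

```python
import numpy as np

def gmm_eps(z, alpha, sigma, means, std):
    """Exact noise prediction -sigma * grad log p_t(z) for an equal-weight
    isotropic Gaussian mixture (assumed stand-in for a trained score model)."""
    var = (alpha * std) ** 2 + sigma ** 2               # variance of each noised component
    diff = alpha * means[None, :, :] - z[:, None, :]    # (N, K, 2) offsets to noised means
    logits = -0.5 * (diff ** 2).sum(-1) / var
    logits -= logits.max(axis=1, keepdims=True)         # numerically stable softmax
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)             # component responsibilities
    score = (resp[:, :, None] * diff).sum(axis=1) / var
    return -sigma * score

def sds_step(x, rng, means, std, lr=0.05):
    """One SDS update with a single random noise level and noise sample."""
    t = rng.uniform(0.02, 0.98)                         # assumed range of noise levels
    alpha, sigma = np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)
    eps = rng.standard_normal(x.shape)
    z_t = alpha * x + sigma * eps                       # noise the current points
    grad = gmm_eps(z_t, alpha, sigma, means, std) - eps # SDS gradient with w(t) = 1
    return x - lr * grad

rng = np.random.default_rng(0)
means = np.array([[-1.0, 0.0], [1.0, 0.0]])
grid = np.linspace(-2.0, 2.0, 9)
x = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, 2)  # 81 points on a grid
for _ in range(500):
    x = sds_step(x, rng, means, std=0.1)
```

Each step uses a single noise sample, so the per-step gradient is itself a noisy estimate; plotting `x` against samples from the mixture is one way to compare the optimized points with the target distribution.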

GIF

Our fix.

We propose mean-shift distillation, a distribution-gradient proxy based on mean shift, a well-known mode-seeking technique.

GIF
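
For readers unfamiliar with the underlying technique, here is a minimal sketch of classical mean shift with a Gaussian kernel on a fixed set of samples. This is the textbook algorithm only, not the distillation method proposed here; the function name `mean_shift`, the bandwidth, and the toy data are assumptions made for illustration.

```python
import numpy as np

def mean_shift(queries, data, bandwidth=0.3, iters=50):
    """Classical Gaussian-kernel mean shift: repeatedly move each query point
    to the kernel-weighted mean of the data, which climbs the kernel density."""
    x = queries.copy()
    for _ in range(iters):
        d2 = ((x[:, None, :] - data[None, :, :]) ** 2).sum(-1)  # (Q, N) squared distances
        logw = -0.5 * d2 / bandwidth ** 2
        logw -= logw.max(axis=1, keepdims=True)                 # avoid underflow far from data
        w = np.exp(logw)
        w /= w.sum(axis=1, keepdims=True)                       # normalized kernel weights
        x = w @ data                                            # shift to the weighted mean
    return x

rng = np.random.default_rng(0)
modes = np.array([[-1.0, 0.0], [1.0, 0.0]])
data = np.concatenate([m + 0.1 * rng.standard_normal((200, 2)) for m in modes])
grid = np.linspace(-2.0, 2.0, 8)
queries = np.stack(np.meshgrid(grid, grid), axis=-1).reshape(-1, 2)
converged = mean_shift(queries, data)
```

The shift vector, the weighted mean minus the current point, points along the gradient of the kernel density estimate, which is why the iteration seeks modes. The bandwidth controls how much the density is smoothed before its modes are sought.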

GIF

Putting it all together...

GIF

Results with Stable Diffusion

Coming soon...

Bibtex

 @misc{thamizharasan2025meanshiftdistillationdiffusionmode,
title={Mean-Shift Distillation for Diffusion Mode Seeking},
author={Vikas Thamizharasan and Nikitas Chatzis and Iliyan Georgiev and Matthew Fisher and Difan Liu and Nanxuan Zhao and Evangelos Kalogerakis and Michal Lukac},
year={2025},
eprint={2502.15989},
archivePrefix={arXiv},
primaryClass={cs.LG},
url={https://arxiv.org/abs/2502.15989},
}

The code of this website is heavily based on the template from visual.cs.brown.edu.