Mean-Shift Distillation for Diffusion Mode Seeking

Vikas Thamizharasan¹,², Nikitas Chatzis³, Iliyan Georgiev², Matthew Fisher²,
Difan Liu², Nanxuan Zhao², Evangelos Kalogerakis¹,⁴, Michal Lukáč²

¹ University of Massachusetts Amherst

² Adobe Research

³ NTUA

⁴ TU Crete


Illustrating the shortcomings of SDS and how we fix them.

Score distillation sampling (SDS) (Poole et al., 2022; Wang et al., 2022) has emerged as a useful technique for leveraging the priors learned by large-scale image models beyond 2D raster images. SDS provides an optimization procedure to estimate the parameters of a differentiable image generator, such that the rendered image is pushed towards a higher-probability region of a pre-trained prompt-conditioned image diffusion model.
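For reference, the SDS update is usually written as follows. This is a sketch in the notation of Poole et al. (2022); the generator \(g\), its parameters \(\phi\), the weighting \(w(t)\), and the signal scale \(\alpha_t\) are symbols assumed here for illustration and are not defined elsewhere on this page.

\[
\nabla_{\phi} \mathcal{L}_{\mathrm{SDS}} = \mathbb{E}_{t,\epsilon}\left[ w(t)\,\big(\epsilon_{\theta}(z_t; c, t) - \epsilon\big)\,\frac{\partial x}{\partial \phi} \right], \qquad x = g(\phi), \quad z_t = \alpha_t x + \sigma_t \epsilon, \quad \epsilon \sim \mathcal{N}(0, I).
\]

Each step noises the rendered image \(x\) to a random level \(t\), asks the diffusion model to predict the noise, and backpropagates the residual between predicted and injected noise through the generator only.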

However, the gradient estimates produced by SDS suffer from significant bias and variance, which makes them inaccurate. When optimizing with text-to-image diffusion models, this manifests as over-smoothed results.

To illustrate the pitfalls of SDS, we simulate it in 2D using a small denoising diffusion network.
[Watch blog as video - 3 mins (MP4)]

We begin by training a simple score model: \(\epsilon_{\theta} \approx -\sigma_t \nabla_{z_t} \log p(z_t|c) \). Then, at inference time, we draw samples via DDIM (Song et al., 2021a), a popular first-order sampling algorithm, both without and with classifier-free guidance (CFG; Ho & Salimans, 2021).
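The sampling loop described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code behind this page: instead of a trained network it uses the exact noise prediction of a hypothetical two-mode Gaussian mixture (`MEANS`, `STD`), and the cosine schedule, the step count, and all function names are choices made for the example.

```python
import numpy as np

# Hypothetical toy data: one Gaussian mode per condition c (not the page's dataset).
MEANS = np.array([[2.0, 0.0], [-2.0, 0.0]])
STD = 0.2


def schedule(t):
    """Assumed variance-preserving cosine schedule: alpha_t^2 + sigma_t^2 = 1."""
    return np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)


def eps_model(z, t, c=None):
    """Exact eps = -sigma_t * grad log p(z_t | c) for the toy mixture.

    Stands in for a trained network. c=None is the unconditional model
    (both modes); c=k keeps only mode k.
    """
    alpha, sigma = schedule(t)
    means = MEANS if c is None else MEANS[c:c + 1]
    var = (alpha * STD) ** 2 + sigma ** 2              # variance of each noised mode
    diff = alpha * means[None, :, :] - z[:, None, :]   # shape (N, K, 2)
    logits = -0.5 * np.sum(diff ** 2, axis=-1) / var
    logits -= logits.max(axis=1, keepdims=True)        # stabilize the softmax
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)            # per-mode responsibilities
    score = np.sum(resp[:, :, None] * diff, axis=1) / var
    return -sigma * score


def ddim_sample(n, c, guidance=1.0, steps=50, seed=0):
    """Deterministic DDIM sampling; guidance=1.0 reduces CFG to the conditional model."""
    rng = np.random.default_rng(seed)
    ts = np.linspace(0.999, 0.0, steps + 1)            # start just below t=1 so alpha > 0
    z = rng.standard_normal((n, 2))
    for t, t_next in zip(ts[:-1], ts[1:]):
        alpha, sigma = schedule(t)
        alpha_next, sigma_next = schedule(t_next)
        eps_uncond = eps_model(z, t)
        eps_cond = eps_model(z, t, c)
        eps = eps_uncond + guidance * (eps_cond - eps_uncond)  # classifier-free guidance
        x0 = (z - sigma * eps) / alpha                 # predicted clean sample
        z = alpha_next * x0 + sigma_next * eps         # first-order DDIM update
    return z
```

Calling `ddim_sample(2000, c=0)` draws from the first mode without guidance, while `ddim_sample(2000, c=0, guidance=3.0)` applies CFG with an example scale of 3.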

What does this look like with SDS?

Let's simulate this with many points initialized densely on a grid across the canvas. After several optimization steps, we observe that the samples optimized with SDS fail to fit the distribution. [GIF]
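A 2D experiment of this kind can be approximated with the sketch below. It rests on several assumptions that are not taken from this page: the generator is the identity (each point is its own parameter), the weighting is \(w(t) = 1\), no guidance is applied, and the exact noise prediction of a hypothetical two-mode mixture replaces the trained network. The names `MEANS`, `STD`, `make_grid`, and `sds_optimize` are illustrative only.

```python
import numpy as np

MEANS = np.array([[2.0, 0.0], [-2.0, 0.0]])  # hypothetical target modes
STD = 0.2                                     # hypothetical per-mode standard deviation


def eps_model(z, alpha, sigma):
    """Exact eps = -sigma_t * score of the noised mixture (stand-in for a network).

    alpha and sigma have shape (N, 1): one noise level per point.
    """
    var = (alpha * STD) ** 2 + sigma ** 2
    diff = alpha[:, :, None] * MEANS[None, :, :] - z[:, None, :]   # (N, K, 2)
    logits = -0.5 * np.sum(diff ** 2, axis=-1) / var
    logits -= logits.max(axis=1, keepdims=True)
    resp = np.exp(logits)
    resp /= resp.sum(axis=1, keepdims=True)
    score = np.sum(resp[:, :, None] * diff, axis=1) / var
    return -sigma * score


def make_grid(n=8, extent=3.0):
    """Points on a regular n-by-n grid covering [-extent, extent]^2."""
    axis = np.linspace(-extent, extent, n)
    return np.stack(np.meshgrid(axis, axis), axis=-1).reshape(-1, 2)


def nearest_mode_distance(x):
    """Distance from each point to its closest mixture mean."""
    return np.linalg.norm(x[:, None, :] - MEANS[None, :, :], axis=-1).min(axis=1)


def sds_optimize(points, iters=300, lr=0.05, seed=0):
    """Plain gradient descent on the points with single-sample SDS gradients."""
    rng = np.random.default_rng(seed)
    x = points.copy()
    for _ in range(iters):
        t = rng.uniform(0.02, 0.98, size=(len(x), 1))     # random noise level per point
        alpha, sigma = np.cos(0.5 * np.pi * t), np.sin(0.5 * np.pi * t)
        eps = rng.standard_normal(x.shape)
        z = alpha * x + sigma * eps                       # noised copy of each point
        grad = eps_model(z, alpha, sigma) - eps           # SDS residual, w(t) = 1
        x -= lr * grad                                    # identity generator: dx/dphi = I
    return x
```

Running `sds_optimize(make_grid())` and comparing the spread of the returned points around each mode with `STD` is one way to check, in this toy setting, how well the optimized points match the target distribution.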

Our fix.

We propose mean-shift distillation, a proxy for the gradient of the distribution that is based on mean shift, a well-known mode-seeking technique. [GIF]
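For readers unfamiliar with the underlying technique, the sketch below shows classical mean shift with a Gaussian kernel on a fixed set of samples. It illustrates the mode-seeking idea only and is not the distillation procedure proposed here, which this page does not spell out. The toy data, the bandwidth, and the function names are assumptions made for the example.

```python
import numpy as np


def sample_two_modes(n_per_mode=500, std=0.2, seed=0):
    """Hypothetical toy data: two isotropic Gaussian blobs in 2D."""
    rng = np.random.default_rng(seed)
    means = np.array([[2.0, 0.0], [-2.0, 0.0]])
    data = np.concatenate(
        [m + std * rng.standard_normal((n_per_mode, 2)) for m in means]
    )
    return data, means


def mean_shift(queries, data, bandwidth=0.5, iters=30):
    """Move each query to the kernel-weighted mean of the data, repeatedly.

    Each iteration is a step along the gradient of a Gaussian kernel density
    estimate, so queries climb toward nearby modes of the sample density.
    """
    x = queries.copy()
    for _ in range(iters):
        sq_dist = np.sum((x[:, None, :] - data[None, :, :]) ** 2, axis=-1)  # (Q, N)
        logits = -0.5 * sq_dist / bandwidth ** 2
        logits -= logits.max(axis=1, keepdims=True)    # stabilize the exponentials
        weights = np.exp(logits)
        weights /= weights.sum(axis=1, keepdims=True)  # normalized kernel weights
        x = weights @ data                             # weighted mean of the samples
    return x
```

A typical use is to start the queries on a grid, for example `mean_shift(grid, data)` with `data, _ = sample_two_modes()`, and inspect where they settle.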

Putting it all together...

Results with Stable Diffusion

Coming soon...

Bibtex

 @misc{thamizharasan2025meanshiftdistillationdiffusionmode,
 title={Mean-Shift Distillation for Diffusion Mode Seeking},
 author={Vikas Thamizharasan and Nikitas Chatzis and Iliyan Georgiev and Matthew Fisher and Difan Liu and Nanxuan Zhao and Evangelos Kalogerakis and Michal Lukac},
 year={2025},
 eprint={2502.15989},
 archivePrefix={arXiv},
 primaryClass={cs.LG},
 url={https://arxiv.org/abs/2502.15989},
}
