Denoising diffusion probabilistic models (DDPMs) with classifier-free guidance, such as DALL·E 2, GLIDE, and Imagen, have achieved state-of-the-art results in high-resolution image generation. The downside of such models is that inference requires evaluating both a class-conditional model and an unconditional model hundreds of times, making them prohibitively expensive for many real-world applications.
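To make the cost concrete: at every sampling step, classifier-free guidance combines the conditional and unconditional noise predictions with a guidance weight. A minimal sketch of that combination (function and variable names are illustrative, not from the paper):

```python
def guided_eps(eps_cond, eps_uncond, w):
    """Classifier-free guidance: blend the conditional and unconditional
    noise predictions with guidance weight w.
    w = 0 recovers the purely conditional prediction; larger w pushes
    samples toward the condition at the cost of diversity."""
    return (1 + w) * eps_cond - w * eps_uncond
```

Because both `eps_cond` and `eps_uncond` come from full network evaluations, every one of the hundreds of sampling steps costs two forward passes.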
In the new paper On Distillation of Guided Diffusion Models, researchers from Google Brain and Stanford University propose a novel approach for distilling classifier-free guided diffusion models into highly sample-efficient ones. The resulting models achieve performance comparable to the original model while requiring up to 256 times fewer sampling steps.
The researchers’ distillation approach comprises two steps: Given a trained guided teacher model, a single student model first matches the combined output of the teacher’s two diffusion models, and this learned student model is then progressively distilled to a fewer-step model. The resulting single distilled model can handle a wide range of different guidance strengths and enable efficient tradeoffs between sample quality and diversity.
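The second stage can be pictured as a halving schedule: each distillation round trains a student that matches two steps of the current teacher with a single step, so the sampler length is cut in half per round. A hypothetical sketch of that schedule (names and the starting step count are illustrative assumptions, not the paper's exact values):

```python
def halving_schedule(n_start, rounds):
    """Stage-two progressive distillation, sketched: each round yields a
    student whose one step matches two teacher steps, halving the number
    of sampling steps. Eight rounds from 1024 steps would reach 4 steps,
    a 256x reduction consistent with the speedup reported in the paper."""
    steps = [n_start]
    for _ in range(rounds):
        steps.append(steps[-1] // 2)
    return steps
```

Because the stage-one student is conditioned on the guidance strength itself, a single distilled model covers a whole range of guidance weights rather than one fixed setting.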
The proposed sampling method combines a deterministic sampler with a novel stochastic sampling process. One deterministic sampling step is first applied with two times the original step length, and one stochastic step is then performed backward (i.e., perturbing with noise) using the original step length. This approach was inspired by Karras et al.'s paper Elucidating the Design Space of Diffusion-Based Generative Models, published earlier this year.
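One iteration of this alternating scheme can be sketched as follows. This is a rough illustration under simplifying assumptions: `denoise` stands in for the model's update direction, and the unit-variance noise scale is a placeholder, not the paper's exact parameterization.

```python
import math
import random

def det_then_stoch_step(x, t, dt, denoise):
    """One sampler iteration as described in the article: a deterministic
    update covering twice the step length, followed by a stochastic step
    backward over one step length that re-injects Gaussian noise."""
    # Deterministic step of size 2*dt in the model's update direction.
    x = x + 2 * dt * denoise(x, t)
    t = t - 2 * dt
    # Stochastic step of size dt backward: perturb with fresh noise.
    x = x + math.sqrt(dt) * random.gauss(0.0, 1.0)
    t = t + dt
    return x, t
```

The net effect is progress of one step length per iteration, with noise re-injected each time, which is the deterministic/stochastic alternation the authors adopt from Karras et al.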
In their empirical study, the team applied their method to classifier-free guided DDPMs and performed image generation experiments on the ImageNet 64×64 and CIFAR-10 datasets. The results show that the proposed approach can produce "visually decent" samples in as few as one step and obtain FID/IS (Fréchet Inception Distance / Inception Score) scores comparable to those of the original baseline models while being up to 256 times faster to sample from.
Overall, this work demonstrates the effectiveness of the proposed approach in addressing the high computational costs that have limited the deployment of denoising diffusion probabilistic models.
The paper On Distillation of Guided Diffusion Models is on arXiv.
Author: Hecate He | Editor: Michael Sarazen