Text-to-image diffusion models have emerged as powerful generative tools, consistently delivering high-quality, diverse images. However, these models rely on an iterative refinement process that requires many sampling steps, posing challenges for efficient use.
In response to this challenge, in the new paper Conditional Diffusion Distillation, a research team from Google Research and Johns Hopkins University introduces a framework that distills an unconditional diffusion model into a conditional one, enabling image generation in significantly fewer steps. Compared with previous two-stage distillation and fine-tuning techniques, the method produces higher-quality images at the same number of sampling steps.


The proposed distilled model takes cues from given image conditions to predict high-quality results in just 1 to 4 sampling steps. This streamlined approach eliminates the need for the original text-to-image data, a prerequisite in previous distillation procedures, making the method more practical. Furthermore, the formulation introduced in this research avoids compromising the diffusion prior, a common issue in the initial stage of the fine-tuning-first procedure.

The central idea behind this work is to optimize an adapted conditional diffusion model from a pre-trained unconditional diffusion model. This optimization aims to ensure two key properties: self-consistency and the ability to generate samples from conditional data. The adapted diffusion model is then fine-tuned on new conditional data using a conditional diffusion distillation loss, which applies a distance function to penalize the difference between the predicted signal and the corresponding image.
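To make the loss concrete, here is a minimal sketch of such a distillation objective. The distance function, the shapes, and the name `distillation_loss` are illustrative assumptions (the paper's exact choice of distance may differ); mean-squared error stands in as a simple example:

```python
import numpy as np

def distillation_loss(predicted_signal: np.ndarray,
                      target_image: np.ndarray) -> float:
    """Hypothetical distance d(predicted, target) for conditional
    diffusion distillation. MSE is used here purely as an example;
    the paper's actual distance function may differ."""
    return float(np.mean((predicted_signal - target_image) ** 2))

# Toy usage: a perfect prediction yields zero loss.
target = np.ones((2, 2))
perfect = np.ones((2, 2))
loss = distillation_loss(perfect, target)  # 0.0
```

During fine-tuning, this loss would be minimized over pairs of model predictions and their corresponding conditioning images.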

Additionally, this method allows selective updates to the parameters involved in distillation and conditional fine-tuning while keeping all other parameters frozen. This introduces a novel form of parameter-efficient conditional distillation that aligns with commonly used parameter-efficient fine-tuning methods for diffusion models.
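The freezing scheme can be sketched as follows. The parameter names, the `trainable` set, and the plain SGD step are illustrative assumptions, not the paper's implementation; the point is simply that gradient updates touch only the small set of adapter parameters:

```python
import numpy as np

# Hypothetical parameter store: a frozen pre-trained backbone
# plus a small set of trainable conditional-adapter weights.
params = {
    "backbone.w": np.ones(4),   # frozen pre-trained weights
    "adapter.w": np.zeros(4),   # new adapter weights (trainable)
}
trainable = {"adapter.w"}

def sgd_step(params, grads, lr=0.1):
    """Apply a gradient step only to trainable parameters;
    frozen parameters are left untouched."""
    for name, g in grads.items():
        if name in trainable:
            params[name] = params[name] - lr * g
    return params

# Even with nonzero gradients everywhere, only the adapter moves.
grads = {"backbone.w": np.ones(4), "adapter.w": np.ones(4)}
params = sgd_step(params, grads)
```

This is why only a small number of additional parameters need to be stored per conditional task: the shared backbone is reused unchanged.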


The researchers validate the effectiveness of their approach across various conditional generation tasks, including real-world super-resolution, depth-to-image generation, and instructed image editing. Empirical results showcase that their method outperforms existing distillation techniques within the same sampling time. Importantly, this method represents the first distillation strategy capable of matching the performance of much slower fine-tuned conditional diffusion models.
In summary, this work demonstrates that only a small number of additional parameters are required for each distinct conditional generation task. The team envisions that their method can serve as a practical and potent approach to accelerate large-scale conditional diffusion models, marking a significant advancement in the field of generative models.
The paper Conditional Diffusion Distillation is available on arXiv.
Author: Hecate He | Editor: Chain Zhang

