In a new paper E3 TTS: Easy End-to-End Diffusion-based Text to Speech, a Google research team proposes Easy End-to-End Diffusion-based Text to Speech. This streamlined and efficient text-to-speech model hinges solely on diffusion to preserve temporal structure, allowing it to accept plain text as input and generate audio waveforms directly.
In a new paper Conditional Diffusion Distillation, a research team from Google Research and Johns Hopkins University introduces an innovative framework that distills an unconditional diffusion model into a conditional one, enabling image generation with significantly fewer steps.
Colossal-AI (https://github.com/hpcaitech/ColossalAI), the widely-used open-source library for training, inference and fine-tuning of large deep learning models, has released a new automatic parallelism feature and functionality that reduces hardware costs by up to 46 times for AI-Generate Content (AIGC) solutions.
Colossal-AI releases a complete open-source Stable Diffusion pretraining and fine-tuning solution that reduces the pretraining cost by 6.5 times, and the hardware cost of fine-tuning by 7 times, while simultaneously speeding up the processes! The fine-tuning task flow can also be conveniently completed on an RTX 2070/3050 PC.
In the new paper On Distillation of Guided Diffusion Models, researchers from Google Brain and Stanford University propose a novel approach for distilling classifier-free guided diffusion models with high sampling efficiency. The resulting models achieve performance comparable to the original model but with sampling steps reduced by up to 256 times.
In the new paper Hierarchical Text-Conditional Image Generation with CLIP Latents, an OpenAI research team combines the advantages of contrastive and diffusion models for text-conditional image generation tasks. Their proposed unCLIP model improves image diversity with minimal loss in photorealism and caption similarity, and produces image quality comparable to the state-of-the-art text-to-image system GLIDE.