Tag: diffusion models

AI Machine Learning & Data Science Research

Google’s E3 TTS Provides Effortless Approach to High-Quality Audio Synthesis Through Diffusion Models

In a new paper E3 TTS: Easy End-to-End Diffusion-based Text to Speech, a Google research team proposes Easy End-to-End Diffusion-based Text to Speech. This streamlined and efficient text-to-speech model hinges solely on diffusion to preserve temporal structure, allowing it to accept plain text as input and generate audio waveforms directly.

AI Machine Learning & Data Science Research

Almost 7X Cheaper! Colossal-AI’s Open Source Solution Accelerates AIGC at a Low-Cost Diffusion Pretraining and Hardware Fine-Tuning Can Be

Colossal-AI releases a complete open-source Stable Diffusion pretraining and fine-tuning solution that reduces the pretraining cost by 6.5 times, and the hardware cost of fine-tuning by 7 times, while simultaneously speeding up the processes! The fine-tuning task flow can also be conveniently completed on an RTX 2070/3050 PC.

AI Machine Learning & Data Science Research

Stanford U & Google Brain’s Classifier-Free Guidance Model Diffusion Technique Reduces Sampling Steps by 256x

In the new paper On Distillation of Guided Diffusion Models, researchers from Google Brain and Stanford University propose a novel approach for distilling classifier-free guided diffusion models with high sampling efficiency. The resulting models achieve performance comparable to the original model but with sampling steps reduced by up to 256 times.

AI Machine Learning & Data Science Research

OpenAI’s unCLIP Text-to-Image System Leverages Contrastive and Diffusion Models to Achieve SOTA Performance

In the new paper Hierarchical Text-Conditional Image Generation with CLIP Latents, an OpenAI research team combines the advantages of contrastive and diffusion models for text-conditional image generation tasks. Their proposed unCLIP model improves image diversity with minimal loss in photorealism and caption similarity, and produces image quality comparable to the state-of-the-art text-to-image system GLIDE.