Futureverse’s Universal High-Quality Text-to-Music Generator JEN-1 Makes Significant Advancements

Music, a universal artistic expression of humankind, carries deep cultural significance and appeals to people across civilizations. Deep generative models for music generation have made significant progress. However, generating high-fidelity, realistic music conditioned on free-form textual descriptions, a task known as text-to-music, remains challenging.

In a new paper, JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models, a Futureverse research team presents JEN-1, a universal framework that combines bidirectional and unidirectional modes to generate high-quality music conditioned on either text or music representations. The model achieves new state-of-the-art results in text-music alignment and music quality while maintaining computational efficiency.

The team summarizes their key contributions as follows:

  1. We propose JEN-1 as a solution to the challenging text-to-music generation task. JEN-1 employs in-context learning and is trained with multi-task objectives, enabling music generation, music continuation, and music inpainting within a single model.
  2. JEN-1 utilizes an extremely efficient approach by directly modeling waveforms, avoiding the conversion loss associated with spectrograms.
  3. Our JEN-1 model integrates both an autoregressive diffusion mode and a non-autoregressive mode to improve sequential dependency and enhance sequence generation concurrently.
  4. Our paper presents a significant advancement in the field of text-to-music generation, offering a powerful, efficient, and controllable framework for generating high-quality music aligned with textual prompts and melodic structures.
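
The three tasks named in the first contribution (generation, continuation, inpainting) can all be described by which parts of a clip are given as context and which parts the model must produce. The following is a minimal sketch of that idea, not the paper's implementation: the function names `task_mask` and `masked_context`, the 1 = context / 0 = generate convention, and the use of plain Python lists in place of audio latents are all assumptions made for illustration.

```python
def task_mask(task, length, keep_prefix=0, hole=(0, 0)):
    """Return a 0/1 list: 1 = frame given as context, 0 = frame to generate.

    Hypothetical helper; the mask convention is an assumption.
    """
    if task == "generation":
        # Nothing is given: every frame is generated from the text prompt.
        return [0] * length
    if task == "continuation":
        # The first `keep_prefix` frames are given; the rest are generated.
        if not 0 <= keep_prefix <= length:
            raise ValueError("keep_prefix must lie within the clip")
        return [1] * keep_prefix + [0] * (length - keep_prefix)
    if task == "inpainting":
        # Frames in [start, end) are missing and must be filled in.
        start, end = hole
        if not 0 <= start <= end <= length:
            raise ValueError("hole must lie within the clip")
        return [1] * start + [0] * (end - start) + [1] * (length - end)
    raise ValueError(f"unknown task: {task}")


def masked_context(latent, mask):
    """Zero out the frames the model is not allowed to see."""
    return [x * m for x, m in zip(latent, mask)]


if __name__ == "__main__":
    for task, kwargs in [
        ("generation", {}),
        ("continuation", {"keep_prefix": 2}),
        ("inpainting", {"hole": (2, 4)}),
    ]:
        print(task, task_mask(task, 6, **kwargs))
```

Under this framing a single model can serve all three tasks, because only the mask and the masked context passed alongside the noisy input change from task to task.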

JEN-1 combines bidirectional and unidirectional modes to offer a unified approach for universal text-to-music generation. Unlike previous generation models that rely on discrete tokens or involve multiple serial stages, JEN-1 uses a novel framework to enable continuous, high-fidelity music generation using a single model.

Moreover, JEN-1 utilizes both autoregressive training, to improve sequential dependency, and non-autoregressive training, to improve sequence generation concurrently. Specifically, JEN-1 leverages an efficient temporal 1D U-Net to effectively model the waveform and implement the desired blocks in the diffusion model. The researchers further propose a novel omnidirectional latent diffusion model to achieve multi-task training. JEN-1 also integrates the unidirectional diffusion mode to capture the inherent sequential characteristics of music.
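
To make the difference between the unidirectional and bidirectional modes concrete, here is a small sketch of the two standard mechanisms for restricting a 1D sequence model to past context: a lower-triangular attention mask and left-only (causal) convolution padding. This article does not describe how JEN-1 switches modes, so treat the mechanisms, the function names `attention_mask` and `conv1d`, and the mode strings as assumptions for illustration rather than the authors' code.

```python
MODES = ("bidirectional", "unidirectional")


def attention_mask(length, mode):
    """mask[i][j] is True when position i may attend to position j."""
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if mode == "bidirectional":
        # Every position sees the whole sequence.
        return [[True] * length for _ in range(length)]
    # Unidirectional: position i sees only positions j <= i.
    return [[j <= i for j in range(length)] for i in range(length)]


def conv1d(signal, kernel, mode):
    """Same-length 1D convolution (cross-correlation) with zero padding.

    Bidirectional mode centres the kernel on each sample; unidirectional
    mode puts all padding on the left so each output depends only on the
    current and earlier samples.
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    k = len(kernel)
    left = k - 1 if mode == "unidirectional" else (k - 1) // 2
    right = (k - 1) - left
    padded = [0.0] * left + list(signal) + [0.0] * right
    return [
        sum(kernel[j] * padded[i + j] for j in range(k))
        for i in range(len(signal))
    ]


if __name__ == "__main__":
    signal, kernel = [1, 2, 3, 4], [1, 1, 1]
    print(conv1d(signal, kernel, "unidirectional"))
    print(conv1d(signal, kernel, "bidirectional"))
    for row in attention_mask(3, "unidirectional"):
        print(row)
```

In the unidirectional case, the output at each position is unaffected by changes to later samples, which is the property an autoregressive mode relies on. The bidirectional case uses context from both sides.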

In their empirical study, the team compares JEN-1 with state-of-the-art methods, including Riffusion, Mousai, MusicLM, MusicGen and Noise2Music. JEN-1 surpasses all SOTA baselines in terms of subjective quality, diversity, and controllability.

Overall, this work advances the state of text-to-music generation and introduces a powerful text-to-music generator. The team hopes their work will encourage more research on developing generative models to create impactful and realistic art.

The demo is available at futureverse.com. The paper JEN-1: Text-Guided Universal Music Generation with Omnidirectional Diffusion Models is on arXiv.


Author: Hecate He | Editor: Chain Zhang

