Google & CMU’s Semantic Pyramid AutoEncoder Marks the First Successful Attempt for Multimodal Generation with Frozen LLMs
In a new paper SPAE: Semantic Pyramid AutoEncoder for Multimodal Generation with Frozen LLMs, a research team from Google Research and Carnegie Mellon University introduces Semantic Pyramid AutoEncoder (SPACE), the first successful method for enabling frozen LLMs to solve cross-modal tasks.