Tag: sampling

AI Machine Learning & Data Science Nature Language Tech Research

DeepMind’s Speculative Sampling Achieves 2–2.5x Decoding Speedups in Large Language Models

In the new paper Accelerating Large Language Model Decoding with Speculative Sampling, a DeepMind research team presents SpS (Speculative Sampling), an algorithm that achieves 2–2.5x decoding speedups on a 70 billion parameter Chinchilla language model. The novel approach maintains sample quality and does not require any modifications to model parameters or architecture.