Tag: model pruning

AI Machine Learning & Data Science Research

NVIDIA’s Minitron: Compressing Llama 3.1 and Mistral NeMo for Superior Performance in 4B and 8B Models

In a new paper LLM Pruning and Distillation in Practice: The Minitron Approach, an NVIDIA research team presents the Minitron compression strategy, which effectively produces a robust 4B model from Llama 3.1 8B and a cutting-edge Mistral-NeMo-Minitron-8B model derived from Mistral NeMo 12B.
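The Minitron recipe combines structured pruning (shrinking width/depth of the teacher) with knowledge distillation into the smaller student. The sketch below is a hypothetical, simplified illustration of those two ingredients, not NVIDIA's actual pipeline: it width-prunes a toy linear layer by a simple neuron-importance score and computes a standard temperature-softened distillation loss. The function names and the L2-norm importance proxy are assumptions made for illustration.

```python
import numpy as np

def prune_neurons(W, keep):
    """Width-prune a linear layer: keep the `keep` output neurons with the
    largest L2 norm (a simple stand-in for an importance estimate)."""
    importance = np.linalg.norm(W, axis=1)          # one score per output neuron
    kept = np.sort(np.argsort(importance)[-keep:])  # indices of surviving neurons
    return W[kept], kept

def distillation_loss(teacher_logits, student_logits, T=2.0):
    """KL divergence between temperature-softened teacher and student outputs,
    the usual knowledge-distillation objective."""
    def softmax(z):
        z = z - z.max(axis=-1, keepdims=True)
        e = np.exp(z)
        return e / e.sum(axis=-1, keepdims=True)
    p = softmax(teacher_logits / T)
    q = softmax(student_logits / T)
    return float((p * (np.log(p) - np.log(q))).sum(axis=-1).mean() * T * T)

# Toy example: prune a layer with 8 output neurons down to 4 (2x compression),
# then the pruned student would be trained to match the teacher's logits.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 4))
W_small, kept = prune_neurons(W, keep=4)
assert W_small.shape == (4, 4)
```

In the real Minitron setting, importance is estimated from activations on a calibration set and the pruned model is retrained with distillation; the sketch only shows the shape of the two steps.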

AI Machine Learning & Data Science Research

Gem-Miner: Finding Lottery Tickets at Initialization, Beating All Baselines up to 19x Faster

In the new paper Rare Gems: Finding Lottery Tickets at Initialization, a research team from Carnegie Mellon University, MBZUAI, Petuum, Inc. and the University of Wisconsin-Madison proposes GEM-MINER, an algorithm that finds sparse subnetworks at initialization that are trainable to accuracy comparable to or better than that of iterative magnitude pruning (IMP) with warm-up.
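The core object GEM-MINER produces is a binary mask over the weights at initialization: the surviving subnetwork is then trained from its original init values. The paper learns this mask by optimizing gate scores; the sketch below is a deliberately simplified, hypothetical stand-in that builds a mask from weight magnitudes at init, just to show what "a sparse subnetwork at initialization" means concretely. The function name and magnitude criterion are assumptions for illustration, not the paper's algorithm.

```python
import numpy as np

def mask_at_init(weights, sparsity):
    """Return a binary mask keeping the largest-magnitude weights at
    initialization (a toy stand-in for GEM-MINER's learned gates)."""
    flat = np.abs(weights).ravel()
    k = int(round((1.0 - sparsity) * flat.size))  # number of weights to keep
    threshold = np.sort(flat)[-k]                 # k-th largest magnitude
    return (np.abs(weights) >= threshold).astype(weights.dtype)

rng = np.random.default_rng(1)
W0 = rng.normal(size=(16, 16))         # weights at initialization, before any training
mask = mask_at_init(W0, sparsity=0.9)  # keep only 10% of the weights
sparse_W0 = W0 * mask                  # the subnetwork is trained from these init values
```

Unlike IMP, which trains, prunes, and rewinds repeatedly, a mask found directly at initialization avoids the expensive train-prune cycles, which is where the reported speedup over IMP comes from.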