A research team from Hugging Face introduces a block pruning approach that targets models that are both small and fast. The method learns to eliminate entire components of the original model and, in effect, drops a large number of attention heads.
Researchers from Carnegie Mellon University, the University of Texas at Austin and Facebook AI propose a novel paradigm for optimizing the width of each CNN layer. The method is compatible with a variety of width optimization algorithms and networks, and it achieves up to a 320x reduction in width optimization overhead without compromising top-1 accuracy on ImageNet.