Tag: hyperparameter tuning

AI Machine Learning & Data Science Research

Microsoft & OpenAI’s µTransfer Zero-Shot Hyperparameter Transfer Method Tunes GPT-3’s Hyperparameters on a Single GPU

In the new paper Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer, Microsoft and OpenAI researchers propose µTransfer, a method that leverages Maximal Update Parametrization (µP) to zero-shot-transfer hyperparameters tuned on small models, obtaining near-optimal hyperparameters on large models without directly tuning them.
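
In practice the recipe is: parametrize the model in µP, sweep hyperparameters on a narrow proxy model, then reuse the winners at full width. Below is a minimal sketch using Microsoft's open-source `mup` package; the toy MLP, widths, and learning rate are illustrative assumptions, not the paper's GPT-3 setup.

```python
# A minimal sketch of µTransfer with Microsoft's open-source `mup` package
# (pip install mup). The MLP, widths, and learning rate are illustrative
# assumptions, not the paper's GPT-3 configuration.
import torch.nn as nn
from mup import MuReadout, MuAdam, set_base_shapes


class MLP(nn.Module):
    def __init__(self, width, d_in=32, d_out=10):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(d_in, width), nn.ReLU(),
            nn.Linear(width, width), nn.ReLU(),
        )
        # MuReadout replaces the output nn.Linear so its scale follows µP.
        self.readout = MuReadout(width, d_out)

    def forward(self, x):
        return self.readout(self.body(x))


# 1) Register base shapes so `mup` knows which dimensions scale with width.
#    (The package README also recommends re-initializing weights with the
#    mup.init functions after this step.)
base, delta = MLP(width=64), MLP(width=128)
small = MLP(width=256)
set_base_shapes(small, base, delta=delta)

# 2) Tune hyperparameters on the cheap small model with a µP-aware optimizer.
tuned_lr = 1e-2  # pretend this came from a sweep over `small`
opt = MuAdam(small.parameters(), lr=tuned_lr)

# 3) Reuse them unchanged at full width: under µP the optimum is
#    approximately width-independent, so no re-tuning is needed.
large = MLP(width=8192)
set_base_shapes(large, base, delta=delta)
opt = MuAdam(large.parameters(), lr=tuned_lr)
```

The payoff is that the sweep cost is paid once on the small proxy, which is how the paper's authors can tune a GPT-3-scale model for a small fraction of the cost of tuning it directly.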

AI Machine Learning & Data Science Research

Meet Hyper-Tune: New SOTA Efficient Distributed Automatic Hyperparameter Tuning at Scale

A research team from Peking University, ETH Zürich, and Kuaishou Technology proposes Hyper-Tune, an efficient and robust distributed hyperparameter-tuning framework that features system optimizations such as automatic resource allocation, asynchronous scheduling, and a multi-fidelity optimizer, and that achieves state-of-the-art performance on multiple tuning tasks.
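
The teaser doesn't show Hyper-Tune's API, so as a conceptual illustration of the asynchronous multi-fidelity scheduling it builds on, here is a minimal sketch of ASHA-style asynchronous successive halving; this is a generic technique standing in for the framework's scheduler, not Hyper-Tune's actual code, and the objective, fidelity rungs, and search space below are made-up assumptions.

```python
# A generic sketch of asynchronous successive halving (ASHA-style), the
# multi-fidelity scheduling idea behind frameworks like Hyper-Tune. This
# illustrates the technique only; it is not Hyper-Tune's actual API.
import random
from collections import defaultdict

RUNGS = [1, 3, 9, 27]  # fidelity levels, e.g. training epochs
ETA = 3                # promote the top 1/ETA of each rung

def objective(config, fidelity):
    # Stand-in for "train for `fidelity` epochs, return validation loss";
    # noise shrinks at higher fidelity, mimicking a more reliable estimate.
    return (config["lr"] - 0.1) ** 2 + random.gauss(0, 0.05) / fidelity

results = defaultdict(list)  # rung index -> [(loss, config), ...]

def get_job():
    """Promote a config to the next rung if it ranks in the top 1/ETA of
    its rung and hasn't been promoted yet; otherwise start a fresh random
    config at the lowest fidelity. Promotion never waits for a rung to
    fill, so idle workers are never blocked by slow trials (this is the
    'asynchronous' part)."""
    for rung in reversed(range(len(RUNGS) - 1)):
        ranked = sorted(results[rung], key=lambda t: t[0])
        top = ranked[: max(1, len(ranked) // ETA)]
        promoted = {id(c) for _, c in results[rung + 1]}
        for _, cfg in top:
            if id(cfg) not in promoted:
                return cfg, rung + 1
    return {"lr": random.uniform(1e-3, 1.0)}, 0

for _ in range(100):  # each iteration = one worker finishing a trial
    cfg, rung = get_job()
    results[rung].append((objective(cfg, RUNGS[rung]), cfg))

top_rung = results[len(RUNGS) - 1] or results[0]
loss, cfg = min(top_rung, key=lambda t: t[0])
print("best loss %.4f at lr=%.4f" % (loss, cfg["lr"]))
```

Hyper-Tune layers its other system optimizations, such as automatic resource allocation, on top of this kind of scheduler; the sketch covers only the promotion logic.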