NVIDIA’s STEERLM Approach: Empowering User-Steerable Language Models
In a new paper, "SteerLM: Attribute Conditioned SFT as an (User-Steerable) Alternative to RLHF," an NVIDIA research team introduces STEERLM, a supervised fine-tuning method that lets end-users control model responses at inference time. The team reports that it outperforms state-of-the-art baselines, including RLHF-trained models such as ChatGPT-3.5.