Microsoft Releases DeepSpeed-Chat for RLHF Training of ChatGPT-like Models

ChatGPT-like models have revolutionized the artificial intelligence world with their remarkable ability to solve real-world tasks such as summarization, coding, and translation, matching or even surpassing the performance of human experts. Despite these impressive capabilities, there is still a lack of end-to-end Reinforcement Learning from Human Feedback (RLHF) pipelines for training ChatGPT-like models.

In a new paper, DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales, the DeepSpeed team at Microsoft presents DeepSpeed-Chat, a novel end-to-end RLHF pipeline that provides easy-to-use training and inference for ChatGPT-like models while delivering unparalleled efficiency and scalability for training models with hundreds of billions of parameters.

The team summarizes DeepSpeed-Chat’s three core capabilities as follows:

An easy-to-use training and inference experience for ChatGPT-like models.
A DeepSpeed-RLHF pipeline that replicates the training pipeline from the InstructGPT paper, with careful attention to completeness and one-to-one correspondence.
A DeepSpeed-RLHF system that combines the training and inference prowess of DeepSpeed into a single unified Hybrid Engine (DeepSpeed-HE) for RLHF.

The team starts by showing how easy it is to train OPT-13B and OPT-66B models with the DeepSpeed-RLHF system, and how to use the DeepSpeed-Chat RLHF APIs to customize user-defined pipelines. Specifically, a single script completes all three stages, 1) Supervised Fine-tuning (SFT), 2) Reward Model Fine-tuning and 3) RLHF, to build a user’s own ChatGPT-like model. The team also provides flexible APIs that offer a general interface and backend, so users can build their own RLHF training pipelines with ease.
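To make the three-stage flow concrete, here is a minimal Python sketch of a single driver that runs the stages in order and hands each stage's output to the next. It is not the DeepSpeed-Chat script or API: every name in it (`PipelineState`, `run_pipeline`, the stage functions and the placeholder strings) is hypothetical, and no actual training takes place.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class PipelineState:
    """Artifacts passed between stages (hypothetical, for illustration only)."""
    actor: str = ""
    reward_model: str = ""
    log: List[str] = field(default_factory=list)


def supervised_finetuning(state: PipelineState) -> PipelineState:
    # Stage 1: fine-tune a pretrained model on demonstrations to get the actor.
    state.actor = "actor-sft"
    state.log.append("stage 1: supervised fine-tuning")
    return state


def reward_model_finetuning(state: PipelineState) -> PipelineState:
    # Stage 2: fit a separate model that scores responses by human preference.
    state.reward_model = "reward-model"
    state.log.append("stage 2: reward model fine-tuning")
    return state


def rlhf(state: PipelineState) -> PipelineState:
    # Stage 3: needs both earlier artifacts, so fail loudly if one is missing.
    if not state.actor or not state.reward_model:
        raise ValueError("RLHF needs the SFT actor and the reward model")
    state.actor += "+rlhf"
    state.log.append("stage 3: RLHF")
    return state


STAGES: List[Callable[[PipelineState], PipelineState]] = [
    supervised_finetuning,
    reward_model_finetuning,
    rlhf,
]


def run_pipeline() -> PipelineState:
    """Run all three stages in order from a single entry point."""
    state = PipelineState()
    for stage in STAGES:
        state = stage(state)
    return state


if __name__ == "__main__":
    final = run_pipeline()
    print(final.actor)
    print(final.log)
```

The point of the sketch is the ordering constraint: the RLHF stage consumes both the fine-tuned actor and the reward model, which is why a single script that chains the stages is convenient.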

Moreover, the researchers combine the full system capabilities of DeepSpeed Training and Inference into a unified architecture they call the Hybrid Engine. The engine uses a lightweight memory management system to significantly boost throughput and applies memory optimization techniques to deliver high training efficiency. It also supports tensor parallelism and ZeRO-based sharding mechanisms, which substantially cut costs and deliver unparalleled scale and system efficiency for RLHF workloads.
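The idea of one engine serving both generation and training can be illustrated with a toy Python class. This is a conceptual sketch only, assuming the RLHF stage alternates between generating responses and updating the model. `ToyHybridEngine`, its methods and `rlhf_iteration` are hypothetical names, not DeepSpeed APIs, and the "cache" here is a plain dictionary standing in for inference-time memory.

```python
from typing import List


class ToyHybridEngine:
    """Toy stand-in: one object that switches between inference and training."""

    def __init__(self) -> None:
        self.mode = "train"
        self.kv_cache = None  # allocated only while in inference mode
        self.updates = 0

    def eval(self) -> None:
        # Enter inference mode and allocate generation-time memory.
        self.mode = "inference"
        self.kv_cache = {}

    def train(self) -> None:
        # Return to training mode and release generation-time memory.
        self.mode = "train"
        self.kv_cache = None

    def generate(self, prompt: str) -> str:
        if self.mode != "inference":
            raise RuntimeError("call eval() before generate()")
        self.kv_cache[prompt] = len(prompt)  # pretend to cache something
        return prompt + " <response>"

    def train_step(self, experience: List[str]) -> int:
        if self.mode != "train":
            raise RuntimeError("call train() before train_step()")
        self.updates += 1  # pretend to apply one optimizer update
        return self.updates


def rlhf_iteration(engine: ToyHybridEngine, prompts: List[str]) -> List[str]:
    """One generate-then-train cycle on the same engine."""
    engine.eval()
    experience = [engine.generate(p) for p in prompts]
    engine.train()
    engine.train_step(experience)
    return experience


if __name__ == "__main__":
    engine = ToyHybridEngine()
    print(rlhf_iteration(engine, ["Summarize this article."]))
```

In this toy version, switching modes only toggles a flag and frees a dictionary. The memory management described above is what would make the equivalent switch cheap for a real model.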

Overall, the DeepSpeed-Chat system offers easy, efficient, affordable and highly scalable RLHF training of ChatGPT-like models. The team has open-sourced DeepSpeed-Chat and is open to collaborating with the AI community on applying DeepSpeed to real-world applications.

The code is available on the project’s GitHub. The paper DeepSpeed-Chat: Easy, Fast and Affordable RLHF Training of ChatGPT-like Models at All Scales is on arXiv.

Author: Hecate He | Editor: Chain Zhang

