Meta AI’s Shepherd Critiques Language Model Outputs to Curb Hallucinations

In a new paper, Shepherd: A Critic for Language Model Generation, a Meta AI research team presents Shepherd, a language model explicitly tuned to critique model-generated outputs and to produce feedback that suggests improvements, addressing factuality, logical errors, coherence, and alignment issues.

Large language models (LLMs) continue to make progress in generating contextually and semantically meaningful text, but they still risk producing incoherent, false, unreliable, or even toxic output. To address this issue, there is increasing interest in teaching LLMs to refine their own generated outputs.

The team starts by gathering community feedback from two sources: Stack Exchange and the Pushshift Reddit Dataset. They structure the data as question-answer-critique triads, with a post’s title and subtitle as the question, the corresponding top-level comments as answers, and the replies to those comments as critiques.
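The triad layout described above can be sketched in a few lines of Python. This is only an illustration, not the team's code: the `Triad` class, the `build_triads` helper, and the dictionary layout of a post are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Triad:
    """One question-answer-critique example (hypothetical container)."""
    question: str
    answer: str
    critique: str


def build_triads(post: Dict) -> List[Triad]:
    """Turn one community post into question-answer-critique triads.

    Assumed post layout (not taken from the paper):
      {"title": str, "subtitle": str,
       "comments": [{"text": str, "replies": [{"text": str}, ...]}, ...]}
    """
    # The question is the post title plus its subtitle.
    question = f"{post['title']}\n{post.get('subtitle', '')}".strip()
    triads = []
    for comment in post.get("comments", []):      # top-level comment = answer
        for reply in comment.get("replies", []):  # reply to it = critique
            triads.append(Triad(question, comment["text"], reply["text"]))
    return triads


example_post = {
    "title": "Why is the sky blue?",
    "subtitle": "Looking for a short explanation.",
    "comments": [
        {
            "text": "Because the ocean reflects onto it.",
            "replies": [{"text": "That is a myth; it is Rayleigh scattering."}],
        },
        {"text": "Rayleigh scattering of sunlight.", "replies": []},
    ],
}
triads = build_triads(example_post)
```

In this sketch, a top-level comment with no replies yields no triad, since there is no critique to pair with the answer.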

To curate valid critiques, the researchers employ several techniques: 1) keyword filtering to match critiques to two cases, one where the answer is largely accurate and one where the answer contains inaccuracies; 2) using user edit history to identify cases where a critique led to a refinement of the original answer; 3) applying additional filters based on community vote scores to further improve the quality of the critiques; 4) selecting the critique with the highest score to maintain diversity; 5) running a profanity check and eliminating lower-scored comments to manage offensive language; 6) filtering out URLs, images, and videos to keep the data text-only; 7) identifying and removing comments to preserve the integrity of the Q&A format.
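A few of these filters (vote score, profanity, text-only, and highest-scoring critique) can be sketched as follows. This is a minimal illustration under stated assumptions: the score threshold, the placeholder blocklist, the regular expression, and the function names are all hypothetical and are not taken from the paper.

```python
import re
from typing import Dict, List, Optional

# Hypothetical settings chosen only for this example.
MIN_SCORE = 2                   # assumed minimum community vote score
BLOCKLIST = {"darn", "heck"}    # placeholder for a real profanity list
URL_OR_MEDIA = re.compile(
    r"https?://|www\.|\.(?:png|jpe?g|gif|mp4)\b", re.IGNORECASE
)


def is_text_only(text: str) -> bool:
    """True if the text contains no obvious URL or media file reference."""
    return URL_OR_MEDIA.search(text) is None


def is_clean(text: str) -> bool:
    """True if no word in the text appears in the blocklist."""
    words = set(re.findall(r"[a-z']+", text.lower()))
    return words.isdisjoint(BLOCKLIST)


def pick_critique(critiques: List[Dict]) -> Optional[Dict]:
    """Apply the filters, then keep the single highest-scoring critique."""
    kept = [
        c for c in critiques
        if c["score"] >= MIN_SCORE
        and is_text_only(c["text"])
        and is_clean(c["text"])
    ]
    return max(kept, key=lambda c: c["score"], default=None)


candidates = [
    {"text": "See https://example.com for proof.", "score": 50},
    {"text": "The second step divides by zero when n is 0.", "score": 7},
    {"text": "Heck no, this is wrong.", "score": 9},
    {"text": "Looks off to me.", "score": 1},
]
best = pick_critique(candidates)
```

Here the highest-voted candidate is dropped because it contains a URL, so the filter order matters: scoring is applied only to critiques that survive the content checks.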

To ensure high-quality data, the researchers further apply several postprocessing steps: 1) removing examples with red flags; 2) removing feedback on the error types “Redundancy” and “Consistency with context”; 3) concatenating the feedback for different error types into a single paragraph written in natural language, so that examples with different error types are easier to identify. As a result, they obtain 1,317 high-quality examples in total.
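The three postprocessing steps might look roughly like the sketch below. The example layout, the `red_flag` field, and the "Regarding ..." sentence template are assumptions made for illustration; the article does not specify how the feedback is actually joined.

```python
from typing import Dict, Optional

# Error types whose feedback is dropped, as named in the article.
EXCLUDED_TYPES = {"Redundancy", "Consistency with context"}


def postprocess(example: Dict) -> Optional[str]:
    """Return one feedback paragraph, or None if the example is discarded.

    Assumed example layout (hypothetical):
      {"red_flag": bool, "feedback": {error_type: feedback_text, ...}}
    """
    # Step 1: drop examples that carry a red flag.
    if example.get("red_flag", False):
        return None
    # Step 2: skip excluded error types.
    # Step 3: join the remaining feedback into a single paragraph.
    parts = [
        f"Regarding {error_type.lower()}: {text.strip()}"
        for error_type, text in example["feedback"].items()
        if error_type not in EXCLUDED_TYPES and text.strip()
    ]
    return " ".join(parts) if parts else None


annotated = {
    "red_flag": False,
    "feedback": {
        "Factuality": "The capital of Australia is Canberra, not Sydney.",
        "Redundancy": "The last sentence repeats the first.",
        "Coherence": "The conclusion does not follow from the second paragraph.",
    },
}
paragraph = postprocess(annotated)
```

Prefixing each piece of feedback with its error type is one simple way to keep the error types distinguishable after concatenation.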

The team selected LLaMA-7B (Touvron et al., 2023) as the base model for Shepherd training. Given the questions and the corresponding answers generated by large language models, Shepherd is trained to critique the generated answer by detecting errors or providing insightful feedback.

In their empirical study, the team compared Shepherd with several state-of-the-art language models, such as Alpaca-7B (Taori et al., 2023), SelFee-7B (Ye et al., 2023) and ChatGPT (GPT-3.5 Turbo). Shepherd outperforms Alpaca and SelFee by a large margin and matches the performance of ChatGPT. The team believes Shepherd will help improve generation quality and reduce hallucinations.

The paper Shepherd: A Critic for Language Model Generation is on arXiv.

Author: Hecate He | Editor: Chain Zhang
