In a new paper, researchers from the University of Cambridge and Facebook AI introduce ProoFVer, a natural logic-based fact verification system that provides faithful explanations and outperforms state-of-the-art fact verification models on rationale extraction.
Fact verification systems are designed to determine a given claim’s veracity. Recent advancements in neural networks for entailment recognition have improved fact verification performance, but these black-box models lack transparency, i.e. they cannot provide the reasoning behind their decision-making processes. Traditional logic-based approaches meanwhile can provide explicit proofs via natural logic to determine the entailment of a given input sentence, but their overall performance lags behind neural approaches.
The proposed ProoFVer (Proof System for Fact Verification using natural logic) aims for the best of both worlds, preserving both performance and explainability. The researchers say ProoFVer is the first system to generate valid natural logic-based logical inferences as its proofs. A clear advantage of this method is that the natural logic paradigm operates directly on natural language.
The ProoFVer system comprises a seq2seq proof generator that generates proofs in the form of natural logic-based logical inferences, and a deterministic finite automaton (DFA) that predicts the veracity of a claim.
Given a claim along with one or more evidence sentences retrieved from a knowledge source, the seq2seq proof generator produces the steps of the proof as a sequence of triples. Each triple contains a span from the claim, a span from the evidence, and a “NatOp”, one of the natural logic operators: equivalence (≡), forward entailment (⊑), reverse entailment (⊒), negation, alternation, cover, and independence (#).
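To make the triple structure concrete, here is a minimal sketch of how such a proof could be represented in Python. The class names `NatOp` and `Mutation`, along with the claim and evidence text, are hypothetical and invented for illustration; they are not taken from the paper or its code.

```python
from dataclasses import dataclass
from enum import Enum


class NatOp(Enum):
    """The seven natural logic operators named in the text."""
    EQUIVALENCE = "equivalence"
    FORWARD_ENTAILMENT = "forward entailment"
    REVERSE_ENTAILMENT = "reverse entailment"
    NEGATION = "negation"
    ALTERNATION = "alternation"
    COVER = "cover"
    INDEPENDENCE = "independence"


@dataclass(frozen=True)
class Mutation:
    """One proof step: a claim span, an evidence span, and the NatOp between them."""
    claim_span: str
    evidence_span: str
    natop: NatOp


# Hypothetical claim: "The Eiffel Tower is located in Berlin"
# Hypothetical evidence: "The Eiffel Tower is in Paris"
proof = [
    Mutation("The Eiffel Tower", "The Eiffel Tower", NatOp.EQUIVALENCE),
    Mutation("is located in", "is in", NatOp.EQUIVALENCE),
    Mutation("Berlin", "Paris", NatOp.ALTERNATION),
]

for step in proof:
    print(f"{step.claim_span!r} -> {step.evidence_span!r}: {step.natop.value}")
```

In this toy proof the first two spans are paraphrases, while the last pair names two mutually exclusive cities, which is why it is labelled as alternation.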
Based on the generated proof, the DFA then determines whether the evidence supports or refutes the claim.
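The DFA step can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the transition table follows the standard natural logic inference automaton rather than being copied from the paper, the third "NOT ENOUGH INFO" state is added because FEVER uses three labels, and the names `TRANSITIONS` and `run_dfa` are hypothetical.

```python
SUPPORTS = "SUPPORTS"
REFUTES = "REFUTES"
NOT_ENOUGH_INFO = "NOT ENOUGH INFO"

# Assumed transition table, based on standard natural logic inference.
# NOT_ENOUGH_INFO is treated as an absorbing state, so it has no outgoing row.
TRANSITIONS = {
    SUPPORTS: {
        "equivalence": SUPPORTS,
        "forward_entailment": SUPPORTS,
        "reverse_entailment": NOT_ENOUGH_INFO,
        "negation": REFUTES,
        "alternation": REFUTES,
        "cover": NOT_ENOUGH_INFO,
        "independence": NOT_ENOUGH_INFO,
    },
    REFUTES: {
        "equivalence": REFUTES,
        "forward_entailment": NOT_ENOUGH_INFO,
        "reverse_entailment": REFUTES,
        "negation": SUPPORTS,
        "alternation": NOT_ENOUGH_INFO,
        "cover": SUPPORTS,
        "independence": NOT_ENOUGH_INFO,
    },
}


def run_dfa(natops):
    """Consume a sequence of NatOp names and return the final state."""
    state = SUPPORTS  # start state: an unmodified claim is trivially supported
    for op in natops:
        if state == NOT_ENOUGH_INFO:
            break  # absorbing state: no later step can change the verdict
        state = TRANSITIONS[state][op]
    return state


print(run_dfa(["equivalence", "equivalence", "alternation"]))
```

Because the verdict depends only on the sequence of operators, a reader can replay the proof step by step and see exactly which mutation moved the automaton from one state to another.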
The evidence spans may come from multiple sentences. In the example proof illustrated in the paper, the first three mutations use spans from “Evidence-1”, while the last uses a span from “Evidence-2”. Each mutation is marked with the NatOp that holds between the claim and the evidence spans.
Training the proof generator requires annotated proofs, but obtaining these annotations manually would be laborious. The team therefore uses a three-step annotation process (chunking, alignment and entailment assignment) to avoid this manual effort.
The claim spans are obtained using the chunking step; alignment provides the mutations by obtaining the corresponding evidence spans for each claim span; and the entailment assignment step assigns NatOps to the mutations.
The team evaluated ProoFVer on the FEVER benchmark dataset for fact verification, and reported label accuracy (LA), FEVER Score and Stability Error Rate (SER) to show the model’s veracity classification accuracy, robustness and overstability, respectively. Baseline systems used in the tests were KernelGAT, CorefBERT and DREAM.
In the evaluations, ProoFVer achieved a label accuracy of 79.25 percent and a FEVER Score of 74.37 percent on the FEVER test data, outperforming current state-of-the-art fact-verification models by 2.4 percent and 2.07 percent respectively. With regard to the faithfulness of ProoFVer’s explanations, human annotators correctly predicted the model outcome 82.78 percent of the time based on ProoFVer’s explanations, against 70 percent when using only the evidence.
Overall, the proposed ProoFVer approach outperformed the state-of-the-art models in both accuracy and robustness. Its faithful explanations scored high on rationale extraction compared with attention-based highlights, and also improved human understanding of the decision-making process.
The paper “ProoFVer: Natural Logic Theorem Proving for Fact Verification” is on arXiv.
Author: Hecate He | Editor: Michael Sarazen, Chain Zhang