
Stanford, Kyoto & Georgia Tech Model ‘Neutralizes’ Biased Language


While AI is delivering unprecedented progress and convenience, the increasing implementation of AI technologies has also triggered public fears regarding autonomous vehicle safety, data misuse and job losses. One of the latest concerns to capture mainstream media headlines is the danger of human social biases leaking into AI models.

Stanford Computer Science PhD student Reid Pryzant is well aware of the public skepticism. “Bias is one of these trust issues we need to solve in order to actually get the technology out there and into our society in a safe way,” Pryzant told Synced in an email.

What if AI could also be used to fight bias?

Pryzant and other Stanford researchers partnered with researchers from Kyoto University and Georgia Institute of Technology to develop a novel natural language model that can identify and neutralize biased framings, presuppositions, attitudes, etc. in text.

The paper Automatically Neutralizing Subjective Bias in Text introduces a pair of new sequence-to-sequence AI algorithms that automatically detect inappropriately subjective text and edit it toward a neutral point of view.

Subjective bias refers to inappropriate subjectivity — when an author’s opinion or point of view leaks into their writing in a setting that ought to be objective, Pryzant explained. The study focused on four domains — encyclopedias, news headlines, books, and political speeches.

The researchers identified subjective biases and developed their models based on Wikipedia’s neutral point of view (NPOV) policy, a set of community editing principles designed to ensure fair representation — proportionate and without editorial bias as far as possible. As one of Wikipedia’s fundamental principles, NPOV is battle-tested and used widely to debias text in the wild, which is why Pryzant and his co-authors chose to ground their study in NPOV.

The algorithms aim to debias text by suggesting edits that would make it more neutral. For example, a news headline like “John McCain exposed as an unprincipled politician” is seen as biased because the verb ‘expose’ presupposes the subjective opinion of the writer. A debiased sentence would use a verb like ‘describe’.
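To give a concrete sense of what such a suggestion looks like in practice, the sketch below shows how a fine-tuned sequence-to-sequence debiaser might be queried for a neutral rewrite of the example headline. This is an illustrative sketch, not the authors' released code, and the Hugging Face checkpoint name is purely hypothetical.

```python
# Minimal sketch (not the authors' code) of querying a fine-tuned
# sequence-to-sequence debiasing model for a suggested neutral rewrite.
# The checkpoint "my-org/subjective-bias-neutralizer" is hypothetical.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "my-org/subjective-bias-neutralizer"  # hypothetical fine-tuned checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

headline = "John McCain exposed as an unprincipled politician"
inputs = tokenizer(headline, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32, num_beams=4)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
# Expected kind of output: "John McCain described as an unprincipled politician"
```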


The study required careful data cleaning and filtering to ensure a high density of debiasing edits, but some noise in the data is inescapable. Because the data was drawn from NPOV edits, Pryzant said, it sometimes included lengthy exchanges produced by Wikipedia “edit wars,” in which two editors go back and forth on content, repeatedly tagging each other’s edits for NPOV violations. Improving data cleaning could be an area for future research.

The researchers also introduced the Wiki Neutrality Corpus (WNC) as the first parallel corpus of biased language. The WNC contains 180,000 biased and neutralized sentence pairs along with contextual sentences and metadata harvested from NPOV Wikipedia edits.

Samples from the new Wiki Neutrality Corpus.
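For readers who want a feel for the data, here is a minimal sketch of loading WNC-style biased/neutralized sentence pairs from a tab-separated file. The file name and column order are assumptions for illustration only, not the corpus’s documented schema.

```python
# Sketch of loading biased/neutralized sentence pairs from a WNC-style
# tab-separated file. File name and column order are assumed for illustration.
import csv

def load_pairs(path):
    """Yield (biased_sentence, neutralized_sentence) tuples."""
    with open(path, newline="", encoding="utf-8") as f:
        for row in csv.reader(f, delimiter="\t"):
            if len(row) >= 2:
                yield row[0], row[1]  # assumed columns: biased, neutralized

pairs = list(load_pairs("wnc_pairs.tsv"))  # hypothetical file name
print(f"{len(pairs)} sentence pairs loaded")
```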

The study proposes two models: a Concurrent system, a BERT-based encoder-decoder that identifies subjective words as part of the generation process, and a Modular system, which pairs a BERT-based detector for identifying subjective words with a novel join embedding for editing.
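As a rough illustration of the detection stage, the sketch below tags each token of a sentence with a subjectivity score using an off-the-shelf BERT token-classification head. It is a simplified stand-in for the paper’s detector, not a reproduction of it.

```python
# Simplified stand-in for a Modular-style detection stage: a BERT
# token-classification head scoring each token as subjective or neutral.
import torch
from transformers import BertTokenizerFast, BertForTokenClassification

tokenizer = BertTokenizerFast.from_pretrained("bert-base-uncased")
detector = BertForTokenClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # 0 = neutral, 1 = subjective
)

sentence = "John McCain exposed as an unprincipled politician"
enc = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    logits = detector(**enc).logits        # shape: (1, seq_len, 2)
probs = logits.softmax(-1)[0, :, 1]        # P(subjective) per token
for tok, p in zip(tokenizer.convert_ids_to_tokens(enc["input_ids"][0]), probs):
    print(f"{tok:15s} {p.item():.2f}")
# After fine-tuning on WNC edits, tokens like "exposed" should score highest;
# the untrained head above produces essentially random scores.
```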

The evaluation process considered five metrics. The researchers also had English-speaking Amazon Mechanical Turk crowdworkers compare pairs of original and edited sentences for fluency, meaning preservation, and bias. The proposed model was able to identify and reduce bias in encyclopedias, news, books, and political speeches better than SOTA style transfer and machine translation systems.
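To give a sense of how the automatic side of such an evaluation might work, the sketch below scores model edits against human-edited references using corpus BLEU and exact-match accuracy, two common sequence-to-sequence metrics; the paper’s exact metric definitions may differ.

```python
# Sketch of automatic evaluation: corpus BLEU and exact-match accuracy of
# model edits against human-edited references. Toy data for illustration;
# in practice the hypotheses would be model outputs and the references WNC edits.
from nltk.translate.bleu_score import corpus_bleu

references = [["john mccain described as an unprincipled politician".split()]]
hypotheses = ["john mccain described as an unprincipled politician".split()]

bleu = corpus_bleu(references, hypotheses)
accuracy = sum(h == r[0] for h, r in zip(hypotheses, references)) / len(hypotheses)
print(f"BLEU: {bleu:.3f}  exact-match accuracy: {accuracy:.2%}")
```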

Pryzant said he hopes the study will advance research towards an efficient system for automatic reduction of bias. Even if there are bumps in the road, “it means we’ve gotten people to think critically about bias in AI.”

The study so far only looks at subjective biases in English and proposes single-word edits. “Single-word edits are in practice the simplest incarnations of bias — lots of adjectives that frame an idea or modify the likelihood of a proposition,” explained Pryzant. “Moving up to multi-word and cross-sentence bias is something I hope this paper will get people excited about.”

But those new goals also bring new challenges. Unlike single-word edits, where just knowing the information in the sentence itself is often enough to make an accurate prediction of bias, multi-word and cross-sentence processing requires a much wider knowledge context to determine what is and isn’t biased.

The paper Automatically Neutralizing Subjective Bias in Text is on arXiv and will appear at next year’s AAAI Conference on Artificial Intelligence in New York.


Journalist: Yuan Yuan | Editor: Michael Sarazen
