
Viral Post Highlights ‘Toxicity Problems’ in the Machine Learning Community

The post highlights perceived peer-review problems, the reproducibility crisis, and ethics and diversity issues.

A Reddit post identifying eight “toxicity problems” in the machine learning (ML) community recently went viral, receiving some 3,300 upvotes and nearly 600 comments in a week.

The post highlights perceived peer-review problems, the reproducibility crisis, and ethics and diversity issues. It argues that the peer-review process is “broken” and that there is a “worshiping problem” and “a cut-throat publish-or-perish mentality” in the paper publishing process and beyond.

Over 60 percent of published theoretical computer science and machine learning papers are on arXiv, according to a 2017 study. Indeed, 56 percent of papers published in 2017 appeared on arXiv (along with the authors’ names and institutions) before or during peer review. The Reddit post says this can negatively affect the double-blind peer review process, as reviewers could be more inclined to accept papers whose authors are from renowned institutions.

Earlier this year, Synced also looked at some possible ways to improve the paper review process in the ML community. Since 1998, the volume of AI papers in peer-reviewed journals has grown by more than 300 percent, according to the AI Index 2019 Report. At the same time, major AI conferences like NeurIPS, AAAI, and CVPR are setting new paper submission records every year. All this has led to complaints of long delays, inconsistent standards, and unqualified reviewers in the peer review process.

In his blog, Turing awardee Yoshua Bengio also urged the community to rethink the overall publication process and proposed a potentially different publication model for ML — where papers are first submitted to a fast turnaround journal, and then conference program committees select the papers they like from the list of accepted and reviewed (scored) papers.

The Reddit post also references some of the most heated community discussions over the past few months and raises questions about diversity and inclusivity issues in machine learning and computer science in general. Synced explored the gender imbalance in ML in a special Women in AI project in March and found that only 18 percent of authors at the leading 21 AI conferences are women, according to the 2019 Global AI Talent Report. The 2019 AI Index also reported that across the educational institutions examined, males made up 80 percent of AI professors on average.

The ongoing discussion of racial biases in AI reached a dramatic climax earlier this month when Facebook Chief AI Scientist and Turing Award Winner Yann LeCun announced his exit from Twitter after getting involved in a heated dispute on the topic on the platform. The dispute started with the new Duke University PULSE AI photo recreation model depixelating a low-resolution input image of Barack Obama into a photo of a white male.

The dispute saw a week-long back-and-forth between LeCun and Google Ethical Artificial Intelligence Team Technical Co-lead Timnit Gebru, who suggested that LeCun’s comments becoming the story reflected “a pattern of marginalization.”

The Reddit post argues that although LeCun’s comments on biases and fairness may have been “insensitive,” the backlash he received was excessive and that “reducing every negative comment in a scientific discussion to race and gender creates a toxic environment.”

Google AI lead Jeff Dean also tweeted a long thread this week saying the community “has a problem with inclusiveness.” AI is full of promise with the potential to revolutionize so many different areas of modern society, he said, and in order to realize its true potential, it needs to be welcoming to all people. “As it stands today, it is definitely not.”

Dean warned that the potential consequences of a lack of diversity in AI and computer science could include critical issues that affect different communities being ignored, downplayed, or not even considered, rather than receiving the serious attention they deserve. To improve the field in this regard, Dean urged people to call out bad behaviour and to actively support and uplift diverse voices.

Echoing the Reddit post’s concern that discussions in the ML community have become increasingly disrespectful, Dean tweeted, “Let’s not demean, discourage, or attack. Instead, let’s see more of the encouragement, mentoring, and welcoming outreach that our field so desperately needs.”


Journalist: Yuan Yuan | Editor: Michael Sarazen




