AI

Can AI Judge a Paper on Appearance Alone?

The number of AI-related research papers has skyrocketed in recent years, outpacing papers from all other academic topics since 2000. Unsurprisingly, this has resulted in a shortage of qualified peer reviewers in the machine learning community, particularly for conference paper submissions. Conference organizers are attempting to expand the reviewer pool, but it can take years of academic study in AI before a person is qualified to serve as a peer reviewer.
Virginia Tech Associate Professor Jia-Bin Huang serves as an Area Chair for the prestigious AI conferences CVPR 2019 and ICCV 2019. To reduce the workload of peer reviewers such as himself, Huang recently published research on arXiv that uses deep learning techniques to predict whether a paper should be accepted or rejected based solely on its visual appearance. The model bases its determination on features such as the page layout, the presence of detailed result tables, and the percentage of the allotted space that is used.

Huang’s Deep Paper Gestalt presents a promising experimental result: the classifier safely rejected 50 percent of the bad papers it checked while wrongly rejecting only 0.4 percent of the good papers.
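Those two numbers describe a single operating point: the probability threshold above which a submission is auto-rejected can be moved to trade bad-paper coverage against wrongly rejected good papers. The snippet below illustrates that trade-off on synthetic scores; it is not Huang's evaluation code, and every array and number in it is hypothetical.

```python
import numpy as np

# Hypothetical held-out predictions: prob_bad[i] is the classifier's predicted
# probability that paper i is "bad"; y[i] is the ground truth (1 = bad, 0 = good).
rng = np.random.default_rng(0)
prob_bad = np.concatenate([rng.beta(2, 5, 500),   # good papers: mostly low scores
                           rng.beta(5, 2, 500)])  # bad papers: mostly high scores
y = np.concatenate([np.zeros(500), np.ones(500)])

# Sweep thresholds: a paper is auto-rejected when prob_bad >= t.
for t in np.linspace(0.05, 0.95, 19):
    rejected = prob_bad >= t
    bad_caught = rejected[y == 1].mean()   # fraction of bad papers rejected
    good_lost = rejected[y == 0].mean()    # fraction of good papers wrongly rejected
    print(f"t={t:.2f}  bad rejected={bad_caught:.1%}  good wrongly rejected={good_lost:.1%}")
```

Raising the threshold rejects fewer bad papers but protects more good ones; Huang's reported figures correspond to one such setting on his real test data.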

Huang told Synced via email: “The idea of training a classifier to recognize good/bad papers has been around since 2010 [Paper Gestalt, a 2010 paper by Carven von Bearnensquash from the University of Phoenix]. Since early December, I thought it might be good to revisit the problem with more modern tools. The goal is to provide some insights on what a good paper looks like and how we can improve our own work.”

Huang first created a new dataset, Computer Vision Paper Gestalt (CVPG), comprising positive examples (papers accepted to six CVPR and three ICCV proceedings between 2013 and 2018) and negative examples such as workshop papers. He removed the headers atop the first page to protect against potential data leakage and to make the model focus on the visual contents of the body of the paper.
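Building such a dataset amounts to rendering each paper's pages as images and hiding the identifying header on page one. The sketch below shows one plausible way to do this with pdf2image and Pillow; the libraries, crop size, and output resolution are illustrative assumptions, not details taken from Huang's paper.

```python
from pathlib import Path

from pdf2image import convert_from_path  # requires the poppler utilities


def paper_to_page_images(pdf_path, out_dir, header_crop_px=120, size=(224, 224)):
    """Render every page of a paper as a small RGB image.

    The top strip of page 1 is cropped off as a rough stand-in for removing
    the conference header; 120 px and 224x224 are arbitrary illustration
    values, not figures from the paper.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    pages = convert_from_path(pdf_path, dpi=100)
    for i, page in enumerate(pages):
        if i == 0:
            width, height = page.size
            page = page.crop((0, header_crop_px, width, height))  # drop the header strip
        page = page.convert("RGB").resize(size)
        page.save(out_dir / f"{Path(pdf_path).stem}_page{i:02d}.png")
```

Running something like this over the accepted-paper PDFs and the workshop-paper PDFs would produce the positive and negative image sets the classifier is trained on.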

The next step was to train the classifier: “We used ResNet-18 (pre-trained on ImageNet) as our classification network, and replaced the ImageNet 1,000 class classification head with two output nodes (good or bad papers). Following the practice of transfer learning, we fine-tuned the ImageNet pre-trained network on the proposed CVPG dataset with stochastic gradient descent with a momentum of 0.9 for a total of 50 epochs.”
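In PyTorch terms, that recipe is compact. The following is a minimal sketch of the setup Huang describes (ImageNet-pretrained ResNet-18, two-way head, SGD with momentum 0.9, 50 epochs); the learning rate, data pipeline, and augmentation are placeholders rather than values from the paper.

```python
import torch
import torch.nn as nn
import torchvision

# ImageNet-pretrained ResNet-18 with the 1,000-way classification head
# replaced by a 2-way good/bad head, as in the quote above.
model = torchvision.models.resnet18(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, 2)

criterion = nn.CrossEntropyLoss()
# SGD with momentum 0.9 per the quote; the learning rate is an assumed placeholder.
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)


def fine_tune(train_loader, epochs=50,
              device="cuda" if torch.cuda.is_available() else "cpu"):
    """Fine-tune on (page image, label) batches from a CVPG-style dataset."""
    model.to(device).train()
    for _ in range(epochs):
        for images, labels in train_loader:
            images, labels = images.to(device), labels.to(device)
            optimizer.zero_grad()
            loss = criterion(model(images), labels)
            loss.backward()
            optimizer.step()
```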

The trained model achieved 92 percent accuracy on the test dataset of CVPR 2018 conference/workshop papers. Huang notes that a funny thing happened when he tested his own paper: the model rejected it.

To discover visual appearance patterns specific to good papers, Huang used Generative Adversarial Networks (GANs) to forge “good papers,” which featured “illustrative figures upfront”, “colorful images”, and “a balanced layout of texts/math/tables/plots.” He also trained a GAN model to translate the visual appearance of bad papers into that of good papers. Its suggestions include “adding teaser figure upfront”, “making the figures more colorful”, and “filling up the last page.”
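The translation step is an image-to-image GAN: a generator edits the page images of a bad paper until a discriminator can no longer tell them apart from pages of accepted papers. The sketch below is a heavily simplified, adversarial-only version of that idea in PyTorch; the architectures, losses, and hyperparameters are placeholders and should not be read as those used in Deep Paper Gestalt.

```python
import torch
import torch.nn as nn


def conv_block(cin, cout):
    return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                         nn.InstanceNorm2d(cout), nn.ReLU(inplace=True))

# Generator: maps a "bad paper" page image to a "good-looking" page image.
generator = nn.Sequential(conv_block(3, 32), conv_block(32, 32),
                          nn.Conv2d(32, 3, 3, padding=1), nn.Tanh())

# Discriminator: judges whether a page image looks like a real accepted paper.
discriminator = nn.Sequential(
    nn.Conv2d(3, 32, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(32, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
    nn.Conv2d(64, 1, 4, stride=2, padding=1))

adv_loss = nn.BCEWithLogitsLoss()
g_opt = torch.optim.Adam(generator.parameters(), lr=2e-4)
d_opt = torch.optim.Adam(discriminator.parameters(), lr=2e-4)


def train_step(bad_pages, good_pages):
    """One adversarial update on batches of bad-paper and good-paper page images."""
    # Discriminator: real good pages -> 1, translated bad pages -> 0.
    fake = generator(bad_pages)
    d_real = discriminator(good_pages)
    d_fake = discriminator(fake.detach())
    d_loss = (adv_loss(d_real, torch.ones_like(d_real)) +
              adv_loss(d_fake, torch.zeros_like(d_fake)))
    d_opt.zero_grad()
    d_loss.backward()
    d_opt.step()

    # Generator: fool the discriminator into scoring translated pages as "good".
    d_fake = discriminator(fake)
    g_loss = adv_loss(d_fake, torch.ones_like(d_fake))
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()
    return d_loss.item(), g_loss.item()
```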