Global Machine Learning & Data Science Research

NeurIPS 2020 | Conference Watch on Self-Supervised Learning


Back in February, when AI conferences were still held in-person, Turing Award winners Geoffrey Hinton, Yann LeCun and Yoshua Bengio shared a stage in New York at an AAAI event, which Synced covered in detail. LeCun told the audience that, after decades of skepticism, he had finally joined Hinton in support of the idea that self-supervised learning may usher in AI’s next revolution.

Unlike supervised learning, which requires manual data-labelling, self-supervised learning (SSL) derives its training labels automatically from the data itself. Recent improvements in self-supervised training methods have established SSL as a serious alternative to traditional supervised training. Google's language representation model ALBERT, for example, utilizes a self-supervised training framework to leverage large amounts of unlabelled text.
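The way SSL manufactures its own labels is easiest to see in a pretext task such as masked-token prediction, the kind of objective behind models like ALBERT. The toy sketch below is purely illustrative (the function name, masking rate, and mask token are our own assumptions, not any model's actual pipeline): it hides random tokens in raw text, and the hidden tokens themselves become the labels, with no human annotation involved.

```python
import random

def make_mlm_example(tokens, mask_rate=0.15, mask_token="[MASK]", seed=1):
    """Turn raw text into (input, label) pairs with no human annotation:
    hide some tokens and ask the model to predict the originals."""
    rng = random.Random(seed)
    inputs, labels = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            inputs.append(mask_token)
            labels.append(tok)   # the "label" is just the original token
        else:
            inputs.append(tok)
            labels.append(None)  # no loss is computed at unmasked positions
    return inputs, labels

inputs, labels = make_mlm_example("the cat sat on the mat".split())
```

Because the labels come for free from the text itself, the same recipe scales to the billions of unlabelled sentences available on the web.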

It’s no surprise then that NeurIPS 2020 (the Conference on Neural Information Processing Systems) would find itself at the forefront of this trend. First proposed in 1986, the annual machine learning and computational neuroscience conference has evolved into one of the world’s leading AI gatherings.

This year, NeurIPS is hosting two workshops dedicated to self-supervised learning: Self-Supervised Learning for Speech and Audio Processing, from 6:50 am to 4:25 pm PT (2:50 pm to 12:25 am UTC) on Friday, December 11; and Self-Supervised Learning — Theory and Practice, from 8:50 am to 6:40 pm PT (4:50 pm to 2:40 am UTC) on Saturday, December 12.

Workshop organizers say the machine learning community is keen to adopt self-supervised approaches to pre-train deep networks as this makes it possible to use the tremendous amount of unlabelled data available on the Internet to train large networks and solve complicated tasks.

One of the most active SSL research directions is speech and audio processing, particularly automatic speech recognition, speaker identification and speech translation. Open challenges in the field include modelling diverse speech and languages and improving audio processing. Moreover, most existing SSL research has been driven by empirical performance, proceeding at speed but without a strong theoretical foundation. NeurIPS 2020 is offering these workshops to open and encourage discussion on such unexplored territories in SSL research.

LeCun will give a talk in the Self-Supervised Learning — Theory and Practice workshop, which will feature SSL-interested researchers from various domains, including Google Brain Research Scientists Quoc V. Le and Chelsea Finn. The workshop will explore the theoretical foundations of empirically well-performing SSL approaches, and how the theoretical insights can further improve SSL’s empirical performance.

Finn is also scheduled for a talk at the Self-Supervised Learning for Speech and Audio Processing workshop, where she will be joined by Dong Yu from Tencent and Mirco Ravanelli from Mila, among other speakers.


The keyword "self-supervised" appears in the titles of 27 accepted papers at NeurIPS 2020, across topics such as visual representation, image denoising, relationship probing, speech representation, and cross-modal audio-video clustering.

Dots represent papers arranged by a measure of similarity. Blue dots are papers with “self-supervised” in their titles.

Most SSL papers focus on text, audio, or visual representations: Facebook AI’s wav2vec 2.0 paper proposes a framework for SSL of speech representations, while DeepMind’s BYOL (Bootstrap Your Own Latent) paper introduces a new approach to self-supervised image representation learning.
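BYOL's distinctive trick is that it dispenses with the negative pairs used by contrastive methods: an online network learns to predict the representation produced by a slowly updated "target" network, whose weights are an exponential moving average of the online weights. That core update can be sketched in a few lines (plain Python floats here for illustration; actual implementations apply the same rule to network weight tensors, and the decay rate below is an arbitrary choice):

```python
def ema_update(target_params, online_params, tau=0.99):
    """BYOL-style target network update: target <- tau*target + (1-tau)*online.

    The target network changes slowly (tau close to 1), giving the online
    network a stable prediction objective without any negative samples.
    """
    return [tau * t + (1.0 - tau) * o for t, o in zip(target_params, online_params)]

# With tau=0.9, the target moves 10% of the way toward the online weights.
new_target = ema_update([1.0], [0.0], tau=0.9)
```

The slow-moving target is what prevents the trivial "collapsed" solution that would otherwise make such a self-predictive objective degenerate.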

Two papers shed light on SSL for 3D applications. Researchers from Google and Saarland University in Germany present an end-to-end SSL framework for fitting 3D human models to 3D scans of dressed humans, while researchers from the University of Potsdam in Germany propose 3D versions of five different self-supervised methods and demonstrate their effectiveness on three downstream tasks in the medical imaging domain.

Researchers from the Korea Advanced Institute of Science and Technology propose replacing LiDAR with self-supervised depth estimators, while New York University researchers explore SSL through the eyes of a child.

Over 18,000 people around the globe are participating in the NeurIPS 2020 virtual gathering. Organizers have increased the number of oral presentations and added live Q&A sessions with participation from the oral and spotlight presenters.

The online-only format for such conferences has steadily improved over the last year as bugs and bottlenecks have been worked out and new technologies introduced. To better accommodate attendees in different time zones and with varied Internet speed and access, NeurIPS 2020 organizers designed a schedule with two six-hour sessions per day: the first starts at 5 am PT (1 pm UTC) and the second at 5 pm PT (1 am UTC). Paper authors can choose either session to make their presentations. The organizers have also enabled users to choose their preferred bandwidth.

Instead of dedicating a single Zoom room for each poster, organizers opted to hold virtual poster sessions in a common space called Gather Town. This re-thinking emerged following positive feedback from ICLR attendees and the joint affinity groups poster session at ICML, the organizers explain in a blog post. Gather Town is a video-calling space that lets multiple people hold separate conversations in parallel and walk in and out of those conversations as easily as in real life.


It’s hoped this arrangement can help simulate the physical experience of conferences of yesteryear, enabling people to walk around as little video-game avatars, bumping into each other and dropping in and out of poster sessions and conversations.

NeurIPS 2020 is also hosting two Town Hall meetings today — one at 4 am PT and one at 4 pm PT (12 pm and 12 am UTC) — giving the community a great opportunity to provide feedback on the changes and discuss how this and future AI conferences might continue to favourably evolve their virtual environments.

Reporter: Yuan Yuan | Editor: Michael Sarazen


Synced Report | A Survey of China’s Artificial Intelligence Solutions in Response to the COVID-19 Pandemic — 87 Case Studies from 700+ AI Vendors

This report offers a look at how China has leveraged artificial intelligence technologies in the battle against COVID-19. It is also available on Amazon Kindle. Along with this report, we also introduced a database covering an additional 1428 artificial intelligence solutions across 12 pandemic scenarios.

Click here to find more reports from us.


We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.
