Tag: Deep Learning

AI Machine Learning & Data Science Research

Deeper Is Not Necessarily Better: Princeton U & Intel’s 12-Layer Parallel Networks Achieve Performance Competitive With SOTA Deep Networks

In the new paper Non-deep Networks, a research team from Princeton University and Intel Labs argues it is possible to achieve high performance with “non-deep” neural networks, presenting ParNet (Parallel Networks), a novel 12-layer architecture that achieves performance competitive with its state-of-the-art deep counterparts.

AI Machine Learning & Data Science Research

100+ Stanford Researchers Publish 200+ Page Paper on the AI Paradigm Shift Introduced by Large-Scale Models

In a 200+ page paper, Percy Liang, Fei-Fei Li, and over 100 other researchers from the Stanford University Center for Research on Foundation Models (CRFM) systematically describe the opportunities and risks of large-scale pretrained “foundation” models. The unique study aims to provide a clearer understanding of how these models work, when and how they fail, and the various capabilities provided by their emergent properties.

AI Machine Learning & Data Science Research

Logic Explained Deep Neural Networks: A General Approach to Explainable AI

A research team from Università di Firenze, Università di Siena, University of Cambridge and Université Côte d’Azur proposes a general approach to explainable artificial intelligence (XAI) in neural architectures, designing interpretable deep learning models called Logic Explained Networks (LENs). The novel approach yields better performance than established white-box models while providing more compact and meaningful explanations.

AI Machine Learning & Data Science Research

DeepMind’s Perceiver IO: A General Architecture for a Wide Variety of Inputs & Outputs

A DeepMind research team proposes Perceiver IO, a single network that can easily integrate and transform arbitrary information for arbitrary tasks while scaling linearly with both input and output sizes. The general architecture achieves outstanding results on tasks with highly structured output spaces, such as natural language and visual understanding.

AI Machine Learning & Data Science Popular Research

ETH Zürich Identifies Priors That Boost Bayesian Deep Learning Models

A research team from ETH Zürich presents an overview of priors for (deep) Gaussian processes, variational autoencoders and Bayesian neural networks. The researchers argue that well-chosen priors can deliver desirable theoretical and empirical properties such as uncertainty estimation, model selection and optimal decision support, and they provide guidance on how to choose them.

AI Machine Learning & Data Science Popular Research

Bronstein, Bruna, Cohen and Veličković Leverage the Erlangen Programme to Establish the Geometric Foundations of Deep Learning

Twitter Chief Scientist Michael Bronstein, Joan Bruna from New York University, Taco Cohen from Qualcomm AI and Petar Veličković from DeepMind publish a paper that aims to geometrically unify typical architectures such as CNNs, GNNs, LSTMs and Transformers from the perspective of symmetry and invariance, building an “Erlangen Programme” for deep neural networks.

AI Machine Learning & Data Science Research

CMU, UT Austin & Facebook’s CNN Layer Width Optimization Strategies Achieve 320x Overhead Reduction

Researchers from Carnegie Mellon University, the University of Texas at Austin and Facebook AI propose a novel paradigm for optimizing the width of each CNN layer. The method is compatible with various width optimization algorithms and networks and achieves up to a 320x reduction in width optimization overhead without compromising top-1 accuracy on ImageNet.

AI Machine Learning & Data Science Research

TUM, Google, Nvidia & LMU München’s CodeTrans Pretrained Models Crack Source Code Tasks With SOTA Performance

A research team from Technical University of Munich, Google, Nvidia and LMU München proposes CodeTrans, an encoder-decoder transformer model that achieves state-of-the-art performance on six tasks in the software engineering domain, including Code Documentation Generation, Source Code Summarization and Code Comment Generation.