Sequence Model

by Synced 2023-05-19

Meta AI’s MegaByte Scalable Architecture for Long Sequence Modelling Outperforms Existing Byte-Level Models

In the new paper MegaByte: Predicting Million-Byte Sequences with Multiscale Transformers, a Meta AI research team presents MegaByte, a multiscale decoder architecture that enables million-byte sequence modelling.

by Synced 2021-10-29

AI, Computer Vision & Graphics, Machine Learning & Data Science, Research

DeepMind Study Resolves Delusions in Sequence Models for Interaction and Control

In the new paper Shaking the Foundations: Delusions in Sequence Models for Interaction and Control, a DeepMind research team explores the origins of mismatch problems in sequence models that lack understanding of the cause and effect of their actions, and addresses the problem by treating actions as causal interventions.

by Synced 2021-06-09

AI, Machine Learning & Data Science, Research

Pieter Abbeel Team’s Decision Transformer Abstracts RL as Sequence Modelling

A research team from UC Berkeley, Facebook AI Research and Google Brain abstracts Reinforcement Learning (RL) as a sequence modelling problem. Their proposed Decision Transformer simply outputs optimal actions by leveraging a causally masked transformer, yet matches or exceeds state-of-the-art model-free offline RL baselines on Atari, OpenAI Gym, and Key-to-Door tasks.