AI Machine Learning & Data Science Research

Gemini: Bridging Tomorrow’s Deep Neural Network Frontiers with Unrivaled Chiplet Accelerator Mastery

A research team introduces Gemini, a framework for joint architecture and mapping co-exploration that aims to propel large-scale DNN chiplet accelerators to new heights, achieving an average performance improvement of 1.98× and an energy-efficiency gain of 1.41× over the state-of-the-art Simba architecture.

As they tackle progressively intricate challenges, Deep Neural Networks (DNNs) have expanded rapidly in size and complexity, driving up demands on computing power and storage. In response, chiplet technology emerges as a compelling alternative to monolithic chips, presenting opportunities to enhance performance, reduce power consumption, and increase design flexibility in deploying DNNs.

Despite its promise, chiplet technology introduces challenges such as elevated packaging costs and costly Die-to-Die (D2D) interfaces. Consequently, maximizing the advantages and mitigating the drawbacks of chiplet technology becomes imperative for the development of large-scale DNN chiplet accelerators.

In the new paper Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators, a research team from Tsinghua University, Xi'an Jiaotong University, IIISCT and Shanghai AI Laboratory proposes Gemini, a framework for joint architecture and mapping co-exploration that propels large-scale DNN chiplet accelerators to new heights, achieving an average performance improvement of 1.98× and an energy-efficiency boost of 1.41× over the state-of-the-art Simba architecture.

The researchers identify two primary challenges inherent in chiplet technology. On the architectural front, the pivotal challenge lies in determining the optimal chiplet granularity, necessitating a delicate balance between employing numerous smaller chiplets for improved yield and opting for fewer larger chiplets to curtail costs. In the realm of DNN mapping, the challenges stem from the expansive scale enabled by chiplet technology and the associated costly D2D links.
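The granularity trade-off described above can be made concrete with a back-of-the-envelope cost model. The sketch below uses the standard negative-binomial die-yield model; the defect density, clustering parameter, wafer cost, and per-chiplet D2D area overhead are all illustrative assumptions, not figures from the Gemini paper.

```python
# Hedged sketch of the chiplet-granularity trade-off: smaller dies yield
# better, but each extra chiplet adds D2D interface area and overhead.
# All numeric parameters below are illustrative assumptions.

def die_yield(area_mm2, d0=0.001, alpha=3.0):
    # Negative-binomial yield model: Y = (1 + A * D0 / alpha) ** (-alpha),
    # where D0 is defect density and alpha is the clustering parameter.
    return (1 + area_mm2 * d0 / alpha) ** (-alpha)

def silicon_cost(total_area_mm2, num_chiplets, wafer_cost_per_mm2=0.1,
                 d2d_overhead_mm2=2.0):
    # Splitting a design into more chiplets improves per-die yield
    # (smaller dies) but adds D2D interface area to every chiplet.
    area = total_area_mm2 / num_chiplets + d2d_overhead_mm2
    cost_per_good_die = area * wafer_cost_per_mm2 / die_yield(area)
    return num_chiplets * cost_per_good_die

# Sweeping the chiplet count for a hypothetical 800 mm^2 design exposes
# the sweet spot between yield gains and per-chiplet overhead.
costs = {n: silicon_cost(800, n) for n in (1, 2, 4, 8, 16, 32)}
```

Under these toy parameters the total silicon cost first falls as the design is partitioned more finely, then rises again once D2D overhead dominates, which is exactly the balance the architectural exploration must strike.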

To tackle these challenges, the team introduces a layer-centric encoding method for representing layer-pipelined spatial mapping (LP SPM) schemes in many-core chiplet DNN inference accelerators. This encoding delineates the optimization space for LP mapping, revealing significant opportunities for improvement. Leveraging this encoding and a highly configurable hardware template, Gemini is formulated as a mapping and architecture co-exploration framework for large-scale DNN chiplet accelerators, featuring two key components: the Mapping Engine and the Monetary Cost Evaluator.

The Mapping Engine employs a Simulated Annealing (SA) algorithm with five specifically-designed operators to navigate the extensive space defined by the encoding method, automatically minimizing costly D2D communication. Simultaneously, the Monetary Cost Evaluator assesses the monetary cost of accelerators with varying architectural parameters.
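The search loop at the heart of the Mapping Engine can be sketched with a generic simulated-annealing routine. The encoding (a flat layer-to-chiplet assignment), the single mutation operator, and the D2D cost proxy below are placeholders standing in for Gemini's actual layer-centric encoding, five operators, and evaluator:

```python
import math
import random

def d2d_cost(mapping):
    # Toy cost proxy (assumption): each pair of consecutive layers mapped to
    # different chiplets incurs one unit of D2D communication.
    return sum(1 for a, b in zip(mapping, mapping[1:]) if a != b)

def mutate(mapping, num_chiplets):
    # Stand-in for one of Gemini's five operators: reassign a random layer.
    m = list(mapping)
    m[random.randrange(len(m))] = random.randrange(num_chiplets)
    return m

def anneal(num_layers=8, num_chiplets=4, t0=10.0, cooling=0.95, steps=2000):
    random.seed(0)  # deterministic for illustration
    cur = [random.randrange(num_chiplets) for _ in range(num_layers)]
    cur_cost, temp = d2d_cost(cur), t0
    best, best_cost = cur, cur_cost
    for _ in range(steps):
        cand = mutate(cur, num_chiplets)
        cand_cost = d2d_cost(cand)
        delta = cand_cost - cur_cost
        # Always accept improvements; accept regressions with
        # probability exp(-delta / T) so the search can escape local minima.
        if delta <= 0 or random.random() < math.exp(-delta / temp):
            cur, cur_cost = cand, cand_cost
            if cur_cost < best_cost:
                best, best_cost = cur, cur_cost
        temp *= cooling
    return best, best_cost
```

Gemini's real engine searches a far richer space (spatial partitioning, pipelining, and scheduling per layer) and balances energy and performance alongside D2D traffic, but the accept/reject structure is the same.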

In their empirical study, the team compares Gemini’s co-optimized architecture and mapping with the Simba architecture using Tangram SPM. The results are remarkable, with Gemini achieving an average performance improvement of 1.98× and a 1.41× energy efficiency enhancement across various DNNs and batch sizes, accompanied by a mere 14.3% increase in monetary cost.

The significance of this work is underscored by being the first to systematically define the optimization space of LP SPM for DNN inference accelerators. Gemini stands out as the pioneering framework to jointly explore the optimization space of mapping and architecture for large-scale DNN chiplet accelerators, taking into account energy consumption, performance, and monetary cost.

The team concludes by emphasizing that, with proper design considerations facilitated by Gemini, the concept of employing a single chiplet for multiple accelerators can be effectively applied to DNN inference accelerators, opening new avenues for innovation and efficiency in this rapidly evolving field.

The paper Gemini: Mapping and Architecture Co-exploration for Large-scale DNN Chiplet Accelerators is available on arXiv.


Author: Hecate He | Editor: Chain Zhang


We know you don’t want to miss any news or research breakthroughs. Subscribe to our popular newsletter Synced Global AI Weekly to get weekly AI updates.
