Logic synthesis (LS) is an approach for finding functionally equivalent but optimized representations of large-scale integrated circuits, generally via pre-mapping optimizations, technology mapping, and post-mapping optimizations. LS is a sequential decision-making problem that can be formulated as a Markov decision process (MDP) and tackled by reinforcement learning (RL) algorithms.
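The MDP framing can be sketched with a toy environment. Everything below is an illustrative stand-in, not the paper's setup: the "circuit" is reduced to an (area, delay) pair, the operator names follow typical ABC-style synthesis passes, and the per-operator multiplicative effects are invented. The point is only the shape of the problem: states describe the current circuit, actions are synthesis operators, and the reward is the quality-of-result improvement.

```python
from dataclasses import dataclass, field

# Typical ABC-style synthesis operators (action space of the MDP).
OPERATORS = ["rewrite", "refactor", "resub", "balance"]

@dataclass
class LogicSynthesisEnv:
    # Toy stand-in for a real AIG: we track only (area, delay).
    area: float = 1000.0
    delay: float = 50.0
    # Invented per-operator (area multiplier, delay multiplier) effects;
    # real effects depend on the circuit being optimized.
    effects: dict = field(default_factory=lambda: {
        "rewrite":  (0.97, 1.00),
        "refactor": (0.98, 0.99),
        "resub":    (0.96, 1.00),
        "balance":  (1.00, 0.95),
    })

    def state(self):
        return (self.area, self.delay)

    def step(self, op):
        a_mul, d_mul = self.effects[op]
        old_cost = self.area + self.delay
        self.area *= a_mul
        self.delay *= d_mul
        # Reward = quality-of-result improvement from applying the operator.
        reward = old_cost - (self.area + self.delay)
        return self.state(), reward

env = LogicSynthesisEnv()
total = 0.0
for op in ["balance", "rewrite", "resub"]:
    _, r = env.step(op)
    total += r
```

An RL policy would choose each operator from the current state rather than from a fixed list, which is exactly the design choice the paper goes on to question.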
In the new paper Rethinking Reinforcement Learning based Logic Synthesis, a research team from Huawei Noah’s Ark Lab examines current RL-based LS methods and finds that the learned policies of these RL algorithms are state-agnostic and yield operator sequences that are largely permutation invariant. The team proposes a novel RL-based method that can automatically recognize critical operators and produce common operator sequences that are generalizable to unseen circuits.
The primary objective of RL-based LS is to learn a control policy that determines which operator to apply in each state — where the states are feature vectors of the current And-Inverter Graph (AIG) circuit representation. After extensive experiments on the RL-based logic synthesis approach, the team found that: 1) decisions made by the RL policy do not depend on circuit features, and 2) the permutation of the operators in a sequence has little effect on final performance. The researchers thus conclude that extracting circuit features is unnecessary; and that although LS has an exponentially growing search space, the loss surface remains flat, since permuting the operators within a sequence has a negligible performance impact.
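The second finding can be made concrete with a toy cost model of our own (not the paper's): if each operator's effect is approximately a multiplicative factor on circuit cost, then every ordering of the same operator multiset lands on the same final cost, so the loss surface over orderings is flat. The factors below are invented for illustration.

```python
import itertools

# Invented multiplicative cost factors for three synthesis operators.
factors = {"rewrite": 0.97, "refactor": 0.98, "balance": 0.95}

def final_cost(seq, cost=1000.0):
    # Apply each operator's factor in order; multiplication commutes,
    # so the order of application cannot change the result.
    for op in seq:
        cost *= factors[op]
    return cost

# All 3! orderings of the same operators collapse to one cost value.
costs = [final_cost(p) for p in itertools.permutations(factors)]
spread = max(costs) - min(costs)  # ~0, up to float rounding
```

Real operator effects are not exactly multiplicative or order-free, which is why the paper hedges with "somewhat permutation invariant" — but the flat-surface intuition is the same.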
Based on these insights, the team designs a novel method to automatically recognize critical operators and generate a common sequence for different circuits. The approach comprises two steps: It first learns a shared policy for a number of circuits, then searches for a best-performing common sequence based on the learned policy. The common sequence can then be used to optimize unseen circuits directly without extra online learning or adjustments, thus saving time.
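A minimal sketch of that two-step recipe, under heavy simplifying assumptions of our own (the paper trains an RL policy on real AIG circuits; here the shared "policy" is a bandit-style value table over (position, operator) pairs, and circuits are reduced to invented multiplicative cost effects):

```python
import random

random.seed(0)

OPS = ["rewrite", "refactor", "resub", "balance"]
SEQ_LEN = 3

# Invented per-circuit multiplicative cost effects standing in for real QoR.
circuits = [
    {"rewrite": 0.96, "refactor": 0.99, "resub": 0.97, "balance": 0.95},
    {"rewrite": 0.97, "refactor": 0.98, "resub": 0.96, "balance": 0.94},
]

def reward(effects, seq):
    cost = 1.0
    for op in seq:
        cost *= effects[op]
    return 1.0 - cost  # cost reduction; higher is better

# Step 1: learn one state-agnostic preference per (position, operator),
# shared across all training circuits, from random rollouts.
value = {(t, op): 0.0 for t in range(SEQ_LEN) for op in OPS}
count = {(t, op): 0 for t in range(SEQ_LEN) for op in OPS}
for _ in range(2000):
    seq = [random.choice(OPS) for _ in range(SEQ_LEN)]
    r = sum(reward(c, seq) for c in circuits) / len(circuits)
    for t, op in enumerate(seq):
        count[(t, op)] += 1
        value[(t, op)] += (r - value[(t, op)]) / count[(t, op)]

# Step 2: decode a single best common sequence from the learned preferences.
common_seq = [max(OPS, key=lambda op: value[(t, op)]) for t in range(SEQ_LEN)]

# The common sequence is applied to an unseen circuit with no extra learning.
unseen = {"rewrite": 0.95, "refactor": 0.99, "resub": 0.97, "balance": 0.96}
gain = reward(unseen, common_seq)
```

Because the decoded sequence is fixed, deploying it on a new circuit costs nothing beyond running the operators themselves — the runtime saving the paper highlights.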
To validate the effectiveness of their approach, the team evaluated its runtime efficiency and generalization ability to unseen circuits on the EPFL benchmark.
The results show that the proposed approach is able to find a common sequence, achieves good performance at delay reduction, and significantly reduces runtime. Its superior trade-offs between delay, area, and runtime also give it a practical advantage for industrial applications.
The paper Rethinking Reinforcement Learning Based Logic Synthesis is on arXiv.
Author: Hecate He | Editor: Michael Sarazen