Published: 2025/8/22 20:05:38

DR-CircuitGNN has arrived! Turbocharging circuit design 🚀

Ultra-short summary: A GNN that makes circuit design blazing fast ✨ Uses GPUs to massively speed up HGNN training!

Gyaru-style sparkle points

● It's research on speeding up GNNs that represent circuit designs as graphs! So clever 💖

● In particular, they developed "DR-CircuitGNN", which uses GPUs to massively accelerate training of HGNNs (heterogeneous GNNs), a type of GNN that can represent circuits in rich detail! Amazing 😳

Continue reading in the 「らくらく論文」 app

DR-CircuitGNN: Training Acceleration of Heterogeneous Circuit Graph Neural Network on GPUs

Yuebo Luo / Shiyang Li / Junran Tao / Kiran Thorat / Xi Xie / Hongwu Peng / Nuo Xu / Caiwen Ding / Shaoyi Huang

The increasing scale and complexity of integrated circuit design have led to growing challenges in Electronic Design Automation (EDA). Graph Neural Networks (GNNs) have emerged as a promising approach to assist EDA design, as circuits can be naturally represented as graphs. While GNNs offer a foundation for circuit analysis, they often fail to capture the full complexity of EDA designs. Heterogeneous Graph Neural Networks (HGNNs) can better interpret EDA circuit graphs, as they capture both topological relationships and geometric features. However, this improved representation capability comes at the cost of even higher computational complexity and processing cost, due to their serial module-wise message-passing scheme, creating a significant performance bottleneck. In this paper, we propose DR-CircuitGNN, a fast GPU kernel design that leverages row-wise sparsity-aware Dynamic-ReLU and optimizes SpMM kernels during heterogeneous message passing to accelerate HGNN training on EDA-related circuit graph datasets. To further enhance performance, we propose a parallel optimization strategy that maximizes CPU-GPU concurrency by processing independent subgraphs concurrently, using multi-threaded CPU initialization and GPU kernel execution via multiple cudaStreams. Our experiments show that on three representative CircuitNet designs (small, medium, large), the proposed method achieves speedups of up to 3.51x and 4.09x over the state of the art (SOTA) for forward and backward propagation, respectively. On full-size CircuitNet and the sampled Mini-CircuitNet, our parallel design achieves up to a 2.71x speedup over the official cuSPARSE-backed DGL implementation, with negligible impact on correlation scores and error rates.
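The abstract bundles two mechanisms, so here is a minimal PyTorch sketch (an assumption-laden illustration, not the authors' kernels or API) of how they might fit together: a row-wise "Dynamic-ReLU" that zeroes entire feature rows so a sparsity-aware SpMM could skip them, and per-relation message passing over independent subgraphs enqueued on separate CUDA streams. The names `dyn_relu_rowwise`, `relation_spmm`, `hetero_layer`, and `rel_graphs` are hypothetical.

```python
# Minimal sketch of two ideas from the abstract, in plain PyTorch:
# (1) a row-sparsity-aware "Dynamic ReLU"-style gate that zeroes whole
#     feature rows, so a sparsity-aware SpMM kernel could skip them, and
# (2) overlapping independent relation-wise message passing with multiple
#     CUDA streams (cudaStreams in the abstract's terms).
# All function and variable names here are illustrative assumptions.

import torch

def dyn_relu_rowwise(h: torch.Tensor, threshold: float = 0.0) -> torch.Tensor:
    """Apply ReLU, then zero entire rows whose max activation <= threshold.

    Rows that become all-zero contribute nothing to a following SpMM;
    a custom kernel could skip them, while this dense version only
    emulates the numerical effect.
    """
    keep = (h.max(dim=1).values > threshold).unsqueeze(1)  # (N, 1) row mask
    return torch.relu(h) * keep

def relation_spmm(adj: torch.Tensor, h: torch.Tensor) -> torch.Tensor:
    """One relation's message passing as a sparse-dense matmul (SpMM)."""
    return torch.sparse.mm(adj, h)

def hetero_layer(rel_graphs: dict, feats: dict) -> dict:
    """Process independent relation subgraphs on separate CUDA streams.

    rel_graphs: relation name -> (sparse adjacency, source node type)
    feats:      node type -> dense feature matrix on the GPU
    """
    streams = {rel: torch.cuda.Stream() for rel in rel_graphs}
    out = {}
    for rel, (adj, src_type) in rel_graphs.items():
        # Enqueue each relation's work on its own stream; the CPU loop
        # returns immediately, so independent relations can overlap.
        # Real code would also use CUDA events to order these streams
        # against tensors produced on the default stream.
        with torch.cuda.stream(streams[rel]):
            h = dyn_relu_rowwise(feats[src_type])  # row-wise dynamic ReLU
            out[rel] = relation_spmm(adj, h)       # relation-wise SpMM
    torch.cuda.synchronize()  # join all streams before using the results
    return out
```

In the paper, the reported speedups come from custom GPU kernels that actually exploit the induced row sparsity inside SpMM during heterogeneous message passing; the dense PyTorch above only reproduces the dataflow and the stream-level concurrency, not the kernel-level savings.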

cs / cs.LG