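Title & ultra-short summary: Blazing-fast BTridiag on the GPU! Computations done in a flash ♡
I. Research overview
Research objective
We developed "BlockDSS", a new GPU-based method for solving block-tridiagonal linear systems (BTridiag systems) blazingly fast! The goal is to cut computation time across all kinds of fields 💪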
Block-tridiagonal systems are prevalent in state estimation and optimal control, and solving these systems is often the computational bottleneck. Improving the underlying solvers therefore has a direct impact on the real-time performance of estimators and controllers. We present a GPU-based implementation for the factorization and solution of block-tridiagonal symmetric positive definite (SPD) linear systems. Our method employs a recursive Schur-complement reduction, transforming the original system into a hierarchy of smaller, independent systems that can be solved in parallel using batched BLAS/LAPACK routines. Performance benchmarks with our cross-platform (NVIDIA and AMD) implementation, BlockDSS, show substantial speed-ups over state-of-the-art CPU direct solvers, including CHOLMOD and HSL MA57, while remaining competitive with NVIDIA cuDSS. At the same time, the current implementation still invokes batched routines sequentially at each recursion level, and high efficiency requires block sizes large enough to amortize kernel launch overhead.
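To make the recursive Schur-complement idea concrete, here is a minimal CPU sketch in NumPy. It is not the BlockDSS code: the function name `solve_block_tridiag` and the array layout (`D`, `E`, `b`) are assumptions chosen for illustration, and the batched `np.linalg.solve` calls stand in for the batched Cholesky-based BLAS/LAPACK routines a GPU implementation would use. At each level the even-numbered blocks are eliminated together (they are not coupled to each other), which leaves a block-tridiagonal system of roughly half the size on the odd-numbered blocks.

```python
import numpy as np


def solve_block_tridiag(D, E, b):
    """Solve A x = b for a symmetric block-tridiagonal SPD matrix A.

    Hypothetical layout (an assumption of this sketch):
      D: (N, n, n)   diagonal blocks, A[i, i] = D[i]
      E: (N-1, n, n) sub-diagonal blocks, A[i+1, i] = E[i], A[i, i+1] = E[i].T
      b: (N, n)      right-hand side, one row per block
    """
    N, n = b.shape
    if N == 1:
        return np.linalg.solve(D[0], b[0])[None, :]

    ev = np.arange(0, N, 2)      # blocks eliminated at this level
    od = np.arange(1, N, 2)      # blocks kept in the reduced system
    m_o = od.size
    lft = E[0::2]                # lft[k] = A[2k+1, 2k]
    rgt = E[1::2]                # rgt[k] = A[2k+2, 2k+1]
    nr = rgt.shape[0]
    lft_t = np.swapaxes(lft, -1, -2)
    rgt_t = np.swapaxes(rgt, -1, -2)

    def bmv(M, v):
        # Batched matrix-vector product: one (n, n) @ (n,) per batch entry.
        return np.einsum("kij,kj->ki", M, v)

    # Independent solves with the eliminated diagonal blocks, done as batches.
    y = np.linalg.solve(D[ev], b[ev][..., None])[..., 0]   # D_e^{-1} b_e
    P = np.linalg.solve(D[ev[:m_o]], lft_t)                # D_{2k}^{-1} A[2k, 2k+1]
    Q = np.linalg.solve(D[ev[1:]], rgt) if nr else np.zeros((0, n, n))

    # Schur complement on the kept blocks: again block tridiagonal.
    S = D[od] - lft @ P
    S[:nr] -= rgt_t @ Q
    S_off = -(lft[1:] @ Q[: m_o - 1])
    r = b[od] - bmv(lft, y[:m_o])
    r[:nr] -= bmv(rgt_t, y[1:])

    # Recurse on the half-size system.
    x_od = solve_block_tridiag(S, S_off, r)

    # Back-substitute for the eliminated blocks.
    x_ev = y.copy()
    x_ev[:m_o] -= bmv(P, x_od)
    x_ev[1:] -= bmv(Q, x_od[:nr])

    x = np.empty((N, n), dtype=y.dtype)
    x[ev] = x_ev
    x[od] = x_od
    return x
```

Calling `solve_block_tridiag(D, E, b)` returns an `(N, n)` array of block solutions. The recursion depth grows like log2(N), and each level issues a handful of batched calls one after another, which mirrors the sequential-per-level limitation the abstract mentions; with small `n`, per-call overhead would dominate in the same way kernel launch overhead does on a GPU.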