Efficient GMRES+AMG on GPUs: Composite smoothers and mixed V-cycles

Thomas, S., Baker, A. (2024). Efficient GMRES+AMG on GPUs: Composite smoothers and mixed V-cycles. SIAM Journal on Scientific Computing, https://doi.org/10.1137/23M1578632

Title Efficient GMRES+AMG on GPUs: Composite smoothers and mixed V-cycles
Genre Article
Author(s) S. Thomas, Allison Baker
Abstract In this study, we introduce algorithms optimized for GPU architectures, aimed at efficiently solving large sparse linear systems, a central challenge in Navier–Stokes pressure projection problems. Our approach includes an adaptation of the GMRES algorithm, drawing inspiration from the merged vector operations first proposed by Bielich et al. [Parallel Comput., 112 (2022), 102940]. This adaptation increases computational intensity on GPU platforms through optimized vector update strategies. The algorithm incorporates modified and classical Gram–Schmidt methods with an algebraic multigrid (AMG) preconditioner, each tailored for GPU performance. A key innovation in our work is the development of a Gram–Schmidt projector Pk employing a rank-1 perturbation of the identity matrix. Designed to maximize the high memory bandwidth utilization of the AMD MI-250X GPU, this approach includes a strategy for treating the unit diagonal that minimizes memory reads, leading to a 25% increase in computational efficiency. The application of perturbation theory further ensures that orthogonality loss is limited to O(ϵ)k, where k is the number of iterations. Additionally, we introduce a mixed AMG V-cycle strategy combining ILU(0) and ℓ1-Jacobi smoothers, which achieves a 30–50% reduction in GPU compute times compared to conventional methods, while maintaining low backward error. This strategy, alongside our novel treatment of the diagonal in triangular matrices, marks a substantial increase in AMG efficiency for GPU systems. We believe that these contributions represent a significant advance in optimizing GMRES+AMG algorithms for GPU computations. The empirical results demonstrate notable speedups and maintain rigorous backward error bounds, underscoring the potential of our methods to substantially increase computational efficiency in large-scale scientific applications.
Publication Title SIAM Journal on Scientific Computing
Publication Date Oct 1, 2024
Publisher's Version of Record https://doi.org/10.1137/23M1578632
CISL Affiliations TDD, ASAP
