Minimal aggregated shared memory messaging on distributed memory supercomputers
Jamroz, B. F., Dennis, J. M. (2016). Minimal aggregated shared memory messaging on distributed memory supercomputers.
Title | Minimal aggregated shared memory messaging on distributed memory supercomputers |
---|---|
Genre | Conference Material |
Author(s) | Benjamin F. Jamroz, John M. Dennis |
Abstract | Many high-performance distributed memory applications rely on point-to-point messaging using the Message Passing Interface (MPI). Due to the latency of the network, and other costs, this communication can limit the scalability of an application when run on high node counts of distributed memory supercomputers. Communication costs are further increased on modern multi- and many-core architectures, when using more than one MPI process per node, as each process sends and receives messages independently, inducing multiple latencies and contention for resources. In this paper, we use shared memory constructs available in the MPI 3.0 standard to implement an aggregated communication method that minimizes the number of inter-node messages and thereby reduces these costs. We compare the performance of this Minimal Aggregated SHared Memory (MASHM) messaging to the standard point-to-point implementation on large-scale supercomputers, where we see that MASHM leads to enhanced strong scalability of a weighted Jacobi relaxation. For this application, we also see that the use of shared memory parallelism through MASHM and MPI 3.0 can be more efficient than using Open Multi-Processing (OpenMP). We then present a model for the communication costs of MASHM which shows that this method achieves its goal of reducing latency costs while also reducing bandwidth costs. Finally, we present MASHM as an open source library to facilitate the integration of this efficient communication method into existing distributed memory applications. |
Publication Date | May 23, 2016 |
OpenSky Citable URL | https://n2t.org/ark:/85065/d7nc62wk |
CISL Affiliations | TDD, ASAP |