Minimal aggregated shared memory messaging on distributed memory supercomputers

Jamroz, B. F., Dennis, J. M. (2016). Minimal aggregated shared memory messaging on distributed memory supercomputers.

Title Minimal aggregated shared memory messaging on distributed memory supercomputers
Genre Conference Material
Author(s) Benjamin F. Jamroz, John M. Dennis
Abstract Many high-performance distributed memory applications rely on point-to-point messaging using the Message Passing Interface (MPI). Due to network latency and other costs, this communication can limit the scalability of an application when run on high node counts of distributed memory supercomputers. Communication costs increase further on modern multi- and many-core architectures when more than one MPI process runs per node, as each process sends and receives messages independently, inducing multiple latencies and contention for resources. In this paper, we use shared memory constructs available in the MPI 3.0 standard to implement an aggregated communication method that minimizes the number of inter-node messages and thereby reduces these costs. We compare the performance of this Minimal Aggregated SHared Memory (MASHM) messaging to the standard point-to-point implementation on large-scale supercomputers, where we see that MASHM leads to enhanced strong scalability of a weighted Jacobi relaxation. For this application, we also see that the use of shared memory parallelism through MASHM and MPI 3.0 can be more efficient than using Open Multi-Processing (OpenMP). We then present a model for the communication costs of MASHM which shows that this method achieves its goal of reducing latency costs while also reducing bandwidth costs. Finally, we present MASHM as an open source library to facilitate the integration of this efficient communication method into existing distributed memory applications.
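
The message-count reduction described in the abstract can be illustrated with a small sketch. It is not the MASHM library or its API; the function name, the rank-to-node layout, and the message list are all hypothetical. It only counts how many inter-node messages a given exchange pattern needs when every process sends independently, compared with sending one aggregated message per ordered pair of communicating nodes.

```python
def count_inter_node_messages(rank_to_node, messages):
    """Return (point_to_point, aggregated) inter-node message counts.

    rank_to_node: dict mapping an MPI rank to the node it runs on (hypothetical layout).
    messages: iterable of (source_rank, destination_rank) pairs.
    """
    point_to_point = 0
    node_pairs = set()
    for src, dst in messages:
        src_node = rank_to_node[src]
        dst_node = rank_to_node[dst]
        if src_node == dst_node:
            # Intra-node exchange: assumed to go through shared memory,
            # so it does not count as a network message.
            continue
        point_to_point += 1                  # one network message per rank pair
        node_pairs.add((src_node, dst_node)) # one aggregated message per node pair
    return point_to_point, len(node_pairs)


# Hypothetical layout: ranks 0-3 on node 0, ranks 4-7 on node 1.
rank_to_node = {rank: rank // 4 for rank in range(8)}

# Each rank on node 0 exchanges data with its partner on node 1 (both directions),
# plus one intra-node exchange between ranks 0 and 1.
messages = [(r, r + 4) for r in range(4)] + [(r + 4, r) for r in range(4)]
messages += [(0, 1), (1, 0)]

p2p, aggregated = count_inter_node_messages(rank_to_node, messages)
print(p2p, aggregated)
```

In this made-up layout, the eight independent inter-node messages collapse to two aggregated ones, which is the kind of reduction in per-message latency the abstract refers to. The sketch says nothing about bandwidth or about how the aggregation is implemented with MPI 3.0 shared memory windows.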
Publication Date May 23, 2016
OpenSky Citable URL https://n2t.org/ark:/85065/d7nc62wk
CISL Affiliations TDD, ASAP
