<!--- &2>/dev/null
    xdg-open $0; exit
--->
[comment]: # (Comment)

# [Boost].MPI3
*Alfredo A. Correa*
<alfredo.correa@gmail.com>

[Boost].MPI3 is a C++ library wrapper for standard MPI3.

[Boost].MPI3 is not an official Boost library.
However, Boost.MPI3 is designed following the principles of Boost and the STL.

[Boost].MPI3 is not a derivative of Boost.MPI, and it is unrelated to the (now deprecated) official MPI C++ interface.
It adds features that were missing in Boost.MPI (which only covers MPI-1), such as an iterator-based interface and the MPI-3 features (RMA and shared memory).
[Boost].MPI3 is written from scratch in C++14.

[Boost].MPI3 depends on and has been compiled against Boost 1.53 or later and one of the MPI implementations (OpenMPI 1.9+, MPICH 3.2.1+, or MVAPICH), using the compilers gcc 5.4.1+, clang 6.0+, and PGI 18.04.
The current version of the library (wrapper) is `0.71` (programmatically accessible from `./version.hpp`).

## Introduction

MPI is a large library for run-time parallelism in which several paradigms coexist.
It was originally designed as a standardized and portable message-passing system to work on a wide variety of parallel computing architectures.

The latest standard, MPI-3, uses a combination of techniques to achieve parallelism: Message Passing (MP), Remote Memory Access (RMA), and Shared Memory (SM).
Here we try to give a uniform interface and abstractions for these features by means of wrapper function calls and concepts familiar to C++ and the STL.

## Motivation: The problem with the standard interface

A typical C call for MP looks like this:

```c++
// assuming: int numbers[10]; allocated on both processes
int status_send = MPI_Send(&numbers, 10, MPI_INT, 1, 0, MPI_COMM_WORLD);
assert(status_send == MPI_SUCCESS);
... // concurrently with
int status_recv = MPI_Recv(&numbers, 10, MPI_INT, 0, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
assert(status_recv == MPI_SUCCESS);
```

In principle these calls can be made from a C++ program.
However, there are obvious drawbacks to using this standard interface.

Here we enumerate some of the problems:

* Function calls have many arguments (e.g. 6 or 7 arguments on average).
* Many mandatory arguments are redundant or could easily have a natural default value (e.g. message tags are not always necessary).
* Raw pointers and sizes are used (e.g. `&numbers` and `10`).
* Arguments are type-erased through `void*`.
* Only primitive types (e.g. `MPI_INT`) can be passed.
* Consistency between pointer types and data-types is the responsibility of the user.
* Only contiguous memory blocks can be used with this interface.
* Error codes are returned and have to be checked after each function call.
* Handles (such as `MPI_COMM_WORLD`) are used, and handles do not have well-defined semantics.
A call of this type would be an improvement:

```c++
world.send(numbers.begin(), numbers.end(), 1);
... // concurrently with
world.receive(numbers.begin(), numbers.end(), 0);
```

For other examples, see here: [http://mpitutorial.com/tutorials/mpi-send-and-receive/](http://mpitutorial.com/tutorials/mpi-send-and-receive/)

MPI used to ship with a C++-style interface.
It turns out that this interface was a very minimal change over the C version, and for good reasons it was dropped.

The Boost.MPI3 library was designed to be used simultaneously (interleaved) with the standard C interface of MPI.
In this way, changes to existing code can be made incrementally.
Mixing the standard C interface with Boost.MPI3 is not complicated, but it requires more knowledge of the library internals than is provided in this document.

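For illustration, here is a minimal sketch of such interleaving (it uses the `mpi3::main` convention introduced in the Initialization section below; the only assumption is that inside it the MPI runtime is already initialized, so plain C-MPI calls remain valid):

```c++
#include "mpi3/main.hpp"

#include<cassert>
#include<mpi.h> // the standard C interface, usable alongside the wrapper

namespace mpi3 = boost::mpi3;

int mpi3::main(int, char*[], mpi3::communicator world){
	int crank = -1;
	MPI_Comm_rank(MPI_COMM_WORLD, &crank); // plain C-MPI call
	assert( crank == world.rank() );       // both views agree on the rank, since world is a copy of MPI_COMM_WORLD
	return 0;
}
```
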
## Installation

The library is "header-only"; no separate compilation is necessary.
Most functions are inline or template functions.
Compiling against it requires an MPI distribution (e.g. OpenMPI or MPICH) and the corresponding compiler wrapper (`mpic++` or `mpicxx`).
Currently the library requires C++14 (usually activated with the compiler option `-std=c++14`) and Boost.
In particular, it depends on Boost.Serialization, and linking to this library may be required if the values passed are not basic types (`-lboost_serialization`).
A typical compilation/run command looks like this:

```bash
$ mpic++ -std=c++14 -O3 mpi3/test/communicator_send.cpp -o communicator_send.x -lboost_serialization
$ mpirun -n 8 ./communicator_send.x
```

On a system such as Red Hat, the dependencies can be installed with

```bash
$ dnf install gcc-c++ boost-devel openmpi-devel mpich-devel
```

The library is tested frequently against `openmpi` and `mpich`, and less frequently with `mvapich2`.

## Testing

The library has a basic `ctest`-based testing system.

```bash
cd mpi3/test
mkdir build; cd build
cmake .. && make && ctest
```

## Initialization

Like MPI, Boost.MPI3 requires some global library initialization.
The library includes the header `mpi3/main.hpp`, which wraps these initialization steps and *simulates* a main function.
In this way, a parallel program looks very much like a normal program, except that the main function takes a third argument with the default global communicator passed in.

```c++
#include "mpi3/version.hpp"
#include "mpi3/main.hpp"

#include<iostream>

namespace mpi3 = boost::mpi3;
using std::cout;

int mpi3::main(int argc, char* argv[], mpi3::communicator world){
	if(world.rank() == 0) cout << mpi3::version() << '\n';
	return 0;
}
```

Here `world` is a communicator object that wraps the MPI communicator handle.

Changing the `main` program to this syntax in existing code can be too intrusive.
For this reason, a more traditional initialization is also possible.
The alternative initialization is done by instantiating an `mpi3::environment` object, from which the global communicator is extracted via `.world()`.

```c++
#include "mpi3/environment.hpp"

namespace mpi3 = boost::mpi3;

int main(int argc, char** argv){
	mpi3::environment env(argc, argv); // initializes the MPI runtime for the scope of env
	auto world = env.world(); // communicator is extracted from the environment
	// ... code here
	return 0;
}
```

## Communicators

In the last example, `world` is a global communicator (not necessarily the same as `MPI_COMM_WORLD`, but a copy of it).
There is no global communicator variable `world` that can be accessed directly in a nested function.
The idea behind this is to avoid using global communicators in nested functions of the program unless they are explicitly passed in the function call.
Communicators are usually passed by reference to nested functions.
Even in traditional MPI it is a mistake to assume that `MPI_COMM_WORLD` is the only available communicator.

`mpi3::communicator` represents communicators with value semantics.
This means that an `mpi3::communicator` can be copied or passed by reference.
A communicator and its copies are different entities that compare equal.
Communicators can be empty, in a state that is analogous to `MPI_COMM_NULL` but with proper value semantics.

As in MPI, communicators can be duplicated (copied into a new instance) or split.
They can also be compared.

```c++
mpi3::communicator world2 = world;
assert( world2 == world );
mpi3::communicator hemisphere = world/2;
mpi3::communicator interleaved = world%2;
```

For example, this program splits the global communicator into two sub-communicators, one of size 2 (containing processes 0 and 1) and one of size 6 (containing processes 2 through 7):

```c++
#include "mpi3/main.hpp"
#include "mpi3/communicator.hpp"

namespace mpi3 = boost::mpi3;

int mpi3::main(int argc, char* argv[], mpi3::communicator world){
	assert(world.size() == 8); // this program can only be run with 8 processes
	mpi3::communicator comm = (world <= 1); // processes with rank <= 1 form the new communicator
	assert(!comm || (comm && comm.size() == 2));
	return 0;
}
```

Communicators also give index access to individual `mpi3::process`es, with ranks ranging from `0` to `comm.size()` (exclusive).
For example, `world[0]` refers to process 0 of the global communicator.
An `mpi3::process` is simply a rank inside a communicator.
This concept doesn't exist explicitly in the standard C interface, but it simplifies the syntax for message passing (see the value-based examples below).

Splitting communicators can also be done more traditionally, via the `communicator::split` member function, as sketched below.

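A minimal sketch (assuming `communicator::split` mirrors the `(color, key)` signature of `MPI_Comm_split`; the exact overloads may differ):

```c++
// processes with the same color end up in the same sub-communicator;
// key orders the ranks within each new communicator
mpi3::communicator half = world.split(world.rank() < world.size()/2, world.rank());
```

This reproduces the effect of the `world/2` shorthand shown earlier.
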
Communicators are used to pass messages and to create memory windows.
A special type of communicator is the shared communicator, `mpi3::shared_communicator`.

## Message Passing

This section describes the features related to the message passing (MP) functions of the MPI library.
In C-MPI, information is passed via pointers to memory.
This is expected in a C-based interface, and it is also very efficient.
In Boost.MPI, information is passed exclusively with value semantics.
Although there are optimizations that amortize the cost, we decided to generalize the pointer interface and leave value-based message passing for a higher-level syntax.

Here we replicate the design of the STL for processing information; that is, aggregated data is passed mainly via iterators. (A pointer is a type of iterator.)

For example, in the STL, data is copied between ranges in this way:
```c++
std::copy(origin.begin(), origin.end(), destination.begin());
```

The caller of `std::copy` doesn't need to worry about the type of the `origin` and `destination` containers; it can mix pointers and iterators, and the function doesn't need more information than what is passed.
The programmer is responsible for managing the memory and making sure the algorithm can access the data referred to by the passed iterators.

Contiguous iterators (to built-in types) are particularly efficient because they can be mapped to pointers at compile time. This in turn is translated into a primitive MPI function call.
Other types of iterators, or contiguous iterators to non-built-in types, are simulated, mainly via buffers and serialization.
The idea behind this is that generic message-passing function calls can be made to work with arbitrary data types.

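For instance, the same iterator-based calls can transmit a non-primitive value type; the transfer then goes through (Boost) serialization behind the scenes, and linking with `-lboost_serialization` is required. A sketch, assuming only the `send`/`receive` signatures shown below:

```c++
#include "mpi3/main.hpp"

#include<cassert>
#include<string>
#include<vector>

namespace mpi3 = boost::mpi3;

int mpi3::main(int, char*[], mpi3::communicator world){
	assert(world.size() == 2);
	std::vector<std::string> v = {"hello", "world"}; // non-primitive value type
	if(world.rank() == 0){
		world.send(v.begin(), v.end(), 1); // serialized, not a raw MPI datatype
	}else if(world.rank() == 1){
		std::vector<std::string> w(2);
		world.receive(w.begin(), w.end(), 0);
		assert( w == v );
	}
	return 0;
}
```
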
The main interface for message passing in Boost.MPI3 consists of member functions of the communicator,
for example `communicator::send`, `::receive` and `::barrier`.
The functions `::rank` and `::size` allow each process to determine its unique identity inside the communicator.

```c++
#include "mpi3/main.hpp"

#include<cassert>
#include<vector>

namespace mpi3 = boost::mpi3;

int mpi3::main(int argc, char* argv[], mpi3::communicator world){
	assert(world.size() == 2);
	if(world.rank() == 0){
		std::vector<double> v = {1.,2.,3.};
		world.send(v.begin(), v.end(), 1); // send to rank 1
	}else if(world.rank() == 1){
		std::vector<double> v(3);
		world.receive(v.begin(), v.end(), 0); // receive from rank 0
		assert( v == std::vector<double>{1.,2.,3.} );
	}
	world.barrier(); // synchronize execution here
	return 0;
}
```

Other important functions are `::gather`, `::broadcast` and `::accumulate`.
This syntax has a more or less direct (but simplified) mapping to the standard C-MPI interface.
In Boost.MPI3, however, all these functions have reasonable defaults that make the calls shorter and less error-prone than with the C-MPI interface.

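For example, a reduction over all processes can be written as follows (a sketch: the in-place `all_reduce_n` call imitates its usage in the shared-memory example later in this document):

```c++
#include "mpi3/main.hpp"

#include<cassert>

namespace mpi3 = boost::mpi3;

int mpi3::main(int, char*[], mpi3::communicator world){
	int r = world.rank();
	world.all_reduce_n(&r, 1, mpi3::max<>{}); // in-place reduction across the communicator
	assert( r == world.size() - 1 ); // every process now holds the maximum rank
	return 0;
}
```
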
For more examples, look into `./mpi3/test/`, `./mpi3/examples/` and `./mpi3/exercise/`.

The interface described above is iterator-based and is a direct generalization of the C interface, which works with pointers.
If the iterators are contiguous and the associated value types are primitive MPI types, the function is directly mapped to the C-MPI call.

Alternatively, a value-based interface can be used.
Here we show the terse syntax, using the process objects:

```c++
int mpi3::main(int argc, char* argv[], mpi3::communicator world){
	assert(world.size() == 2);
	if(world.rank() == 0){
		double v = 5.;
		world[1] << v; // send v to process 1
	}else if(world.rank() == 1){
		double v = -1.;
		world[0] >> v; // receive v from process 0
		assert(v == 5.);
	}
	return 0;
}
```

## Remote Memory Access

Remote memory access (RMA) is handled by `mpi3::window` objects.
`mpi3::window`s are created by an `mpi3::communicator` via collective (member) functions.
Since `mpi3::window`s represent memory, they cannot be copied (but they can be moved).

```c++
mpi3::window w = world.make_window(begin, end);
```

Just like in the MPI interface, local and remote access is synchronized by a `window::fence` call.
Remote read and write access is performed via `put` and `get` functions.

```c++
w.fence();
w.put(begin, end, rank);
w.fence();
```

This is a minimal example using the `put` and `get` functions:

```c++
#include "mpi3/main.hpp"

#include<cassert>
#include<vector>

namespace mpi3 = boost::mpi3;

int mpi3::main(int, char*[], mpi3::communicator world){

	std::vector<double> darr(world.rank()?0:100); // only process 0 exposes memory
	mpi3::window<double> w = world.make_window(darr.data(), darr.size());
	w.fence();
	if(world.rank() == 0){
		std::vector<double> a = {5., 6.};
		w.put(a.begin(), a.end(), 0); // put into process 0's window (here, locally)
	}
	world.barrier();
	w.fence();
	std::vector<double> b(2);
	w.get(b.begin(), b.end(), 0); // every process reads from process 0's window
	w.fence();
	assert( b[0] == 5. );
	world.barrier();

	return 0;
}
```

In this example, memory from process 0 is shared across the communicator and is accessible through a common window.
Process 0 writes (`window::put`s) values into the memory (this can be done locally or remotely).
Later, all processes read from this memory.
The `put` and `get` functions take at least 3 arguments (and at most 4).
The first two are a range of iterators, while the third is the destination/source process rank (called the "target rank").

For more examples and tests, look into `./mpi3/test/`, `./mpi3/examples/` and `./mpi3/exercise/`.

`mpi3::window`s may carry type information (as in `mpi3::window<double>`) or not (`mpi3::window<>`).

## Shared Memory

Shared memory (SM) uses the underlying capability of the operating system to share memory between processes within the same node.
Historically, shared memory has had an interface similar to that of remote access.
Only communicators that comprise a single node can be used to create a shared-memory window.
A special type of communicator can be created by splitting a given communicator:

`mpi3::shared_communicator node = world.split_shared();`

If the job is launched in a single node, `node` will be equal (congruent) to `world`.
Otherwise the global communicator will be split into a number of (shared) communicators equal to the number of nodes.

`mpi3::shared_communicator`s can create `mpi3::shared_window`s.
These are a special type of memory window.

```c++
#include "mpi3/main.hpp"

#include<cassert>
#include<iostream>

namespace mpi3 = boost::mpi3; using std::cout;

int mpi3::main(int argc, char* argv[], mpi3::communicator world){

	mpi3::shared_communicator node = world.split_shared();
	mpi3::shared_window<int> win = node.make_shared_window<int>(node.rank()==0?1:0);

	assert(win.base() != nullptr and win.size<int>() == 1);

	win.lock_all();
	if(node.rank()==0) *win.base<int>(0) = 42;
	for(int j=1; j != node.size(); ++j){ // empty messages act as a pairwise synchronization
		if(node.rank()==0) node.send_n((int*)nullptr, 0, j); // a message tag could be passed as a fourth argument
		else if(node.rank()==j) node.receive_n((int*)nullptr, 0, 0);
	}
	win.sync();

	int l = *win.base<int>(0);
	win.unlock_all();

	int minmax[2] = {-l, l};
	node.all_reduce_n(&minmax[0], 2, mpi3::max<>{});
	assert( -minmax[0] == minmax[1] ); // all processes see the same value

	cout << "proc " << node.rank() << " " << l << std::endl;

	return 0;
}
```

For more examples, look into `./mpi3/test/`, `./mpi3/examples/` and `./mpi3/exercise/`.

# Beyond MP, RMA and SHM

MPI provides a very low-level abstraction for inter-process communication.
Higher levels of abstraction can be constructed on top of MPI, and using the wrapper simplifies this work considerably.

## Mutex

Mutexes can be implemented fairly simply on top of RMA.
Mutexes are used similarly to the way they are used in threaded code;
they prevent certain blocks of code from being executed by more than one process (rank) at a time.

```c++
#include "mpi3/main.hpp"
#include "mpi3/mutex.hpp"

#include<iostream>

namespace mpi3 = boost::mpi3; using std::cout;

int mpi3::main(int argc, char* argv[], mpi3::communicator world){

	mpi3::mutex m(world);
	{
		m.lock();
		cout << "locked from " << world.rank() << '\n';
		cout << "never interleaved " << world.rank() << '\n';
		cout << "forever blocked " << world.rank() << '\n';
		cout << std::endl;
		m.unlock();
	}
	return 0;
}
```

(Recursive mutexes are not implemented yet.)

Mutexes themselves can be used to implement atomic operations on data, as sketched below.

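For example, a read-modify-write update on shared-window memory becomes effectively atomic when guarded by a mutex. A sketch, reusing the `shared_window` API from the Shared Memory example above (window-synchronization details are deliberately simplified here):

```c++
#include "mpi3/main.hpp"
#include "mpi3/mutex.hpp"

#include<cassert>

namespace mpi3 = boost::mpi3;

int mpi3::main(int, char*[], mpi3::communicator world){
	mpi3::shared_communicator node = world.split_shared();
	mpi3::shared_window<int> win = node.make_shared_window<int>(node.rank()==0?1:0);
	if(node.rank() == 0){ *win.base<int>(0) = 0; } // initialize the shared counter
	node.barrier();

	mpi3::mutex m(node);
	m.lock();
	*win.base<int>(0) += 1; // read-modify-write, one process at a time
	m.unlock();

	node.barrier();
	if(node.rank() == 0){ assert( *win.base<int>(0) == node.size() ); }
	return 0;
}
```
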
# Ongoing work

We are implementing memory allocators for remote memory, atomic classes, and asynchronous remote function calls.
Higher abstractions and usage patterns will be implemented, especially those that fit the patterns of the STL algorithms and containers.

# Conclusion

The goal is to provide a type-safe, efficient, generic interface for MPI.
We achieve this by leveraging the template code and classes that C++ provides.
Typical low-level use patterns become extremely simple, which in turn exposes higher-level patterns.