1## DIY is a block-parallel library
2
3DIY is a block-parallel library for implementing scalable algorithms that can execute both
4in-core and out-of-core. The same program can be executed with one or more threads per MPI
5process, seamlessly combining distributed-memory message passing with shared-memory thread
6parallelism.  The abstraction enabling these capabilities is block parallelism; blocks
7and their message queues are mapped onto processing elements (MPI processes or threads) and are
8migrated between memory and storage by the DIY runtime. Complex communication patterns,
9including neighbor exchange, merge reduction, swap reduction, and all-to-all exchange, are
10possible in- and out-of-core in DIY.
11
12## Licensing
13
14DIY is released as open source software under a BSD-style [license](./LICENSE.txt).
15
16## Dependencies
17
18DIY requires an MPI installation. We recommend [MPICH](http://www.mpich.org/).
19
20## Download, build, install
21
22- You can clone this repository, or
23
24- You can download the [latest tarball](https://github.com/diatomic/diy2/archive/master.tar.gz).
25
26
27DIY is a header-only library. It does not need to be built; you can simply
28include it in your project. The examples can be built using `cmake` from the
29top-level directory.
30
31## Documentation
32
33[Doxygen pages](https://diatomic.github.io/diy)
34
35## Example
36
37A simple DIY program, shown below, consists of the following components:
38
39- `struct`s called blocks,
40- a diy object called the `master`,
41- a set of callback functions performed on each block by `master.foreach()`,
42- optionally, one or more message exchanges between the blocks by `master.exchange()`, and
43- there may be other collectives and global reductions not shown below.
44
45The callback functions (`enqueue_local()` and `average()` in the example below) receive the block
46pointer and a communication proxy for the message exchange between blocks. It is usual for the
47callback functions to enqueue or dequeue messages from the proxy, so that information can be
48received and sent during rounds of message exchange.
49
50```cpp
51    // --- main program --- //
52
53    struct Block { float local, average; };             // define your block structure
54
55    Master master(world);                               // world = MPI_Comm
56    ...                                                 // populate master with blocks
57    master.foreach(&enqueue_local);                     // call enqueue_local() for each block
58    master.exchange();                                  // exchange enqueued data between blocks
59    master.foreach(&average);                           // call average() for each block
60
61    // --- callback functions --- //
62
63    // enqueue block data prior to exchanging it
64    void enqueue_local(Block* b,                        // current block
65                       const Proxy& cp)                 // communication proxy provides access to the neighbor blocks
66    {
67        for (size_t i = 0; i < cp.link()->size(); i++)  // for all neighbor blocks
68            cp.enqueue(cp.link()->target(i), b->local); // enqueue the data to be sent to this neighbor
69                                                        // block in the next exchange
70    }
71
72    // use the received data after exchanging it, in this case compute its average
73    void average(Block* b,                              // current block
74                 const Proxy& cp)                       // communication proxy provides access to the neighbor blocks
75    {
76        float x, average = 0;
77        for (size_t i = 0; i < cp.link()->size(); i++)  // for all neighbor blocks
78        {
79            cp.dequeue(cp.link()->target(i).gid, x);    // dequeue the data received from this
80                                                        // neighbor block in the last exchange
81            average += x;
82        }
83        b->average = average / cp.link()->size();
84    }
85```
86