|  | Home | Libraries | People | FAQ | More | 
      A Boost.MPI program consists of many cooperating processes (possibly running
      on different computers) that communicate among themselves by passing messages.
      Boost.MPI is a library (as is the lower-level MPI), not a language, so the
      first step in a Boost.MPI is to create an mpi::environment
      object that initializes the MPI environment and enables communication among
      the processes. The mpi::environment
      object is initialized with the program arguments (which it may modify) in your
      main program. The creation of this object initializes MPI, and its destruction
      will finalize MPI. In the vast majority of Boost.MPI programs, an instance
      of mpi::environment will
      be declared in main at the
      very beginning of the program.
    
      Communication with MPI always occurs over a communicator,
      which can be created be simply default-constructing an object of type mpi::communicator. This communicator
      can then be queried to determine how many processes are running (the "size"
      of the communicator) and to give a unique number to each process, from zero
      to the size of the communicator (i.e., the "rank" of the process):
    
#include <boost/mpi/environment.hpp> #include <boost/mpi/communicator.hpp> #include <iostream> namespace mpi = boost::mpi; int main() { mpi::environment env; mpi::communicator world; std::cout << "I am process " << world.rank() << " of " << world.size() << "." << std::endl; return 0; }
If you run this program with 7 processes, for instance, you will receive output such as:
I am process 5 of 7. I am process 0 of 7. I am process 1 of 7. I am process 6 of 7. I am process 2 of 7. I am process 4 of 7. I am process 3 of 7.
Of course, the processes can execute in a different order each time, so the ranks might not be strictly increasing. More interestingly, the text could come out completely garbled, because one process can start writing "I am a process" before another process has finished writing "of 7.".
If you should still have an MPI library supporting only MPI 1.1 you will need to pass the command line arguments to the environment constructor as shown in this example:
#include <boost/mpi/environment.hpp> #include <boost/mpi/communicator.hpp> #include <iostream> namespace mpi = boost::mpi; int main(int argc, char* argv[]) { mpi::environment env(argc, argv); mpi::communicator world; std::cout << "I am process " << world.rank() << " of " << world.size() << "." << std::endl; return 0; }
As a message passing library, MPI's primary purpose is to routine messages from one process to another, i.e., point-to-point. MPI contains routines that can send messages, receive messages, and query whether messages are available. Each message has a source process, a target process, a tag, and a payload containing arbitrary data. The source and target processes are the ranks of the sender and receiver of the message, respectively. Tags are integers that allow the receiver to distinguish between different messages coming from the same sender.
        The following program uses two MPI processes to write "Hello, world!"
        to the screen (hello_world.cpp):
      
#include <boost/mpi.hpp> #include <iostream> #include <string> #include <boost/serialization/string.hpp> namespace mpi = boost::mpi; int main() { mpi::environment env; mpi::communicator world; if (world.rank() == 0) { world.send(1, 0, std::string("Hello")); std::string msg; world.recv(1, 1, msg); std::cout << msg << "!" << std::endl; } else { std::string msg; world.recv(0, 0, msg); std::cout << msg << ", "; std::cout.flush(); world.send(0, 1, std::string("world")); } return 0; }
        The first processor (rank 0) passes the message "Hello" to the
        second processor (rank 1) using tag 0. The second processor prints the string
        it receives, along with a comma, then passes the message "world"
        back to processor 0 with a different tag. The first processor then writes
        this message with the "!" and exits. All sends are accomplished
        with the communicator::send
        method and all receives use a corresponding communicator::recv
        call.
      
          The default MPI communication operations--send
          and recv--may have to wait
          until the entire transmission is completed before they can return. Sometimes
          this blocking behavior has a negative
          impact on performance, because the sender could be performing useful computation
          while it is waiting for the transmission to occur. More important, however,
          are the cases where several communication operations must occur simultaneously,
          e.g., a process will both send and receive at the same time.
        
Let's revisit our "Hello, world!" program from the previous section. The core of this program transmits two messages:
if (world.rank() == 0) { world.send(1, 0, std::string("Hello")); std::string msg; world.recv(1, 1, msg); std::cout << msg << "!" << std::endl; } else { std::string msg; world.recv(0, 0, msg); std::cout << msg << ", "; std::cout.flush(); world.send(0, 1, std::string("world")); }
          The first process passes a message to the second process, then prepares
          to receive a message. The second process does the send and receive in the
          opposite order. However, this sequence of events is just that--a sequence--meaning that there is essentially no parallelism.
          We can use non-blocking communication to ensure that the two messages are
          transmitted simultaneously (hello_world_nonblocking.cpp):
        
#include <boost/mpi.hpp> #include <iostream> #include <string> #include <boost/serialization/string.hpp> namespace mpi = boost::mpi; int main() { mpi::environment env; mpi::communicator world; if (world.rank() == 0) { mpi::request reqs[2]; std::string msg, out_msg = "Hello"; reqs[0] = world.isend(1, 0, out_msg); reqs[1] = world.irecv(1, 1, msg); mpi::wait_all(reqs, reqs + 2); std::cout << msg << "!" << std::endl; } else { mpi::request reqs[2]; std::string msg, out_msg = "world"; reqs[0] = world.isend(0, 1, out_msg); reqs[1] = world.irecv(0, 0, msg); mpi::wait_all(reqs, reqs + 2); std::cout << msg << ", "; } return 0; }
          We have replaced calls to the communicator::send
          and communicator::recv
          members with similar calls to their non-blocking counterparts, communicator::isend
          and communicator::irecv.
          The prefix i indicates that the operations
          return immediately with a mpi::request
          object, which allows one to query the status of a communication request
          (see the test
          method) or wait until it has completed (see the wait
          method). Multiple requests can be completed at the same time with the
          wait_all operation.
        
If you run this program multiple times, you may see some strange results: namely, some runs will produce:
Hello, world!
while others will produce:
world! Hello,
or even some garbled version of the letters in "Hello" and "world". This indicates that there is some parallelism in the program, because after both messages are (simultaneously) transmitted, both processes will concurrent execute their print statements. For both performance and correctness, non-blocking communication operations are critical to many parallel applications using MPI.
          The inclusion of boost/serialization/string.hpp
          in the previous examples is very important: it makes values of type std::string serializable, so that they can
          be be transmitted using Boost.MPI. In general, built-in C++ types (ints, floats,
          characters, etc.) can be transmitted over MPI directly, while user-defined
          and library-defined types will need to first be serialized (packed) into
          a format that is amenable to transmission. Boost.MPI relies on the Boost.Serialization
          library to serialize and deserialize data types.
        
          For types defined by the standard library (such as std::string
          or std::vector) and some types in Boost (such
          as boost::variant), the Boost.Serialization
          library already contains all of the required serialization code. In these
          cases, you need only include the appropriate header from the boost/serialization directory.
        
          For types that do not already have a serialization header, you will first
          need to implement serialization code before the types can be transmitted
          using Boost.MPI. Consider a simple class gps_position
          that contains members degrees,
          minutes, and seconds. This class is made serializable
          by making it a friend of boost::serialization::access
          and introducing the templated serialize() function, as follows:
        
class gps_position { private: friend class boost::serialization::access; template<class Archive> void serialize(Archive & ar, const unsigned int version) { ar & degrees; ar & minutes; ar & seconds; } int degrees; int minutes; float seconds; public: gps_position(){}; gps_position(int d, int m, float s) : degrees(d), minutes(m), seconds(s) {} };
Complete information about making types serializable is beyond the scope of this tutorial. For more information, please see the Boost.Serialization library tutorial from which the above example was extracted. One important side benefit of making types serializable for Boost.MPI is that they become serializable for any other usage, such as storing the objects to disk to manipulated them in XML.
          Some serializable types, like gps_position
          above, have a fixed amount of data stored at fixed field positions. When
          this is the case, Boost.MPI can optimize their serialization and transmission
          to avoid extraneous copy operations. To enable this optimization, users
          should specialize the type trait is_mpi_datatype, e.g.:
        
namespace boost { namespace mpi { template <> struct is_mpi_datatype<gps_position> : mpl::true_ { }; } }
For non-template types we have defined a macro to simplify declaring a type as an MPI datatype
BOOST_IS_MPI_DATATYPE(gps_position)
          For composite traits, the specialization of is_mpi_datatype may depend
          on is_mpi_datatype itself.
          For instance, a boost::array object is fixed only when the type
          of the parameter it stores is fixed:
        
namespace boost { namespace mpi { template <typename T, std::size_t N> struct is_mpi_datatype<array<T, N> > : public is_mpi_datatype<T> { }; } }
The redundant copy elimination optimization can only be applied when the shape of the data type is completely fixed. Variable-length types (e.g., strings, linked lists) and types that store pointers cannot use the optimiation, but Boost.MPI will be unable to detect this error at compile time. Attempting to perform this optimization when it is not correct will likely result in segmentation faults and other strange program behavior.
          Boost.MPI can transmit any user-defined data type from one process to another.
          Built-in types can be transmitted without any extra effort; library-defined
          types require the inclusion of a serialization header; and user-defined
          types will require the addition of serialization code. Fixed data types
          can be optimized for transmission using the is_mpi_datatype type trait.
        
Point-to-point operations are the core message passing primitives in Boost.MPI. However, many message-passing applications also require higher-level communication algorithms that combine or summarize the data stored on many different processes. These algorithms support many common tasks such as "broadcast this value to all processes", "compute the sum of the values on all processors" or "find the global minimum."
          The broadcast
          algorithm is by far the simplest collective operation. It broadcasts a
          value from a single process to all other processes within a communicator. For instance,
          the following program broadcasts "Hello, World!" from process
          0 to every other process. (hello_world_broadcast.cpp)
        
#include <boost/mpi.hpp> #include <iostream> #include <string> #include <boost/serialization/string.hpp> namespace mpi = boost::mpi; int main() { mpi::environment env; mpi::communicator world; std::string value; if (world.rank() == 0) { value = "Hello, World!"; } broadcast(world, value, 0); std::cout << "Process #" << world.rank() << " says " << value << std::endl; return 0; }
Running this program with seven processes will produce a result such as:
Process #0 says Hello, World! Process #2 says Hello, World! Process #1 says Hello, World! Process #4 says Hello, World! Process #3 says Hello, World! Process #5 says Hello, World! Process #6 says Hello, World!
          The gather
          collective gathers the values produced by every process in a communicator
          into a vector of values on the "root" process (specified by an
          argument to gather). The
          /i/th element in the vector will correspond to the value gathered fro mthe
          /i/th process. For instance, in the following program each process computes
          its own random number. All of these random numbers are gathered at process
          0 (the "root" in this case), which prints out the values that
          correspond to each processor. (random_gather.cpp)
        
#include <boost/mpi.hpp> #include <iostream> #include <vector> #include <cstdlib> namespace mpi = boost::mpi; int main() { mpi::environment env; mpi::communicator world; std::srand(time(0) + world.rank()); int my_number = std::rand(); if (world.rank() == 0) { std::vector<int> all_numbers; gather(world, my_number, all_numbers, 0); for (int proc = 0; proc < world.size(); ++proc) std::cout << "Process #" << proc << " thought of " << all_numbers[proc] << std::endl; } else { gather(world, my_number, 0); } return 0; }
          Executing this program with seven processes will result in output such
          as the following. Although the random values will change from one run to
          the next, the order of the processes in the output will remain the same
          because only process 0 writes to std::cout.
        
Process #0 thought of 332199874 Process #1 thought of 20145617 Process #2 thought of 1862420122 Process #3 thought of 480422940 Process #4 thought of 1253380219 Process #5 thought of 949458815 Process #6 thought of 650073868
          The gather operation collects
          values from every process into a vector at one process. If instead the
          values from every process need to be collected into identical vectors on
          every process, use the all_gather algorithm,
          which is semantically equivalent to calling gather
          followed by a broadcast
          of the resulting vector.
        
          The reduce
          collective summarizes the values from each process into a single value
          at the user-specified "root" process. The Boost.MPI reduce operation is similar in spirit
          to the STL accumulate operation, because
          it takes a sequence of values (one per process) and combines them via a
          function object. For instance, we can randomly generate values in each
          process and the compute the minimum value over all processes via a call
          to reduce
          (random_min.cpp):
        
#include <boost/mpi.hpp> #include <iostream> #include <cstdlib> namespace mpi = boost::mpi; int main() { mpi::environment env; mpi::communicator world; std::srand(time(0) + world.rank()); int my_number = std::rand(); if (world.rank() == 0) { int minimum; reduce(world, my_number, minimum, mpi::minimum<int>(), 0); std::cout << "The minimum value is " << minimum << std::endl; } else { reduce(world, my_number, mpi::minimum<int>(), 0); } return 0; }
          The use of mpi::minimum<int>
          indicates that the minimum value should be computed. mpi::minimum<int> is a binary function object that compares
          its two parameters via <
          and returns the smaller value. Any associative binary function or function
          object will work. For instance, to concatenate strings with reduce one could use the function object
          std::plus<std::string>
          (string_cat.cpp):
        
#include <boost/mpi.hpp> #include <iostream> #include <string> #include <functional> #include <boost/serialization/string.hpp> namespace mpi = boost::mpi; int main() { mpi::environment env; mpi::communicator world; std::string names[10] = { "zero ", "one ", "two ", "three ", "four ", "five ", "six ", "seven ", "eight ", "nine " }; std::string result; reduce(world, world.rank() < 10? names[world.rank()] : std::string("many "), result, std::plus<std::string>(), 0); if (world.rank() == 0) std::cout << "The result is " << result << std::endl; return 0; }
In this example, we compute a string for each process and then perform a reduction that concatenates all of the strings together into one, long string. Executing this program with seven processors yields the following output:
The result is zero one two three four five six
          Any kind of binary function objects can be used with reduce.
          For instance, and there are many such function objects in the C++ standard
          <functional> header and the Boost.MPI header <boost/mpi/operations.hpp>. Or, you can create your own function
          object. Function objects used with reduce
          must be associative, i.e. f(x,
          f(y, z)) must be equivalent to f(f(x, y), z). If they are also commutative (i..e,
          f(x, y) == f(y,
          x)),
          Boost.MPI can use a more efficient implementation of reduce.
          To state that a function object is commutative, you will need to specialize
          the class is_commutative.
          For instance, we could modify the previous example by telling Boost.MPI
          that string concatenation is commutative:
        
namespace boost { namespace mpi { template<> struct is_commutative<std::plus<std::string>, std::string> : mpl::true_ { }; } } // end namespace boost::mpi
          By adding this code prior to main(), Boost.MPI will assume that string concatenation
          is commutative and employ a different parallel algorithm for the reduce operation. Using this algorithm,
          the program outputs the following when run with seven processes:
        
The result is zero one four five six two three
          Note how the numbers in the resulting string are in a different order:
          this is a direct result of Boost.MPI reordering operations. The result
          in this case differed from the non-commutative result because string concatenation
          is not commutative: f("x",
          "y")
          is not the same as f("y",
          "x"),
          because argument order matters. For truly commutative operations (e.g.,
          integer addition), the more efficient commutative algorithm will produce
          the same result as the non-commutative algorithm. Boost.MPI also performs
          direct mappings from function objects in <functional>
          to MPI_Op values predefined
          by MPI (e.g., MPI_SUM,
          MPI_MAX); if you have your
          own function objects that can take advantage of this mapping, see the class
          template is_mpi_op.
        
          Like gather,
          reduce has an "all"
          variant called all_reduce that performs
          the reduction operation and broadcasts the result to all processes. This
          variant is useful, for instance, in establishing global minimum or maximum
          values.
        
          The following code (global_min.cpp)
          shows a broadcasting version of the random_min.cpp
          example:
        
#include <boost/mpi.hpp> #include <iostream> #include <cstdlib> namespace mpi = boost::mpi; int main(int argc, char* argv[]) { mpi::environment env(argc, argv); mpi::communicator world; std::srand(world.rank()); int my_number = std::rand(); int minimum; all_reduce(world, my_number, minimum, mpi::minimum<int>()); if (world.rank() == 0) { std::cout << "The minimum value is " << minimum << std::endl; } return 0; }
          In that example we provide both input and output values, requiring twice
          as much space, which can be a problem depending on the size of the transmitted
          data. If there is no need to preserve the input value, the ouput value
          can be omitted. In that case the input value will be overriden with the
          output value and Boost.MPI is able, in some situation, to implement the
          operation with a more space efficient solution (using the MPI_IN_PLACE flag of the MPI C mapping),
          as in the following example (in_place_global_min.cpp):
        
#include <boost/mpi.hpp> #include <iostream> #include <cstdlib> namespace mpi = boost::mpi; int main(int argc, char* argv[]) { mpi::environment env(argc, argv); mpi::communicator world; std::srand(world.rank()); int my_number = std::rand(); all_reduce(world, my_number, mpi::minimum<int>()); if (world.rank() == 0) { std::cout << "The minimum value is " << my_number << std::endl; } return 0; }
Communication with Boost.MPI always occurs over a communicator. A communicator contains a set of processes that can send messages among themselves and perform collective operations. There can be many communicators within a single program, each of which contains its own isolated communication space that acts independently of the other communicators.
        When the MPI environment is initialized, only the "world" communicator
        (called MPI_COMM_WORLD in
        the MPI C and Fortran bindings) is available. The "world" communicator,
        accessed by default-constructing a mpi::communicator
        object, contains all of the MPI processes present when the program begins
        execution. Other communicators can then be constructed by duplicating or
        building subsets of the "world" communicator. For instance, in
        the following program we split the processes into two groups: one for processes
        generating data and the other for processes that will collect the data. (generate_collect.cpp)
      
#include <boost/mpi.hpp> #include <iostream> #include <cstdlib> #include <boost/serialization/vector.hpp> namespace mpi = boost::mpi; enum message_tags {msg_data_packet, msg_broadcast_data, msg_finished}; void generate_data(mpi::communicator local, mpi::communicator world); void collect_data(mpi::communicator local, mpi::communicator world); int main() { mpi::environment env; mpi::communicator world; bool is_generator = world.rank() < 2 * world.size() / 3; mpi::communicator local = world.split(is_generator? 0 : 1); if (is_generator) generate_data(local, world); else collect_data(local, world); return 0; }
        When communicators are split in this way, their processes retain membership
        in both the original communicator (which is not altered by the split) and
        the new communicator. However, the ranks of the processes may be different
        from one communicator to the next, because the rank values within a communicator
        are always contiguous values starting at zero. In the example above, the
        first two thirds of the processes become "generators" and the remaining
        processes become "collectors". The ranks of the "collectors"
        in the world communicator
        will be 2/3 world.size()
        and greater, whereas the ranks of the same collector processes in the local communicator will start at zero.
        The following excerpt from collect_data() (in generate_collect.cpp) illustrates
        how to manage multiple communicators:
      
mpi::status msg = world.probe(); if (msg.tag() == msg_data_packet) { // Receive the packet of data std::vector<int> data; world.recv(msg.source(), msg.tag(), data); // Tell each of the collectors that we'll be broadcasting some data for (int dest = 1; dest < local.size(); ++dest) local.send(dest, msg_broadcast_data, msg.source()); // Broadcast the actual data. broadcast(local, data, 0); }
        The code in this except is executed by the "master" collector,
        e.g., the node with rank 2/3 world.size()
        in the world communicator
        and rank 0 in the local (collector)
        communicator. It receives a message from a generator via the world communicator, then broadcasts the
        message to each of the collectors via the local
        communicator.
      
        For more control in the creation of communicators for subgroups of processes,
        the Boost.MPI group
        provides facilities to compute the union (|),
        intersection (&), and difference
        (-) of two groups, generate
        arbitrary subgroups, etc.
      
        When communicating data types over MPI that are not fundamental to MPI (such
        as strings, lists, and user-defined data types), Boost.MPI must first serialize
        these data types into a buffer and then communicate them; the receiver then
        copies the results into a buffer before deserializing into an object on the
        other end. For some data types, this overhead can be eliminated by using
        is_mpi_datatype.
        However, variable-length data types such as strings and lists cannot be MPI
        data types.
      
Boost.MPI supports a second technique for improving performance by separating the structure of these variable-length data structures from the content stored in the data structures. This feature is only beneficial when the shape of the data structure remains the same but the content of the data structure will need to be communicated several times. For instance, in a finite element analysis the structure of the mesh may be fixed at the beginning of computation but the various variables on the cells of the mesh (temperature, stress, etc.) will be communicated many times within the iterative analysis process. In this case, Boost.MPI allows one to first send the "skeleton" of the mesh once, then transmit the "content" multiple times. Since the content need not contain any information about the structure of the data type, it can be transmitted without creating separate communication buffers.
        To illustrate the use of skeletons and content, we will take a somewhat more
        limited example wherein a master process generates random number sequences
        into a list and transmits them to several slave processes. The length of
        the list will be fixed at program startup, so the content of the list (i.e.,
        the current sequence of numbers) can be transmitted efficiently. The complete
        example is available in example/random_content.cpp. We
        being with the master process (rank 0), which builds a list, communicates
        its structure via a skeleton, then repeatedly
        generates random number sequences to be broadcast to the slave processes
        via content:
      
// Generate the list and broadcast its structure std::list<int> l(list_len); broadcast(world, mpi::skeleton(l), 0); // Generate content several times and broadcast out that content mpi::content c = mpi::get_content(l); for (int i = 0; i < iterations; ++i) { // Generate new random values std::generate(l.begin(), l.end(), &random); // Broadcast the new content of l broadcast(world, c, 0); } // Notify the slaves that we're done by sending all zeroes std::fill(l.begin(), l.end(), 0); broadcast(world, c, 0);
        The slave processes have a very similar structure to the master. They receive
        (via the broadcast() call) the skeleton of the
        data structure, then use it to build their own lists of integers. In each
        iteration, they receive via another broadcast() the new content in the data structure and
        compute some property of the data:
      
// Receive the content and build up our own list std::list<int> l; broadcast(world, mpi::skeleton(l), 0); mpi::content c = mpi::get_content(l); int i = 0; do { broadcast(world, c, 0); if (std::find_if (l.begin(), l.end(), std::bind1st(std::not_equal_to<int>(), 0)) == l.end()) break; // Compute some property of the data. ++i; } while (true);
        The skeletons and content of any Serializable data type can be transmitted
        either via the send and recv members of the communicator
        class (for point-to-point communicators) or broadcast via the broadcast() collective. When separating
        a data structure into a skeleton and content, be careful not to modify the
        data structure (either on the sender side or the receiver side) without transmitting
        the skeleton again. Boost.MPI can not detect these accidental modifications
        to the data structure, which will likely result in incorrect data being transmitted
        or unstable programs.
      
To obtain optimal performance for small fixed-length data types not containing any pointers it is very important to mark them using the type traits of Boost.MPI and Boost.Serialization.
          It was alredy discussed that fixed length types containing no pointers
          can be using as is_mpi_datatype, e.g.:
        
namespace boost { namespace mpi { template <> struct is_mpi_datatype<gps_position> : mpl::true_ { }; } }
or the equivalent macro
BOOST_IS_MPI_DATATYPE(gps_position)
In addition it can give a substantial performance gain to turn off tracking and versioning for these types, if no pointers to these types are used, by using the traits classes or helper macros of Boost.Serialization:
BOOST_CLASS_TRACKING(gps_position,track_never) BOOST_CLASS_IMPLEMENTATION(gps_position,object_serializable)
More optimizations are possible on homogeneous machines, by avoiding MPI_Pack/MPI_Unpack calls but using direct bitwise copy. This feature can be enabled by defining the macro BOOST_MPI_HOMOGENEOUS when building Boost.MPI and when building the application.
In addition all classes need to be marked both as is_mpi_datatype and as is_bitwise_serializable, by using the helper macro of Boost.Serialization:
BOOST_IS_BITWISE_SERIALIZABLE(gps_position)
Usually it is safe to serialize a class for which is_mpi_datatype is true by using binary copy of the bits. The exception are classes for which some members should be skipped for serialization.
This section provides tables that map from the functions and constants of the standard C MPI to their Boost.MPI equivalents. It will be most useful for users that are already familiar with the C or Fortran interfaces to MPI, or for porting existing parallel programs to Boost.MPI.
Table 20.1. Point-to-point communication
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | 
                   | 
| 
                   | 
                   | 
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
Boost.MPI automatically maps C and C++ data types to their MPI equivalents. The following table illustrates the mappings between C++ types and MPI datatype constants.
Table 20.2. Datatypes
| C Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | unused | 
| 
                   | used internally for serialized data types | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | 
                   | 
        Boost.MPI does not provide direct wrappers to the MPI derived datatypes functionality.
        Instead, Boost.MPI relies on the Boost.Serialization
        library to construct MPI datatypes for user-defined classe. The section on
        user-defined data types describes
        this mechanism, which is used for types that marked as "MPI datatypes"
        using is_mpi_datatype.
      
The derived datatypes table that follows describes which C++ types correspond to the functionality of the C MPI's datatype constructor. Boost.MPI may not actually use the C MPI function listed when building datatypes of a certain form. Since the actual datatypes built by Boost.MPI are typically hidden from the user, many of these operations are called internally by Boost.MPI.
Table 20.3. Derived datatypes
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| used automatically in Boost.MPI for MPI version 1.x | |
| used automatically in Boost.MPI for MPI version 2.0 and higher | |
| used automatically in Boost.MPI | |
| arrays | |
| used automatically in Boost.MPI | |
| used automatically in Boost.MPI | |
| any type used as a subobject | |
| unused | |
| any type used as a subobject | |
| unsupported | |
| used automatically in Boost.MPI | |
| user-defined classes and structs with MPI 1.x | |
| user-defined classes and structs with MPI 2.0 and higher | |
| unsupported | |
| used automatically in Boost.MPI | 
        MPI's packing facilities store values into a contiguous buffer, which can
        later be transmitted via MPI and unpacked into separate values via MPI's
        unpacking facilities. As with datatypes, Boost.MPI provides an abstract interface
        to MPI's packing and unpacking facilities. In particular, the two archive
        classes packed_oarchive
        and packed_iarchive
        can be used to pack or unpack a contiguous buffer using MPI's facilities.
      
Boost.MPI supports a one-to-one mapping for most of the MPI collectives. For each collective provided by Boost.MPI, the underlying C MPI collective will be invoked when it is possible (and efficient) to do so.
Table 20.5. Collectives
| C Function | Boost.MPI Equivalent | 
|---|---|
| 
                  most uses supported by  | |
| 
                  most uses supported by  | |
| 
                  most uses supported by  | |
| unsupported | |
| 
                  most uses supported by  | |
| 
                  supported implicitly by  | 
        Boost.MPI uses function objects to specify how reductions should occur in
        its equivalents to MPI_Allreduce,
        MPI_Reduce, and MPI_Scan. The following table illustrates
        how predefined
        and user-defined
        reduction operations can be mapped between the C MPI and Boost.MPI.
      
Table 20.6. Reduction operations
| C Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | |
| 
                   | |
| 
                   | |
| 
                   | 
                   | 
| 
                   | 
                   | 
| 
                   | |
| 
                   | |
| 
                   | unsupported | 
| 
                   | |
| 
                   | unsupported | 
| 
                   | used internally by Boost.MPI | 
| 
                   | used internally by Boost.MPI | 
| 
                   | 
                   | 
| 
                   | 
                   | 
        MPI defines several special communicators, including MPI_COMM_WORLD
        (including all processes that the local process can communicate with), MPI_COMM_SELF (including only the local
        process), and MPI_COMM_EMPTY
        (including no processes). These special communicators are all instances of
        the communicator
        class in Boost.MPI.
      
Table 20.7. Predefined communicators
| C Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | 
                  a default-constructed  | 
| 
                   | 
                  a  | 
| 
                   | 
                  a  | 
        Boost.MPI supports groups of processes through its group class.
      
Table 20.8. Group operations and constants
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | 
                  a default-constructed  | 
| 
                  memberref boost::mpi::group::rank  | |
| 
                  memberref boost::mpi::group::translate_ranks  | |
| 
                  operators  | |
| 
                   | 
                  operators  | 
| 
                   | 
                  operators  | 
| 
                   | 
                  operators  | 
| 
                  operator  | |
| 
                  operator  | |
| 
                  operator  | |
| unsupported | |
| unsupported | |
| used automatically in Boost.MPI | 
        Boost.MPI provides manipulation of communicators through the communicator class.
      
Table 20.9. Communicator operations
| C Function | Boost.MPI Equivalent | 
|---|---|
| 
                  operators  | |
| 
                   | |
| 
                   | |
| used automatically in Boost.MPI | 
        Boost.MPI currently provides support for inter-communicators via the intercommunicator
        class.
      
Table 20.10. Inter-communicator operations
| C Function | Boost.MPI Equivalent | 
|---|---|
| 
                   | |
| 
                   | |
Boost.MPI currently provides no support for attribute caching.
Table 20.11. Attributes and caching
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | unsupported | 
| 
                   | unsupported | 
| 
                   | unsupported | 
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | 
Boost.MPI will provide complete support for creating communicators with different topologies and later querying those topologies. Support for graph topologies is provided via an interface to the Boost Graph Library (BGL), where a communicator can be created which matches the structure of any BGL graph, and the graph topology of a communicator can be viewed as a BGL graph for use in existing, generic graph algorithms.
Table 20.12. Process topologies
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | 
                  unnecessary; use  | 
| 
                   | 
                  unnecessary; use  | 
| unsupported | |
| unsupported | |
| 
                   | |
| 
                   | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | |
| unsupported | 
        Boost.MPI supports environmental inquires through the environment class.
      
Table 20.13. Environmental inquiries
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | 
                  unnecessary; use  | 
| 
                   | 
                  unnecessary; use  | 
| 
                   | 
                  unnecessary; use  | 
        Boost.MPI translates MPI errors into exceptions, reported via the exception
        class.
      
Table 20.14. Error handling
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | unused; errors are translated into Boost.MPI exceptions | 
| 
                   | unused; errors are translated into Boost.MPI exceptions | 
| unused; errors are translated into Boost.MPI exceptions | |
| unused; errors are translated into Boost.MPI exceptions | |
| unused; errors are translated into Boost.MPI exceptions | |
| unused; errors are translated into Boost.MPI exceptions | |
| used internally by Boost.MPI | |
        The MPI timing facilities are exposed via the Boost.MPI timer class, which provides
        an interface compatible with the Boost
        Timer library.
      
Table 20.15. Timing facilities
| C Function/Constant | Boost.MPI Equivalent | 
|---|---|
| 
                   | 
                  unnecessary; use  | 
| 
                  use  | |
        MPI startup and shutdown are managed by the construction and descruction
        of the Boost.MPI environment
        class.
      
Table 20.16. Startup/shutdown facilities
| C Function | Boost.MPI Equivalent | 
|---|---|
| 
                   | |
| 
                   | |
Boost.MPI does not provide any support for the profiling facilities in MPI 1.1.