TIMPI
TIMPI::Communicator Class Reference

Encapsulates the MPI_Comm object. More...

#include <communicator.h>

Inheritance diagram for TIMPI::Communicator:

Public Types

enum  SendMode { DEFAULT = 0, SYNCHRONOUS }
 Whether to use default or synchronous sends. More...
 
enum  SyncType { NBX, ALLTOALL_COUNTS, SENDRECEIVE }
 The algorithm to use for parallel synchronization. More...
 

Public Member Functions

 Communicator ()
 Default Constructor. More...
 
 Communicator (const communicator &comm)
 
 Communicator (const Communicator &)=delete
 
Communicator & operator= (const Communicator &)=delete
 
 Communicator (Communicator &&)=default
 
Communicator & operator= (Communicator &&)=default
 
 ~Communicator ()
 
void split (int color, int key, Communicator &target) const
 
void split_by_type (int split_type, int key, info i, Communicator &target) const
 
void duplicate (const Communicator &comm)
 
void duplicate (const communicator &comm)
 
communicator & get ()
 
const communicator & get () const
 
MessageTag get_unique_tag (int tagvalue=MessageTag::invalid_tag) const
 Get a tag that is unique to this Communicator. More...
 
void reference_unique_tag (int tagvalue) const
 Reference an already-acquired tag, so that we know it will be dereferenced multiple times before we can re-release it. More...
 
void dereference_unique_tag (int tagvalue) const
 Dereference an already-acquired tag, and see if we can re-release it. More...
 
void clear ()
 Free and reset this communicator. More...
 
Communicator & operator= (const communicator &comm)
 
processor_id_type rank () const
 
processor_id_type size () const
 
void send_mode (const SendMode sm)
 Explicitly sets the SendMode type used for send operations. More...
 
SendMode send_mode () const
 Gets the user-requested SendMode. More...
 
void sync_type (const SyncType st)
 Explicitly sets the SyncType used for sync operations. More...
 
void sync_type (const std::string &st)
 Sets the sync type used for sync operations via a string. More...
 
SyncType sync_type () const
 Gets the user-requested SyncType. More...
 
void barrier () const
 Pause execution until all processors reach a certain point. More...
 
void nonblocking_barrier (Request &req) const
 Start a barrier that doesn't block. More...
 
template<typename T >
bool verify (const T &r) const
 Verify that a local variable has the same value on all processors. More...
 
template<typename T >
bool semiverify (const T *r) const
 Verify that a local pointer points to the same value on all processors where it is not nullptr. More...
 
template<typename T >
void min (const T &r, T &o, Request &req) const
 Non-blocking minimum of the local value r into o with the request req. More...
 
template<typename T >
void min (T &r) const
 Take a local variable and replace it with the minimum of its values on all processors. More...
 
template<typename T >
void minloc (T &r, unsigned int &min_id) const
 Take a local variable and replace it with the minimum of its values on all processors, returning the minimum rank of a processor which originally held the minimum value. More...
 
template<typename T , typename A1 , typename A2 >
void minloc (std::vector< T, A1 > &r, std::vector< unsigned int, A2 > &min_id) const
 Take a vector of local variables and replace each entry with the minimum of its values on all processors. More...
 
template<typename T >
void max (const T &r, T &o, Request &req) const
 Non-blocking maximum of the local value r into o with the request req. More...
 
template<typename T >
void max (T &r) const
 Take a local variable and replace it with the maximum of its values on all processors. More...
 
template<typename T >
void maxloc (T &r, unsigned int &max_id) const
 Take a local variable and replace it with the maximum of its values on all processors, returning the minimum rank of a processor which originally held the maximum value. More...
 
template<typename T , typename A1 , typename A2 >
void maxloc (std::vector< T, A1 > &r, std::vector< unsigned int, A2 > &max_id) const
 Take a vector of local variables and replace each entry with the maximum of its values on all processors. More...
 
template<typename T >
void sum (T &r) const
 Take a local variable and replace it with the sum of its values on all processors. More...
 
template<typename T >
void sum (const T &r, T &o, Request &req) const
 Non-blocking sum of the local value r into o with the request req. More...
 
template<typename T >
void set_union (T &data, const unsigned int root_id) const
 Take a container (set, map, unordered_set, multimap, etc.) of local variables on each processor, and collect their union over all processors, replacing the original on processor root_id. More...
 
template<typename T >
void set_union (T &data) const
 Take a container of local variables on each processor, and replace it with their union over all processors, replacing the original on all processors. More...
 
status probe (const unsigned int src_processor_id, const MessageTag &tag=any_tag) const
 Blocking message probe. More...
 
template<typename T >
Status packed_range_probe (const unsigned int src_processor_id, const MessageTag &tag, bool &flag) const
 Non-Blocking message probe for a packed range message. More...
 
template<typename T >
void send (const unsigned int dest_processor_id, const T &buf, const MessageTag &tag=no_tag) const
 Blocking-send to one processor with data-defined type. More...
 
template<typename T >
void send (const unsigned int dest_processor_id, const T &buf, Request &req, const MessageTag &tag=no_tag) const
 Nonblocking-send to one processor with data-defined type. More...
 
template<typename T >
void send (const unsigned int dest_processor_id, const T &buf, const DataType &type, const MessageTag &tag=no_tag) const
 Blocking-send to one processor with user-defined type. More...
 
template<typename T >
void send (const unsigned int dest_processor_id, const T &buf, const DataType &type, Request &req, const MessageTag &tag=no_tag) const
 Nonblocking-send to one processor with user-defined type. More...
 
template<typename T >
void send (const unsigned int dest_processor_id, const T &buf, const NotADataType &type, Request &req, const MessageTag &tag=no_tag) const
 Nonblocking-send to one processor with user-defined packable type. More...
 
template<typename T >
Status receive (const unsigned int dest_processor_id, T &buf, const MessageTag &tag=any_tag) const
 Blocking-receive from one processor with data-defined type. More...
 
template<typename T >
void receive (const unsigned int dest_processor_id, T &buf, Request &req, const MessageTag &tag=any_tag) const
 Nonblocking-receive from one processor with data-defined type. More...
 
template<typename T >
Status receive (const unsigned int dest_processor_id, T &buf, const DataType &type, const MessageTag &tag=any_tag) const
 Blocking-receive from one processor with user-defined type. More...
 
template<typename T >
void receive (const unsigned int dest_processor_id, T &buf, const DataType &type, Request &req, const MessageTag &tag=any_tag) const
 Nonblocking-receive from one processor with user-defined type. More...
 
template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type = 0>
bool possibly_receive (unsigned int &src_processor_id, std::vector< T, A > &buf, Request &req, const MessageTag &tag) const
 Nonblocking-receive from one processor with user-defined type. More...
 
template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
bool possibly_receive (unsigned int &src_processor_id, std::vector< T, A > &buf, Request &req, const MessageTag &tag) const
 dispatches to possibly_receive_packed_range More...
 
template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type = 0>
bool possibly_receive (unsigned int &src_processor_id, std::vector< T, A > &buf, const DataType &type, Request &req, const MessageTag &tag) const
 Nonblocking-receive from one processor with user-defined type. More...
 
template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
bool possibly_receive (unsigned int &src_processor_id, std::vector< T, A > &buf, const NotADataType &type, Request &req, const MessageTag &tag) const
 Nonblocking-receive from one processor with user-defined type. More...
 
template<typename Context , typename OutputIter , typename T >
bool possibly_receive_packed_range (unsigned int &src_processor_id, Context *context, OutputIter out, const T *output_type, Request &req, const MessageTag &tag) const
 Nonblocking packed range receive from one processor with user-defined type. More...
 
template<typename Context , typename Iter >
void send_packed_range (const unsigned int dest_processor_id, const Context *context, Iter range_begin, const Iter range_end, const MessageTag &tag=no_tag, std::size_t approx_buffer_size=1000000) const
 Blocking-send range-of-pointers to one processor. More...
 
template<typename Context , typename Iter >
void send_packed_range (const unsigned int dest_processor_id, const Context *context, Iter range_begin, const Iter range_end, Request &req, const MessageTag &tag=no_tag, std::size_t approx_buffer_size=1000000) const
 Nonblocking-send range-of-pointers to one processor. More...
 
template<typename Context , typename Iter >
void nonblocking_send_packed_range (const unsigned int dest_processor_id, const Context *context, Iter range_begin, const Iter range_end, Request &req, const MessageTag &tag=no_tag) const
 Similar to the above Nonblocking send_packed_range with a few important differences: More...
 
template<typename Context , typename Iter >
void nonblocking_send_packed_range (const unsigned int dest_processor_id, const Context *context, Iter range_begin, const Iter range_end, Request &req, std::shared_ptr< std::vector< typename TIMPI::Packing< typename std::iterator_traits< Iter >::value_type >::buffer_type >> &buffer, const MessageTag &tag=no_tag) const
 Similar to the above Nonblocking send_packed_range with a few important differences: More...
 
template<typename Context , typename OutputIter , typename T >
void receive_packed_range (const unsigned int dest_processor_id, Context *context, OutputIter out, const T *output_type, const MessageTag &tag=any_tag) const
 Blocking-receive range-of-pointers from one processor. More...
 
template<typename Context , typename OutputIter , typename T >
void nonblocking_receive_packed_range (const unsigned int src_processor_id, Context *context, OutputIter out, const T *output_type, Request &req, Status &stat, const MessageTag &tag=any_tag) const
 Non-Blocking-receive range-of-pointers from one processor. More...
 
template<typename Context , typename OutputIter , typename T >
void nonblocking_receive_packed_range (const unsigned int src_processor_id, Context *context, OutputIter out, const T *output_type, Request &req, Status &stat, std::shared_ptr< std::vector< typename TIMPI::Packing< T >::buffer_type >> &buffer, const MessageTag &tag=any_tag) const
 Non-Blocking-receive range-of-pointers from one processor. More...
 
template<typename T1 , typename T2 , typename std::enable_if< std::is_base_of< DataType, StandardType< T1 >>::value &&std::is_base_of< DataType, StandardType< T2 >>::value, int >::type = 0>
void send_receive (const unsigned int dest_processor_id, const T1 &send_data, const unsigned int source_processor_id, T2 &recv_data, const MessageTag &send_tag=no_tag, const MessageTag &recv_tag=any_tag) const
 Send data send to one processor while simultaneously receiving other data recv from a (potentially different) processor. More...
 
template<typename Context1 , typename RangeIter , typename Context2 , typename OutputIter , typename T >
void send_receive_packed_range (const unsigned int dest_processor_id, const Context1 *context1, RangeIter send_begin, const RangeIter send_end, const unsigned int source_processor_id, Context2 *context2, OutputIter out, const T *output_type, const MessageTag &send_tag=no_tag, const MessageTag &recv_tag=any_tag, std::size_t approx_buffer_size=1000000) const
 Send a range-of-pointers to one processor while simultaneously receiving another range from a (potentially different) processor. More...
 
template<typename T1 , typename T2 >
void send_receive (const unsigned int dest_processor_id, const T1 &send_data, const DataType &type1, const unsigned int source_processor_id, T2 &recv_data, const DataType &type2, const MessageTag &send_tag=no_tag, const MessageTag &recv_tag=any_tag) const
 Send data send to one processor while simultaneously receiving other data recv from a (potentially different) processor, using a user-specified MPI Dataype. More...
 
template<typename T , typename A >
void gather (const unsigned int root_id, const T &send_data, std::vector< T, A > &recv) const
 Take a vector of length comm.size(), and on processor root_id fill in recv[processor_id] = the value of send on processor processor_id. More...
 
template<typename T , typename A >
void gather (const unsigned int root_id, const std::basic_string< T > &send_data, std::vector< std::basic_string< T >, A > &recv_data, const bool identical_buffer_sizes=false) const
 The gather overload for string types has an optional identical_buffer_sizes optimization for when all strings are the same length. More...
 
template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type = 0>
void gather (const unsigned int root_id, std::vector< T, A > &r) const
 
Take a vector of local variables and expand it on processor root_id to include values from all processors. More...
 
template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void gather (const unsigned int root_id, std::vector< T, A > &r) const
 
template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type = 0>
void allgather (const T &send_data, std::vector< T, A > &recv_data) const
 Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that processor. More...
 
template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void allgather (const T &send_data, std::vector< T, A > &recv_data) const
 Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that processor. More...
 
template<typename T , typename A >
void allgather (const std::basic_string< T > &send_data, std::vector< std::basic_string< T >, A > &recv_data, const bool identical_buffer_sizes=false) const
 The allgather overload for string types has an optional identical_buffer_sizes optimization for when all strings are the same length. More...
 
template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type = 0>
void allgather (std::vector< T, A > &r, const bool identical_buffer_sizes=false) const
 
Take a vector of fixed size local variables and expand it to include values from all processors. More...
 
template<typename T , typename A1 , typename A2 , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type = 0>
void allgather (const std::vector< T, A1 > &send_data, std::vector< std::vector< T, A1 >, A2 > &recv_data, const bool identical_buffer_sizes=false) const
 Take a vector of fixed size local variables and collect similar vectors from all processors. More...
 
template<typename T , typename A1 , typename A2 , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void allgather (const std::vector< T, A1 > &send_data, std::vector< std::vector< T, A1 >, A2 > &recv_data, const bool identical_buffer_sizes=false) const
 Take a vector of dynamic-size local variables and collect similar vectors from all processors. More...
 
template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void allgather (std::vector< T, A > &r, const bool identical_buffer_sizes=false) const
 
Take a vector of possibly dynamically sized local variables and expand it to include values from all processors. More...
 
template<typename T , typename A >
void allgather (std::vector< std::basic_string< T >, A > &r, const bool identical_buffer_sizes=false) const
 AllGather overload for vectors of string types. More...
 
template<typename T , typename A >
void scatter (const std::vector< T, A > &data, T &recv, const unsigned int root_id=0) const
 Take a vector of local variables and scatter the ith item to the ith processor in the communicator. More...
 
template<typename T , typename A >
void scatter (const std::vector< T, A > &data, std::vector< T, A > &recv, const unsigned int root_id=0) const
 Take a vector of local variables and scatter the ith equal-sized chunk to the ith processor in the communicator. More...
 
template<typename T , typename A1 , typename A2 >
void scatter (const std::vector< T, A1 > &data, const std::vector< int, A2 > counts, std::vector< T, A1 > &recv, const unsigned int root_id=0) const
 Take a vector of local variables and scatter the ith variable-sized chunk to the ith processor in the communicator. More...
 
template<typename T , typename A1 , typename A2 >
void scatter (const std::vector< std::vector< T, A1 >, A2 > &data, std::vector< T, A1 > &recv, const unsigned int root_id=0, const bool identical_buffer_sizes=false) const
 Take a vector of vectors and scatter the ith inner vector to the ith processor in the communicator. More...
 
template<typename Context , typename Iter , typename OutputIter >
void gather_packed_range (const unsigned int root_id, Context *context, Iter range_begin, const Iter range_end, OutputIter out, std::size_t approx_buffer_size=1000000) const
 Take a range of local variables, combine it with ranges from all processors, and write the output to the output iterator on rank root. More...
 
template<typename Context , typename Iter , typename OutputIter >
void allgather_packed_range (Context *context, Iter range_begin, const Iter range_end, OutputIter out, std::size_t approx_buffer_size=1000000) const
 Take a range of local variables, combine it with ranges from all processors, and write the output to the output iterator. More...
 
template<typename T , typename A >
void alltoall (std::vector< T, A > &r) const
 Effectively transposes the input vector across all processors. More...
 
template<typename T , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type = 0> (the enable_if constraint is present only when TIMPI_HAVE_MPI is defined)
void broadcast (T &data, const unsigned int root_id=0, const bool identical_sizes=false) const
 Take a local value and broadcast it to all processors. More...
 
template<typename T , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void broadcast (T &data, const unsigned int root_id=0, const bool identical_sizes=false) const
 Take a possibly dynamically-sized local value and broadcast it to all processors. More...
 
template<typename Context , typename OutputContext , typename Iter , typename OutputIter >
void broadcast_packed_range (const Context *context1, Iter range_begin, const Iter range_end, OutputContext *context2, OutputIter out, const unsigned int root_id=0, std::size_t approx_buffer_size=1000000) const
 Blocking-broadcast range-of-pointers to one processor. More...
 
template<typename T >
void send (const unsigned int dest_processor_id, const std::basic_string< T > &buf, const MessageTag &tag) const
 
template<typename T >
void send (const unsigned int dest_processor_id, const std::basic_string< T > &buf, Request &req, const MessageTag &tag) const
 
template<typename T >
Status receive (const unsigned int src_processor_id, std::basic_string< T > &buf, const MessageTag &tag) const
 
template<typename T >
void receive (const unsigned int src_processor_id, std::basic_string< T > &buf, Request &req, const MessageTag &tag) const
 
template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
void send_receive (const unsigned int dest_processor_id, const std::vector< T, A > &send_data, const unsigned int source_processor_id, std::vector< T, A > &recv_data, const MessageTag &send_tag, const MessageTag &recv_tag) const
 
template<typename T , typename A1 , typename A2 >
void send_receive (const unsigned int dest_processor_id, const std::vector< std::vector< T, A1 >, A2 > &sendvec, const unsigned int source_processor_id, std::vector< std::vector< T, A1 >, A2 > &recv, const MessageTag &send_tag, const MessageTag &recv_tag) const
 
template<typename Context , typename Iter >
void nonblocking_send_packed_range (const unsigned int dest_processor_id, const Context *context, Iter range_begin, const Iter range_end, Request &req, std::shared_ptr< std::vector< typename Packing< typename std::iterator_traits< Iter >::value_type >::buffer_type >> &buffer, const MessageTag &tag) const
 
template<typename T >
void broadcast (std::basic_string< T > &data, const unsigned int root_id, const bool identical_sizes) const
 
template<typename T , typename A >
void broadcast (std::vector< std::basic_string< T >, A > &data, const unsigned int root_id, const bool identical_sizes) const
 
template<typename Context , typename OutputIter , typename T >
void nonblocking_receive_packed_range (const unsigned int src_processor_id, Context *context, OutputIter out, const T *, Request &req, Status &stat, std::shared_ptr< std::vector< typename Packing< T >::buffer_type >> &buffer, const MessageTag &tag) const
 
template<typename T , typename A1 , typename A2 >
bool possibly_receive (unsigned int &src_processor_id, std::vector< std::vector< T, A1 >, A2 > &buf, const DataType &type, Request &req, const MessageTag &tag) const
 
template<typename T >
void min (T &timpi_mpi_var(r)) const
 
template<typename A >
void min (std::vector< bool, A > &r) const
 
template<typename A1 , typename A2 >
void minloc (std::vector< bool, A1 > &r, std::vector< unsigned int, A2 > &min_id) const
 
template<typename T >
void max (T &timpi_mpi_var(r)) const
 
template<typename A >
void max (std::vector< bool, A > &r) const
 
template<typename A1 , typename A2 >
void maxloc (std::vector< bool, A1 > &r, std::vector< unsigned int, A2 > &max_id) const
 
template<typename T >
void sum (T &timpi_mpi_var(r)) const
 
template<typename T >
void sum (std::complex< T > &timpi_mpi_var(r)) const
 
template<typename T , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type > (the enable_if constraint is present only when TIMPI_HAVE_MPI is defined)
void broadcast (T &timpi_mpi_var(data), const unsigned int root_id, const bool) const
 
template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type >
void map_broadcast (Map &timpi_mpi_var(data), const unsigned int root_id, const bool timpi_mpi_var(identical_sizes)) const
 
template<typename Map >
void map_broadcast (Map &timpi_mpi_var(data), const unsigned int root_id, const bool timpi_mpi_var(identical_sizes)) const
 
template<typename T , typename A1 , typename A2 >
bool possibly_receive (unsigned int &src_processor_id, std::vector< std::vector< T, A1 >, A2 > &buf, Request &req, const MessageTag &tag) const
 
template<typename T1 , typename T2 , typename std::enable_if< std::is_base_of< DataType, StandardType< T1 >>::value &&std::is_base_of< DataType, StandardType< T2 >>::value, int >::type >
void send_receive (const unsigned int timpi_dbg_var(send_tgt), const T1 &send_val, const unsigned int timpi_dbg_var(recv_source), T2 &recv_val, const MessageTag &, const MessageTag &) const
 Send-receive data from one processor. More...
 
template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type >
void send_receive (const unsigned int timpi_dbg_var(dest_processor_id), const std::vector< T, A > &send, const unsigned int timpi_dbg_var(source_processor_id), std::vector< T, A > &recv, const MessageTag &, const MessageTag &) const
 
template<typename T , typename A1 , typename A2 >
void send_receive (const unsigned int timpi_dbg_var(dest_processor_id), const std::vector< std::vector< T, A1 >, A2 > &send, const unsigned int timpi_dbg_var(source_processor_id), std::vector< std::vector< T, A1 >, A2 > &recv, const MessageTag &, const MessageTag &) const
 
template<typename Context1 , typename RangeIter , typename Context2 , typename OutputIter , typename T >
void send_receive_packed_range (const unsigned int timpi_dbg_var(dest_processor_id), const Context1 *context1, RangeIter send_begin, const RangeIter send_end, const unsigned int timpi_dbg_var(source_processor_id), Context2 *context2, OutputIter out_iter, const T *output_type, const MessageTag &, const MessageTag &, std::size_t) const
 Send-receive range-of-pointers from one processor. More...
 

Private Member Functions

void assign (const communicator &comm)
 Utility function for setting our member variables from an MPI communicator. More...
 
template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type = 0>
void map_sum (Map &data) const
 Private implementation function called by the map-based sum() specializations. More...
 
template<typename Map >
void map_sum (Map &data) const
 Private implementation function called by the map-based sum() specializations. More...
 
template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type = 0>
void map_broadcast (Map &data, const unsigned int root_id, const bool identical_sizes) const
 Private implementation function called by the map-based broadcast() specializations. More...
 
template<typename Map >
void map_broadcast (Map &data, const unsigned int root_id, const bool identical_sizes) const
 Private implementation function called by the map-based broadcast() specializations. More...
 
template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type = 0>
void map_max (Map &data) const
 Private implementation function called by the map-based max() specializations. More...
 
template<typename Map >
void map_max (Map &data) const
 Private implementation function called by the map-based max() specializations. More...
 
template<typename T , typename A1 , typename A2 >
int packed_size_of (const std::vector< std::vector< T, A1 >, A2 > &buf, const DataType &type) const
 

Private Attributes

communicator _communicator
 
processor_id_type _rank
 
processor_id_type _size
 
SendMode _send_mode
 
SyncType _sync_type
 
std::map< int, unsigned int > used_tag_values
 
int _next_tag
 
int _max_tag
 
bool _I_duped_it
 

Detailed Description

Encapsulates the MPI_Comm object.

Allows the size of the group and this process's position in the group to be determined.

Methods of this object are the preferred way to perform distributed-memory parallel operations.

Definition at line 106 of file communicator.h.

Member Enumeration Documentation

◆ SendMode

Whether to use default or synchronous sends.

Enumerator
DEFAULT 
SYNCHRONOUS 

Definition at line 213 of file communicator.h.

◆ SyncType

The algorithm to use for parallel synchronization.

Enumerator
NBX 
ALLTOALL_COUNTS 
SENDRECEIVE 

Definition at line 218 of file communicator.h.

Constructor & Destructor Documentation

◆ Communicator() [1/4]

TIMPI::Communicator::Communicator ( )

Default Constructor.

Definition at line 59 of file communicator.C.

Communicator::Communicator () :
#ifdef TIMPI_HAVE_MPI
  _communicator(MPI_COMM_SELF),
#endif
  _rank(0),
  _size(1),
  _send_mode(DEFAULT),
  _sync_type(NBX),
  used_tag_values(),
  _next_tag(0),
  _max_tag(std::numeric_limits<int>::max()),
  _I_duped_it(false) {}

◆ Communicator() [2/4]

TIMPI::Communicator::Communicator ( const communicator &  comm )
explicit

Definition at line 73 of file communicator.C.

References assign().

Communicator::Communicator (const communicator & comm) :
#ifdef TIMPI_HAVE_MPI
  _communicator(MPI_COMM_SELF),
#endif
  _rank(0),
  _size(1),
  _send_mode(DEFAULT),
  _sync_type(NBX),
  used_tag_values(),
  _next_tag(0),
  _max_tag(std::numeric_limits<int>::max()),
  _I_duped_it(false)
{
  this->assign(comm);
}

◆ Communicator() [3/4]

TIMPI::Communicator::Communicator ( const Communicator &  )
delete

◆ Communicator() [4/4]

TIMPI::Communicator::Communicator ( Communicator &&  )
default

◆ ~Communicator()

TIMPI::Communicator::~Communicator ( )

Definition at line 90 of file communicator.C.

References clear().

Communicator::~Communicator ()
{
  this->clear();
}

Member Function Documentation

◆ allgather() [1/8]

template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type >
void TIMPI::Communicator::allgather ( const T &  send_data,
std::vector< T, A > &  recv_data 
) const
inline

Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that processor.

This overload works on fixed size types

Definition at line 3193 of file parallel_implementation.h.

References size().

Referenced by allgather(), allgather_packed_range(), gather(), map_max(), map_sum(), testAllGather(), testAllGatherEmptyVectorString(), testAllGatherHalfEmptyVectorString(), testAllGatherString(), testAllGatherVectorString(), testAllGatherVectorVector(), testAllGatherVectorVectorPacked(), testArrayContainerAllGather(), testContainerAllGather(), testMapContainerAllGather(), testPairContainerAllGather(), testTupleContainerAllGather(), and testVectorOfContainersAllGather().

{
  TIMPI_LOG_SCOPE ("allgather()","Parallel");

  timpi_assert(this->size());
  recv.resize(this->size());

  unsigned int comm_size = this->size();
  if (comm_size > 1)
    {
      StandardType<T> send_type(&sendval);

      timpi_call_mpi
        (MPI_Allgather (const_cast<T*>(&sendval), 1, send_type, recv.data(), 1,
                        send_type, this->get()));
    }
  else if (comm_size > 0)
    recv[0] = sendval;
}

◆ allgather() [2/8]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void TIMPI::Communicator::allgather ( const T &  send_data,
std::vector< T, A > &  recv_data 
) const
inline

Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that processor.

This overload works on potentially dynamically sized types, and dispatches to allgather_packed_range

◆ allgather() [3/8]

template<typename T , typename A >
void TIMPI::Communicator::allgather ( const std::basic_string< T > &  send_data,
std::vector< std::basic_string< T >, A > &  recv_data,
const bool  identical_buffer_sizes = false 
) const
inline

The allgather overload for string types has an optional identical_buffer_sizes optimization for when all strings are the same length.

Definition at line 1586 of file parallel_implementation.h.

References allgather(), and size().

1589 {
1590  TIMPI_LOG_SCOPE ("allgather()","Parallel");
1591 
1592  timpi_assert(this->size());
1593  recv.assign(this->size(), "");
1594 
1595  // serial case
1596  if (this->size() < 2)
1597  {
1598  recv.resize(1);
1599  recv[0] = sendval;
1600  return;
1601  }
1602 
1603  std::vector<int>
1604  sendlengths (this->size(), 0),
1605  displacements(this->size(), 0);
1606 
1607  const int mysize = static_cast<int>(sendval.size());
1608 
1609  if (identical_buffer_sizes)
1610  sendlengths.assign(this->size(), mysize);
1611  else
1612  // first comm step to determine buffer sizes from all processors
1613  this->allgather(mysize, sendlengths);
1614 
1615  // Find the total size of the final array and
1616  // set up the displacement offsets for each processor
1617  unsigned int globalsize = 0;
1618  for (unsigned int i=0; i != this->size(); ++i)
1619  {
1620  displacements[i] = globalsize;
1621  globalsize += sendlengths[i];
1622  }
1623 
1624  // Check for quick return
1625  if (globalsize == 0)
1626  return;
1627 
1628  // monolithic receive buffer
1629  std::basic_string<T> r(globalsize, 0);
1630 
1631  // and get the data from the remote processors.
1632  timpi_call_mpi
1633  (MPI_Allgatherv (const_cast<T*>(mysize ? sendval.data() : nullptr),
1634  mysize, StandardType<T>(),
1635  &r[0], sendlengths.data(), displacements.data(),
1636  StandardType<T>(), this->get()));
1637 
1638  // slice receive buffer up
1639  for (unsigned int i=0; i != this->size(); ++i)
1640  recv[i] = r.substr(displacements[i], sendlengths[i]);
1641 }
void allgather(const T &send_data, std::vector< T, A > &recv_data) const
Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that proc...
processor_id_type size() const
Definition: communicator.h:208
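
The displacement bookkeeping in the listing above can be exercised serially. The following sketch (the helper name is illustrative, not part of the TIMPI API) reproduces how the implementation turns the gathered per-rank string lengths into displacement offsets and then slices the monolithic MPI_Allgatherv receive buffer back into per-rank strings:

```cpp
#include <string>
#include <vector>

// Hypothetical helper mirroring the displacement setup and buffer
// slicing in Communicator::allgather(std::basic_string): given each
// rank's string length and the concatenated receive buffer, recover
// the per-rank strings. (Serial illustration; real TIMPI does this
// around an MPI_Allgatherv call.)
std::vector<std::string>
slice_allgatherv_buffer(const std::vector<int> & sendlengths,
                        const std::string & monolithic)
{
  // Displacement offsets: each rank's data starts where the
  // previous rank's data ended.
  std::vector<int> displacements(sendlengths.size(), 0);
  int globalsize = 0;
  for (std::size_t i = 0; i != sendlengths.size(); ++i)
    {
      displacements[i] = globalsize;
      globalsize += sendlengths[i];
    }

  // Slice the receive buffer up, exactly as the implementation does.
  std::vector<std::string> recv(sendlengths.size());
  for (std::size_t i = 0; i != sendlengths.size(); ++i)
    recv[i] = monolithic.substr(displacements[i], sendlengths[i]);
  return recv;
}
```

With lengths {3, 0, 5} and buffer "abcdefgh", the displacements are {0, 3, 3} and the result is {"abc", "", "defgh"}; note that a zero-length contribution simply yields an empty string at that rank's slot.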

◆ allgather() [4/8]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
void TIMPI::Communicator::allgather ( std::vector< T, A > &  r,
const bool  identical_buffer_sizes = false 
) const
inline


Take a vector of fixed-size local variables and expand it to include values from all processors.

By default, each processor is allowed to have its own unique input buffer length. If it is known that all processors have the same input sizes, additional communication can be avoided.

Specifically, this function transforms this:

* Processor 0: [ ... N_0 ]
* Processor 1: [ ....... N_1 ]
* ...
* Processor M: [ .. N_M]
* 

into this:

* [ [ ... N_0 ] [ ....... N_1 ] ... [ .. N_M] ]
* 

on each processor. This function is collective and therefore must be called by all processors in the Communicator.

Definition at line 3242 of file parallel_implementation.h.

References allgather(), size(), and verify().

3244 {
3245  if (this->size() < 2)
3246  return;
3247 
3248  TIMPI_LOG_SCOPE("allgather()", "Parallel");
3249 
3250  if (identical_buffer_sizes)
3251  {
3252  timpi_assert(this->verify(r.size()));
3253  if (r.empty())
3254  return;
3255 
3256  std::vector<T,A> r_src(r.size()*this->size());
3257  r_src.swap(r);
3258  StandardType<T> send_type(r_src.data());
3259 
3260  timpi_call_mpi
3261  (MPI_Allgather (r_src.data(), cast_int<int>(r_src.size()),
3262  send_type, r.data(), cast_int<int>(r_src.size()),
3263  send_type, this->get()));
3264  // timpi_assert(this->verify(r));
3265  return;
3266  }
3267 
3268  std::vector<int>
3269  sendlengths (this->size(), 0),
3270  displacements(this->size(), 0);
3271 
3272  const int mysize = static_cast<int>(r.size());
3273  this->allgather(mysize, sendlengths);
3274 
3275  // Find the total size of the final array and
3276  // set up the displacement offsets for each processor.
3277  unsigned int globalsize = 0;
3278  for (unsigned int i=0; i != this->size(); ++i)
3279  {
3280  displacements[i] = globalsize;
3281  globalsize += sendlengths[i];
3282  }
3283 
3284  // Check for quick return
3285  if (globalsize == 0)
3286  return;
3287 
3288  // copy the input buffer
3289  std::vector<T,A> r_src(globalsize);
3290  r_src.swap(r);
3291 
3292  StandardType<T> send_type(r.data());
3293 
3294  // and get the data from the remote processors.
3295  // Pass nullptr if our vector is empty.
3296  timpi_call_mpi
3297  (MPI_Allgatherv (r_src.empty() ? nullptr : r_src.data(), mysize,
3298  send_type, r.data(), sendlengths.data(),
3299  displacements.data(), send_type, this->get()));
3300 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
void allgather(const T &send_data, std::vector< T, A > &recv_data) const
Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that proc...
processor_id_type size() const
Definition: communicator.h:208
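
The rank-major result layout described above (each rank's input block concatenated in rank order) can be illustrated serially. This sketch simulates the identical_buffer_sizes case with M "ranks"; the function name is illustrative, not a TIMPI API:

```cpp
#include <vector>

// Sketch of what allgather(std::vector<T> & r, identical_buffer_sizes)
// produces, simulated serially: the result on every rank is the
// rank-by-rank concatenation [rank 0's data, rank 1's data, ...].
std::vector<int>
simulate_vector_allgather(const std::vector<std::vector<int>> & per_rank)
{
  std::vector<int> result;
  for (const auto & r : per_rank)   // ranks 0..M-1, in order
    result.insert(result.end(), r.begin(), r.end());
  return result;
}
```

So three ranks holding {1,2}, {3,4}, and {5,6} each end up with {1,2,3,4,5,6}, matching the "[ [ ... N_0 ] [ ... N_1 ] ... ]" diagram.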

◆ allgather() [5/8]

template<typename T , typename A1 , typename A2 , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
void TIMPI::Communicator::allgather ( const std::vector< T, A1 > &  send_data,
std::vector< std::vector< T, A1 >, A2 > &  recv_data,
const bool  identical_buffer_sizes = false 
) const
inline

Take a vector of fixed-size local variables and collect similar vectors from all processors.

By default, each processor is allowed to have its own unique input buffer length. If it is known that all processors have the same input sizes, additional communication can be avoided.

Definition at line 2804 of file parallel_implementation.h.

References allgather(), assign(), and size().

2807 {
2808  TIMPI_LOG_SCOPE ("allgather()","Parallel");
2809 
2810  timpi_assert(this->size());
2811 
2812  // serial case
2813  if (this->size() < 2)
2814  {
2815  recv.resize(1);
2816  recv[0] = sendval;
2817  return;
2818  }
2819 
2820  recv.clear();
2821  recv.resize(this->size());
2822 
2823  std::vector<int>
2824  sendlengths (this->size(), 0),
2825  displacements(this->size(), 0);
2826 
2827  const int mysize = static_cast<int>(sendval.size());
2828 
2829  if (identical_buffer_sizes)
2830  sendlengths.assign(this->size(), mysize);
2831  else
2832  // first comm step to determine buffer sizes from all processors
2833  this->allgather(mysize, sendlengths);
2834 
2835  // Find the total size of the final array and
2836  // set up the displacement offsets for each processor
2837  unsigned int globalsize = 0;
2838  for (unsigned int i=0; i != this->size(); ++i)
2839  {
2840  displacements[i] = globalsize;
2841  globalsize += sendlengths[i];
2842  }
2843 
2844  // Check for quick return
2845  if (globalsize == 0)
2846  return;
2847 
2848  // monolithic receive buffer
2849  std::vector<T,A1> r(globalsize, 0);
2850 
2851  // and get the data from the remote processors.
2852  timpi_call_mpi
2853  (MPI_Allgatherv (const_cast<T*>(mysize ? sendval.data() : nullptr),
2854  mysize, StandardType<T>(),
2855  &r[0], sendlengths.data(), displacements.data(),
2856  StandardType<T>(), this->get()));
2857 
2858  // slice receive buffer up
2859  for (unsigned int i=0; i != this->size(); ++i)
2860  recv[i].assign(r.begin()+displacements[i],
2861  r.begin()+displacements[i]+sendlengths[i]);
2862 }
void allgather(const T &send_data, std::vector< T, A > &recv_data) const
Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that proc...
void assign(const communicator &comm)
Utility function for setting our member variables from an MPI communicator.
Definition: communicator.C:185
processor_id_type size() const
Definition: communicator.h:208

◆ allgather() [6/8]

template<typename T , typename A1 , typename A2 , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void TIMPI::Communicator::allgather ( const std::vector< T, A1 > &  send_data,
std::vector< std::vector< T, A1 >, A2 > &  recv_data,
const bool  identical_buffer_sizes = false 
) const
inline

Take a vector of dynamic-size local variables and collect similar vectors from all processors.

◆ allgather() [7/8]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void TIMPI::Communicator::allgather ( std::vector< T, A > &  r,
const bool  identical_buffer_sizes = false 
) const
inline


Take a vector of possibly dynamically sized local variables and expand it to include values from all processors.

By default, each processor is allowed to have its own unique input buffer length. If it is known that all processors have the same input sizes, additional communication can be avoided.

Specifically, this function transforms this:

* Processor 0: [ ... N_0 ]
* Processor 1: [ ....... N_1 ]
* ...
* Processor M: [ .. N_M]
* 

into this:

* [ [ ... N_0 ] [ ....... N_1 ] ... [ .. N_M] ]
* 

on each processor. This function is collective and therefore must be called by all processors in the Communicator.

◆ allgather() [8/8]

template<typename T , typename A >
void TIMPI::Communicator::allgather ( std::vector< std::basic_string< T >, A > &  r,
const bool  identical_buffer_sizes = false 
) const
inline

AllGather overload for vectors of string types.

Definition at line 3356 of file parallel_implementation.h.

References allgather(), size(), and verify().

3358 {
3359  if (this->size() < 2)
3360  return;
3361 
3362  TIMPI_LOG_SCOPE("allgather()", "Parallel");
3363 
3364  if (identical_buffer_sizes)
3365  {
3366  timpi_assert(this->verify(r.size()));
3367 
3368  // identical_buffer_sizes doesn't buy us much since we have to
3369  // communicate the lengths of strings within each buffer anyway
3370  if (r.empty())
3371  return;
3372  }
3373 
3374  // Concatenate the input buffer into a send buffer, and keep track
3375  // of input string lengths
3376  std::vector<int> mystrlengths (r.size());
3377  std::vector<T> concat_src;
3378 
3379  int myconcatsize = 0;
3380  for (unsigned int i=0; i != r.size(); ++i)
3381  {
3382  int stringlen = cast_int<int>(r[i].size());
3383  mystrlengths[i] = stringlen;
3384  myconcatsize += stringlen;
3385  }
3386  concat_src.reserve(myconcatsize);
3387  for (unsigned int i=0; i != r.size(); ++i)
3388  concat_src.insert
3389  (concat_src.end(), r[i].begin(), r[i].end());
3390 
3391  // Get the string lengths from all other processors
3392  std::vector<int> strlengths = mystrlengths;
3393  this->allgather(strlengths, identical_buffer_sizes);
3394 
3395  // We now know how many strings we'll be receiving
3396  r.resize(strlengths.size());
3397 
3398  // Get the concatenated data sizes from all other processors
3399  std::vector<int> concat_sizes;
3400  this->allgather(myconcatsize, concat_sizes);
3401 
3402  // Find the total size of the final concatenated array and
3403  // set up the displacement offsets for each processor.
3404  std::vector<int> displacements(this->size(), 0);
3405  unsigned int globalsize = 0;
3406  for (unsigned int i=0; i != this->size(); ++i)
3407  {
3408  displacements[i] = globalsize;
3409  globalsize += concat_sizes[i];
3410  }
3411 
3412  // Check for quick return
3413  if (globalsize == 0)
3414  return;
3415 
3416  // Get the concatenated data from the remote processors.
3417  // Pass nullptr if our vector is empty.
3418  std::vector<T> concat(globalsize);
3419 
3420  // We may have concat_src.empty(), but we know concat has at least
3421  // one element we can use as an example for StandardType
3422  StandardType<T> send_type(concat.data());
3423 
3424  timpi_call_mpi
3425  (MPI_Allgatherv (concat_src.empty() ?
3426  nullptr : concat_src.data(), myconcatsize,
3427  send_type, concat.data(), concat_sizes.data(),
3428  displacements.data(), send_type, this->get()));
3429 
3430  // Finally, split concatenated data into strings
3431  const T * begin = concat.data();
3432  for (unsigned int i=0; i != r.size(); ++i)
3433  {
3434  const T * end = begin + strlengths[i];
3435  r[i].assign(begin, end);
3436  begin = end;
3437  }
3438 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
void allgather(const T &send_data, std::vector< T, A > &recv_data) const
Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that proc...
processor_id_type size() const
Definition: communicator.h:208

◆ allgather_packed_range()

template<typename Context , typename Iter , typename OutputIter >
void TIMPI::Communicator::allgather_packed_range ( Context *  context,
Iter  range_begin,
const Iter  range_end,
OutputIter  out,
std::size_t  approx_buffer_size = 1000000 
) const
inline

Take a range of local variables, combine it with ranges from all processors, and write the output to the output iterator.

The approximate maximum size (in entries; number of bytes will likely be 4x or 8x larger) to use in a single data vector buffer to send can be specified for performance or memory usage reasons; if the range cannot be packed into a single buffer of this size then multiple buffers and messages will be used.

Note that the received data vector sizes will be the sum of the sent vector sizes; a smaller-than-default size may be useful for users on many processors, in cases where all-to-one communication cannot be avoided entirely.

Definition at line 4004 of file parallel_implementation.h.

References allgather(), max(), TIMPI::pack_range(), and TIMPI::unpack_range().

Referenced by testGettableStringAllGather(), testNestingAllGather(), testNullAllGather(), and testTupleStringAllGather().

4009 {
4010  typedef typename std::iterator_traits<Iter>::value_type T;
4011  typedef typename Packing<T>::buffer_type buffer_t;
4012 
4013  bool nonempty_range = (range_begin != range_end);
4014  this->max(nonempty_range);
4015 
4016  while (nonempty_range)
4017  {
4018  // We will serialize variable size objects from *range_begin to
4019  // *range_end as a sequence of ints in this buffer
4020  std::vector<buffer_t> buffer;
4021 
4022  range_begin = pack_range
4023  (context, range_begin, range_end, buffer, approx_buffer_size);
4024 
4025  this->allgather(buffer, false);
4026 
4027  timpi_assert(buffer.size());
4028 
4029  unpack_range
4030  (buffer, context, out_iter, (T*)nullptr);
4031 
4032  nonempty_range = (range_begin != range_end);
4033  this->max(nonempty_range);
4034  }
4035 }
void allgather(const T &send_data, std::vector< T, A > &recv_data) const
Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that proc...
void unpack_range(const std::vector< buffertype > &buffer, Context *context, OutputIter out_iter, const T *)
Helper function for range unpacking.
Definition: packing.h:1092
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
void max(const T &r, T &o, Request &req) const
Non-blocking maximum of the local value r into o with the request req.
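
The loop structure in the listing above (pack up to approx_buffer_size entries, gather, repeat while any rank still has data) determines how many communication rounds occur. This serial sketch counts those rounds under the simplifying assumption that each object packs to one buffer entry; names are illustrative, not TIMPI API:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Illustrates the round structure of allgather_packed_range: every
// rank packs at most approx_buffer_size entries per round, and the
// loop continues while *any* rank still has unpacked data (the
// max(nonempty_range) reduction in the implementation).
std::size_t
count_allgather_rounds(std::vector<std::size_t> remaining,
                       std::size_t approx_buffer_size)
{
  auto any_nonempty = [&]() {
    return std::any_of(remaining.begin(), remaining.end(),
                       [](std::size_t n) { return n > 0; });
  };

  std::size_t rounds = 0;
  while (any_nonempty())
    {
      ++rounds;
      for (auto & n : remaining)   // each rank packs one buffer's worth
        n -= std::min(n, approx_buffer_size);
    }
  return rounds;
}
```

With per-rank range sizes {2500, 100, 0} and a 1000-entry buffer, three rounds are needed: the largest range alone dictates the round count, which is why a smaller-than-default buffer size mainly trades message count for peak memory.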

◆ alltoall()

template<typename T , typename A >
void TIMPI::Communicator::alltoall ( std::vector< T, A > &  r) const
inline

Effectively transposes the input vector across all processors.

The jth entry on processor i is replaced with the ith entry from processor j.

Definition at line 3620 of file parallel_implementation.h.

References TIMPI::ignore(), size(), and verify().

Referenced by TIMPI::detail::push_parallel_alltoall_helper().

3621 {
3622  if (this->size() < 2 || buf.empty())
3623  return;
3624 
3625  TIMPI_LOG_SCOPE("alltoall()", "Parallel");
3626 
3627  // the per-processor size. this is the same for all
3628  // processors using MPI_Alltoall, could be variable
3629  // using MPI_Alltoallv
3630  const int size_per_proc =
3631  cast_int<int>(buf.size()/this->size());
3632  ignore(size_per_proc);
3633 
3634  timpi_assert_equal_to (buf.size()%this->size(), 0);
3635 
3636  timpi_assert(this->verify(size_per_proc));
3637 
3638  StandardType<T> send_type(buf.data());
3639 
3640  timpi_call_mpi
3641  (MPI_Alltoall (MPI_IN_PLACE, size_per_proc, send_type, buf.data(),
3642  size_per_proc, send_type, this->get()));
3643 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
void ignore(const Args &...)
Definition: timpi_assert.h:54
processor_id_type size() const
Definition: communicator.h:208
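
The transpose semantics ("the jth entry on processor i is replaced with the ith entry from processor j") can be checked serially. This sketch simulates M ranks each holding a length-M buffer; the helper name is illustrative, and the real implementation does this in place with MPI_Alltoall:

```cpp
#include <cstddef>
#include <vector>

// Serial illustration of alltoall: out[i][j] = bufs[j][i], i.e. the
// input matrix of per-rank buffers is transposed across "ranks".
std::vector<std::vector<int>>
simulate_alltoall(const std::vector<std::vector<int>> & bufs)
{
  const std::size_t m = bufs.size();
  std::vector<std::vector<int>> out(m, std::vector<int>(m));
  for (std::size_t i = 0; i != m; ++i)
    for (std::size_t j = 0; j != m; ++j)
      out[i][j] = bufs[j][i];
  return out;
}
```

For two ranks holding {1,2} and {3,4}, rank 0 ends with {1,3} and rank 1 with {2,4}. The assertion in the implementation that buf.size() is a multiple of this->size() corresponds to requiring a square matrix here.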

◆ assign()

void TIMPI::Communicator::assign ( const communicator comm)
private

Utility function for setting our member variables from an MPI communicator.

Definition at line 185 of file communicator.C.

References _communicator, _max_tag, _next_tag, _rank, and _size.

Referenced by allgather(), Communicator(), duplicate(), operator=(), split(), and split_by_type().

186 {
187  _communicator = comm;
188 #ifdef TIMPI_HAVE_MPI
189  if (_communicator != MPI_COMM_NULL)
190  {
191  int i;
192  timpi_call_mpi
193  (MPI_Comm_size(_communicator, &i));
194 
195  timpi_assert_greater_equal (i, 0);
196  _size = cast_int<processor_id_type>(i);
197 
198  timpi_call_mpi
199  (MPI_Comm_rank(_communicator, &i));
200 
201  timpi_assert_greater_equal (i, 0);
202  _rank = cast_int<processor_id_type>(i);
203 
204  int * maxTag;
205  int flag = false;
206  timpi_call_mpi(MPI_Comm_get_attr(MPI_COMM_WORLD, MPI_TAG_UB, &maxTag, &flag));
207  timpi_assert(flag);
208  _max_tag = *maxTag;
209  }
210  else
211  {
212  _rank = 0;
213  _size = 1;
214  _max_tag = std::numeric_limits<int>::max();
215  }
216  _next_tag = _max_tag / 2;
217 #endif
218 }
communicator _communicator
Definition: communicator.h:229
processor_id_type _size
Definition: communicator.h:230
processor_id_type _rank
Definition: communicator.h:230

◆ barrier()

void TIMPI::Communicator::barrier ( ) const

Pause execution until all processors reach a certain point.

Definition at line 225 of file communicator.C.

References size().

Referenced by testBarrier(), and TIMPI::TIMPIInit::~TIMPIInit().

226 {
227  if (this->size() > 1)
228  {
229  TIMPI_LOG_SCOPE("barrier()", "Communicator");
230  timpi_call_mpi(MPI_Barrier (this->get()));
231  }
232 }
processor_id_type size() const
Definition: communicator.h:208

◆ broadcast() [1/5]

template<typename T , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
void TIMPI::Communicator::broadcast ( T &  data,
const unsigned int  root_id = 0,
const bool  identical_sizes = false 
) const
inline

Take a local value and broadcast it to all processors.

Optionally takes the root_id processor, which specifies the processor initiating the broadcast.

If data is a container, it will be resized on target processors. When using pre-sized target containers, specify identical_sizes=true on all processors for an optimization.

Fixed variant

Definition at line 3678 of file parallel_implementation.h.

References broadcast_packed_range(), TIMPI::ignore(), rank(), and size().

Referenced by broadcast(), broadcast_packed_range(), map_broadcast(), scatter(), testBroadcast(), testBroadcastArrayType(), testBroadcastNestedType(), testBroadcastString(), testContainerBroadcast(), and testVectorOfContainersBroadcast().

3681 {
3682  ignore(root_id); // Only needed for MPI and/or dbg/devel
3683  if (this->size() == 1)
3684  {
3685  timpi_assert (!this->rank());
3686  timpi_assert (!root_id);
3687  return;
3688  }
3689 
3690  timpi_assert_less (root_id, this->size());
3691 
3692 // // If we don't have MPI, then we should be done, and calling the below can
3693 // // have the side effect of instantiating Packing<T> classes that are not
3694 // // defined. (Normally we would be calling a more specialized overload of
3696 // // broadcast that would then call broadcast_packed_range with appropriate
3696 // // template arguments)
3697 // #ifdef TIMPI_HAVE_MPI
3698  std::vector<T> range = {data};
3699 
3700  this->broadcast_packed_range((void *)(nullptr),
3701  range.begin(),
3702  range.end(),
3703  (void *)(nullptr),
3704  range.begin(),
3705  root_id);
3706 
3707  data = range[0];
3708 // #endif
3709 }
void broadcast_packed_range(const Context *context1, Iter range_begin, const Iter range_end, OutputContext *context2, OutputIter out, const unsigned int root_id=0, std::size_t approx_buffer_size=1000000) const
Blocking broadcast of a range-of-pointers from the root processor to all others.
processor_id_type rank() const
Definition: communicator.h:206
void ignore(const Args &...)
Definition: timpi_assert.h:54
processor_id_type size() const
Definition: communicator.h:208

◆ broadcast() [2/5]

template<typename T , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void TIMPI::Communicator::broadcast ( T &  data,
const unsigned int  root_id = 0,
const bool  identical_sizes = false 
) const
inline

Take a possibly dynamically-sized local value and broadcast it to all processors.

Optionally takes the root_id processor, which specifies the processor initiating the broadcast.

If data is a container, it will be resized on target processors. When using pre-sized target containers, specify identical_sizes=true on all processors for an optimization.

Dynamic variant

◆ broadcast() [3/5]

template<typename T >
void TIMPI::Communicator::broadcast ( std::basic_string< T > &  data,
const unsigned int  root_id,
const bool  identical_sizes 
) const
inline

Definition at line 1676 of file parallel_implementation.h.

References broadcast(), rank(), size(), and verify().

1679 {
1680  if (this->size() == 1)
1681  {
1682  timpi_assert (!this->rank());
1683  timpi_assert (!root_id);
1684  return;
1685  }
1686 
1687  timpi_assert_less (root_id, this->size());
1688  timpi_assert (this->verify(identical_sizes));
1689 
1690  TIMPI_LOG_SCOPE("broadcast()", "Parallel");
1691 
1692  std::size_t data_size = data.size();
1693 
1694  if (identical_sizes)
1695  timpi_assert(this->verify(data_size));
1696  else
1697  this->broadcast(data_size, root_id);
1698 
1699  std::vector<T> data_c(data_size);
1700 #ifndef NDEBUG
1701  std::basic_string<T> orig(data);
1702 #endif
1703 
1704  if (this->rank() == root_id)
1705  for (std::size_t i=0; i<data.size(); i++)
1706  data_c[i] = data[i];
1707 
1708  this->broadcast (data_c, root_id, StandardType<T>::is_fixed_type);
1709 
1710  data.assign(data_c.begin(), data_c.end());
1711 
1712 #ifndef NDEBUG
1713  if (this->rank() == root_id)
1714  timpi_assert_equal_to (data, orig);
1715 #endif
1716 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
processor_id_type rank() const
Definition: communicator.h:206
processor_id_type size() const
Definition: communicator.h:208
static const bool is_fixed_type
Definition: data_type.h:130
void broadcast(T &data, const unsigned int root_id=0, const bool identical_sizes=false) const
Take a local value and broadcast it to all processors.

◆ broadcast() [4/5]

template<typename T , typename A >
void TIMPI::Communicator::broadcast ( std::vector< std::basic_string< T >, A > &  data,
const unsigned int  root_id,
const bool  identical_sizes 
) const
inline

The strings will be packed in one long array, with the size of each string preceding the actual characters.

Definition at line 1720 of file parallel_implementation.h.

References broadcast(), rank(), size(), and verify().

1723 {
1724  if (this->size() == 1)
1725  {
1726  timpi_assert (!this->rank());
1727  timpi_assert (!root_id);
1728  return;
1729  }
1730 
1731  timpi_assert_less (root_id, this->size());
1732  timpi_assert (this->verify(identical_sizes));
1733 
1734  TIMPI_LOG_SCOPE("broadcast()", "Parallel");
1735 
1736  std::size_t bufsize=0;
1737  if (root_id == this->rank() || identical_sizes)
1738  {
1739  for (std::size_t i=0; i<data.size(); ++i)
1740  bufsize += data[i].size() + 1; // Add one for the string length word
1741  }
1742 
1743  if (identical_sizes)
1744  timpi_assert(this->verify(bufsize));
1745  else
1746  this->broadcast(bufsize, root_id);
1747 
1748  // Here we use unsigned int to store up to 32-bit characters
1749  std::vector<unsigned int> temp; temp.reserve(bufsize);
1750  // Pack the strings
1751  if (root_id == this->rank())
1752  {
1753  for (std::size_t i=0; i<data.size(); ++i)
1754  {
1755  temp.push_back(cast_int<unsigned int>(data[i].size()));
1756  for (std::size_t j=0; j != data[i].size(); ++j)
1761  temp.push_back(data[i][j]);
1762  }
1763  }
1764  else
1765  temp.resize(bufsize);
1766 
1767  // broadcast the packed strings
1768  this->broadcast(temp, root_id, true);
1769 
1770  // Unpack the strings
1771  if (root_id != this->rank())
1772  {
1773  data.clear();
1774  typename std::vector<unsigned int>::const_iterator iter = temp.begin();
1775  while (iter != temp.end())
1776  {
1777  std::size_t curr_len = *iter++;
1778  data.push_back(std::basic_string<T>(iter, iter+curr_len));
1779  iter += curr_len;
1780  }
1781  }
1782 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
processor_id_type rank() const
Definition: communicator.h:206
processor_id_type size() const
Definition: communicator.h:208
void broadcast(T &data, const unsigned int root_id=0, const bool identical_sizes=false) const
Take a local value and broadcast it to all processors.
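
The length-prefixed packing format used by this overload can be exercised serially. These helpers (illustrative names, not the TIMPI API) mirror the pack and unpack loops in the listing: each string contributes one length word followed by its characters, so the round trip recovers the original vector, including empty strings:

```cpp
#include <string>
#include <vector>

// Flatten a vector of strings into one buffer where each string's
// length precedes its characters, as in broadcast(vector<string>).
std::vector<unsigned int>
pack_strings(const std::vector<std::string> & data)
{
  std::vector<unsigned int> temp;
  for (const std::string & s : data)
    {
      temp.push_back(static_cast<unsigned int>(s.size())); // length word
      for (char c : s)
        temp.push_back(static_cast<unsigned int>(c));
    }
  return temp;
}

// Invert the packing: read a length word, then that many characters.
std::vector<std::string>
unpack_strings(const std::vector<unsigned int> & temp)
{
  std::vector<std::string> data;
  auto iter = temp.begin();
  while (iter != temp.end())
    {
      std::size_t curr_len = *iter++;
      data.push_back(std::string(iter, iter + curr_len));
      iter += curr_len;
    }
  return data;
}
```

This also shows why bufsize in the implementation is the sum of string lengths plus one word per string.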

◆ broadcast() [5/5]

template<typename T , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type >
void TIMPI::Communicator::broadcast ( T &  timpi_mpi_vardata,
const unsigned int  root_id,
const bool   
) const
inline

Definition at line 3653 of file parallel_implementation.h.

References TIMPI::ignore(), rank(), and size().

3656 {
3657  ignore(root_id); // Only needed for MPI and/or dbg/devel
3658  if (this->size() == 1)
3659  {
3660  timpi_assert (!this->rank());
3661  timpi_assert (!root_id);
3662  return;
3663  }
3664 
3665  timpi_assert_less (root_id, this->size());
3666 
3667  TIMPI_LOG_SCOPE("broadcast()", "Parallel");
3668 
3669  // Spread data to remote processors.
3670  timpi_call_mpi
3671  (MPI_Bcast (&data, 1, StandardType<T>(&data), root_id,
3672  this->get()));
3673 }
processor_id_type rank() const
Definition: communicator.h:206
void ignore(const Args &...)
Definition: timpi_assert.h:54
processor_id_type size() const
Definition: communicator.h:208

◆ broadcast_packed_range()

template<typename Context , typename OutputContext , typename Iter , typename OutputIter >
void TIMPI::Communicator::broadcast_packed_range ( const Context *  context1,
Iter  range_begin,
const Iter  range_end,
OutputContext *  context2,
OutputIter  out,
const unsigned int  root_id = 0,
std::size_t  approx_buffer_size = 1000000 
) const
inline

Blocking broadcast of a range-of-pointers from the root processor to all others.

This function does not send the raw pointers, but rather constructs new objects at the other end whose contents match the objects pointed to by the sender.

void TIMPI::pack(const T *, vector<int> & data, const Context *) is used to serialize type T onto the end of a data vector.

unsigned int TIMPI::packable_size(const T *, const Context *) is used to allow data vectors to reserve memory, and for additional error checking.

unsigned int TIMPI::packed_size(const T *, vector<int>::const_iterator) is used to advance to the beginning of the next object's data.

The approximate maximum size (in entries; number of bytes will likely be 4x or 8x larger) to use in a single data vector buffer can be specified for performance or memory usage reasons; if the range cannot be packed into a single buffer of this size then multiple buffers and messages will be used.

Definition at line 3920 of file parallel_implementation.h.

References broadcast(), TIMPI::pack_range(), rank(), size(), and TIMPI::unpack_range().

Referenced by broadcast().

3927 {
3928  typedef typename std::iterator_traits<Iter>::value_type T;
3929  typedef typename Packing<T>::buffer_type buffer_t;
3930 
3931  if (this->size() == 1)
3932  {
3933  timpi_assert (!this->rank());
3934  timpi_assert (!root_id);
3935  return;
3936  }
3937 
3938  do
3939  {
3940  // We will serialize variable size objects from *range_begin to
3941  // *range_end as a sequence of ints in this buffer
3942  std::vector<buffer_t> buffer;
3943 
3944  if (this->rank() == root_id)
3945  range_begin = pack_range
3946  (context1, range_begin, range_end, buffer, approx_buffer_size);
3947 
3948  // this->broadcast(vector) requires the receiving vectors to
3949  // already be the appropriate size
3950  std::size_t buffer_size = buffer.size();
3951  this->broadcast (buffer_size, root_id);
3952 
3953  // We continue until there's nothing left to broadcast
3954  if (!buffer_size)
3955  break;
3956 
3957  buffer.resize(buffer_size);
3958 
3959  // Broadcast the packed data
3960  this->broadcast (buffer, root_id);
3961 
3962  if (this->rank() != root_id)
3963  unpack_range
3964  (buffer, context2, out_iter, (T*)nullptr);
3965  } while (true); // break above when we reach buffer_size==0
3966 }
processor_id_type rank() const
Definition: communicator.h:206
void unpack_range(const std::vector< buffertype > &buffer, Context *context, OutputIter out_iter, const T *)
Helper function for range unpacking.
Definition: packing.h:1092
processor_id_type size() const
Definition: communicator.h:208
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
void broadcast(T &data, const unsigned int root_id=0, const bool identical_sizes=false) const
Take a local value and broadcast it to all processors.
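
The pack / packable_size / packed_size contract described above can be sketched serially with std::string as the example type. These free functions are stand-ins for the TIMPI::Packing machinery, not its actual interface; they show how a length header lets the unpacker advance from one object to the next in a flat int buffer:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// packable_size: buffer entries needed for one object (header + chars).
std::size_t packable_size(const std::string & s)
{ return 1 + s.size(); }

// pack: serialize one object onto the end of the data vector.
void pack(const std::string & s, std::vector<int> & data)
{
  data.push_back(static_cast<int>(s.size()));
  for (char c : s)
    data.push_back(c);
}

// packed_size: given an iterator at an object's header, how many
// entries to advance to reach the beginning of the next object.
std::size_t packed_size(std::vector<int>::const_iterator at)
{ return 1 + static_cast<std::size_t>(*at); }

// unpack one object, returning an iterator just past it.
std::vector<int>::const_iterator
unpack(std::vector<int>::const_iterator at, std::string & out)
{
  std::size_t len = static_cast<std::size_t>(*at);
  out.assign(at + 1, at + 1 + len);
  return at + 1 + len;
}
```

packable_size lets the sender reserve the buffer up front, while packed_size lets the receiver walk the buffer without knowing the object contents; the round trip below checks that the two stay consistent.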

◆ clear()

void TIMPI::Communicator::clear ( )

Free and reset this communicator.

Definition at line 162 of file communicator.C.

References _communicator, and _I_duped_it.

Referenced by operator=(), split(), split_by_type(), and ~Communicator().

162  {
163 #ifdef TIMPI_HAVE_MPI
164  if (_I_duped_it)
165  {
166  timpi_assert (_communicator != MPI_COMM_NULL);
167  timpi_call_mpi
168  (MPI_Comm_free(&_communicator));
169 
170  _communicator = MPI_COMM_NULL;
171  }
172  _I_duped_it = false;
173 #endif
174 }
communicator _communicator
Definition: communicator.h:229

◆ dereference_unique_tag()

void TIMPI::Communicator::dereference_unique_tag ( int  tagvalue) const

Dereference an already-acquired tag, and see if we can re-release it.

Definition at line 46 of file communicator.C.

References used_tag_values.

Referenced by TIMPI::MessageTag::operator=(), and TIMPI::MessageTag::~MessageTag().

47 {
48  // This had better be an already-acquired tag.
49  timpi_assert(used_tag_values.count(tagvalue));
50 
51  used_tag_values[tagvalue]--;
52  // If we don't have any more outstanding references, we
53  // don't even need to keep this tag in our "used" set.
54  if (!used_tag_values[tagvalue])
55  used_tag_values.erase(tagvalue);
56 }
std::map< int, unsigned int > used_tag_values
Definition: communicator.h:236

◆ duplicate() [1/2]

void TIMPI::Communicator::duplicate ( const Communicator comm)

Definition at line 137 of file communicator.C.

References _communicator, send_mode(), and sync_type().

Referenced by testGetUniqueTagAuto(), and testStringSyncType().

138 {
139  this->duplicate(comm._communicator);
140  this->send_mode(comm.send_mode());
141  this->sync_type(comm.sync_type());
142 }
SyncType sync_type() const
Gets the user-requested SyncType.
Definition: communicator.h:348
SendMode send_mode() const
Gets the user-requested SendMode.
Definition: communicator.h:331
void duplicate(const Communicator &comm)
Definition: communicator.C:137

◆ duplicate() [2/2]

void TIMPI::Communicator::duplicate ( const communicator comm)

Definition at line 146 of file communicator.C.

References _communicator, _I_duped_it, and assign().

147 {
148  if (_communicator != MPI_COMM_NULL)
149  {
150  timpi_call_mpi
151  (MPI_Comm_dup(comm, &_communicator));
152 
153  _I_duped_it = true;
154  }
155  this->assign(_communicator);
156 }
void assign(const communicator &comm)
Utility function for setting our member variables from an MPI communicator.
Definition: communicator.C:185

◆ gather() [1/4]

template<typename T , typename A >
void TIMPI::Communicator::gather ( const unsigned int  root_id,
const T &  send_data,
std::vector< T, A > &  recv 
) const
inline

Take a local value send_data, and on processor root_id fill in recv[processor_id] = the value of send_data on processor processor_id, leaving recv with length comm.size() on the root.

Definition at line 3031 of file parallel_implementation.h.

References rank(), and size().

Referenced by gather(), gather_packed_range(), testGather(), testGatherString(), and testGatherString2().

3034 {
3035  timpi_assert_less (root_id, this->size());
3036 
3037  if (this->rank() == root_id)
3038  recv.resize(this->size());
3039 
3040  if (this->size() > 1)
3041  {
3042  TIMPI_LOG_SCOPE("gather()", "Parallel");
3043 
3044  StandardType<T> send_type(&sendval);
3045 
3046  timpi_assert_less(root_id, this->size());
3047 
3048  timpi_call_mpi
3049  (MPI_Gather(const_cast<T*>(&sendval), 1, send_type,
3050  recv.empty() ? nullptr : recv.data(), 1, send_type,
3051  root_id, this->get()));
3052  }
3053  else
3054  recv[0] = sendval;
3055 }
processor_id_type rank() const
Definition: communicator.h:206
processor_id_type size() const
Definition: communicator.h:208

◆ gather() [2/4]

template<typename T , typename A >
void TIMPI::Communicator::gather ( const unsigned int  root_id,
const std::basic_string< T > &  send_data,
std::vector< std::basic_string< T >, A > &  recv_data,
const bool  identical_buffer_sizes = false 
) const
inline

The gather overload for string types has an optional identical_buffer_sizes optimization for when all strings are the same length.

Definition at line 3130 of file parallel_implementation.h.

References gather(), rank(), and size().

3134 {
3135  timpi_assert_less (root_id, this->size());
3136 
3137  if (this->rank() == root_id)
3138  recv.resize(this->size());
3139 
3140  if (this->size() > 1)
3141  {
3142  TIMPI_LOG_SCOPE ("gather()","Parallel");
3143 
3144  std::vector<int>
3145  sendlengths (this->size(), 0),
3146  displacements(this->size(), 0);
3147 
3148  const int mysize = static_cast<int>(sendval.size());
3149 
3150  if (identical_buffer_sizes)
3151  sendlengths.assign(this->size(), mysize);
3152  else
3153  // first comm step to determine buffer sizes from all processors
3154  this->gather(root_id, mysize, sendlengths);
3155 
3156  // Find the total size of the final array and
3157  // set up the displacement offsets for each processor
3158  unsigned int globalsize = 0;
3159  for (unsigned int i=0; i < this->size(); ++i)
3160  {
3161  displacements[i] = globalsize;
3162  globalsize += sendlengths[i];
3163  }
3164 
3165  // monolithic receive buffer
3166  std::basic_string<T> r;
3167  if (this->rank() == root_id)
3168  r.resize(globalsize, 0);
3169 
3170  timpi_assert_less(root_id, this->size());
3171 
3172  // and get the data from the remote processors.
3173  timpi_call_mpi
3174  (MPI_Gatherv (const_cast<T*>(sendval.data()),
3175  mysize, StandardType<T>(),
3176  this->rank() == root_id ? &r[0] : nullptr,
3177  sendlengths.data(), displacements.data(),
3178  StandardType<T>(), root_id, this->get()));
3179 
3180  // slice receive buffer up
3181  if (this->rank() == root_id)
3182  for (unsigned int i=0; i != this->size(); ++i)
3183  recv[i] = r.substr(displacements[i], sendlengths[i]);
3184  }
3185  else
3186  recv[0] = sendval;
3187 }
void gather(const unsigned int root_id, const T &send_data, std::vector< T, A > &recv) const
Take a vector of length comm.size(), and on processor root_id fill in recv[processor_id] = the value ...
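The displacement bookkeeping around MPI_Gatherv can be illustrated serially. The sketch below (gather_strings is an illustrative name, not TIMPI API) concatenates per-rank strings into one monolithic buffer and slices it back apart with the same offsets, mirroring what the root does above:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Serial sketch of the MPI_Gatherv bookkeeping in the string gather():
// displacement of rank i = sum of the lengths of ranks 0..i-1.
std::vector<std::string>
gather_strings(const std::vector<std::string> & per_rank)
{
  const std::size_t nranks = per_rank.size();
  std::vector<int> sendlengths(nranks, 0), displacements(nranks, 0);

  unsigned int globalsize = 0;
  for (std::size_t i = 0; i != nranks; ++i)
    {
      sendlengths[i] = static_cast<int>(per_rank[i].size());
      displacements[i] = globalsize;
      globalsize += sendlengths[i];
    }

  // Monolithic receive buffer, as MPI_Gatherv would fill it on root.
  std::string r;
  r.reserve(globalsize);
  for (const auto & s : per_rank)
    r += s;

  // Slice the receive buffer back up, as done on root_id.
  std::vector<std::string> recv(nranks);
  for (std::size_t i = 0; i != nranks; ++i)
    recv[i] = r.substr(displacements[i], sendlengths[i]);
  return recv;
}
```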

◆ gather() [3/4]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
void TIMPI::Communicator::gather ( const unsigned int  root_id,
std::vector< T, A > &  r 
) const
inline


Take a vector of local variables and expand it on processor root_id to include values from all processors

This handles the case where the lengths of the vectors may vary. Specifically, this function transforms this:

* Processor 0: [ ... N_0 ]
* Processor 1: [ ....... N_1 ]
* ...
* Processor M: [ .. N_M]
* 

into this:

* [ [ ... N_0 ] [ ....... N_1 ] ... [ .. N_M] ]
* 

on processor root_id. This function is collective and therefore must be called by all processors in the Communicator.

If the type T is a standard (fixed-size) type then we use a standard MPI_Gatherv; if it is a packable variable-size type then we dispatch to gather_packed_range.

Definition at line 3061 of file parallel_implementation.h.

References allgather(), rank(), and size().

3063 {
3064  if (this->size() == 1)
3065  {
3066  timpi_assert (!this->rank());
3067  timpi_assert (!root_id);
3068  return;
3069  }
3070 
3071  timpi_assert_less (root_id, this->size());
3072 
3073  std::vector<int>
3074  sendlengths (this->size(), 0),
3075  displacements(this->size(), 0);
3076 
3077  const int mysize = static_cast<int>(r.size());
3078  this->allgather(mysize, sendlengths);
3079 
3080  TIMPI_LOG_SCOPE("gather()", "Parallel");
3081 
3082  // Find the total size of the final array and
3083  // set up the displacement offsets for each processor.
3084  unsigned int globalsize = 0;
3085  for (unsigned int i=0; i != this->size(); ++i)
3086  {
3087  displacements[i] = globalsize;
3088  globalsize += sendlengths[i];
3089  }
3090 
3091  // Check for quick return
3092  if (globalsize == 0)
3093  return;
3094 
3095  // copy the input buffer
3096  std::vector<T,A> r_src(r);
3097 
3098  // now resize it to hold the global data
3099  // on the receiving processor
3100  if (root_id == this->rank())
3101  r.resize(globalsize);
3102 
3103  timpi_assert_less(root_id, this->size());
3104 
3105  // and get the data from the remote processors
3106  timpi_call_mpi
3107  (MPI_Gatherv (r_src.empty() ? nullptr : r_src.data(), mysize,
3108  StandardType<T>(), r.empty() ? nullptr : r.data(),
3109  sendlengths.data(), displacements.data(),
3110  StandardType<T>(), root_id, this->get()));
3111 }
void allgather(const T &send_data, std::vector< T, A > &recv_data) const
Take a vector of length this->size(), and fill in recv[processor_id] = the value of send on that proc...

◆ gather() [4/4]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
void TIMPI::Communicator::gather ( const unsigned int  root_id,
std::vector< T, A > &  r 
) const
inline

◆ gather_packed_range()

template<typename Context , typename Iter , typename OutputIter >
void TIMPI::Communicator::gather_packed_range ( const unsigned int  root_id,
Context *  context,
Iter  range_begin,
const Iter  range_end,
OutputIter  out,
std::size_t  approx_buffer_size = 1000000 
) const
inline

Take a range of local variables, combine it with ranges from all processors, and write the output to the output iterator on rank root.

The approximate maximum size (in entries; number of bytes will likely be 4x or 8x larger) to use in a single data vector buffer to send can be specified for performance or memory usage reasons; if the range cannot be packed into a single buffer of this size then multiple buffers and messages will be used.

Note that the received data vector sizes will be the sum of the sent vector sizes; a smaller-than-default size may be useful for users on many processors, in cases where all-to-one communication cannot be avoided entirely.

Definition at line 3970 of file parallel_implementation.h.

References gather(), max(), TIMPI::pack_range(), and TIMPI::unpack_range().

3976 {
3977  typedef typename std::iterator_traits<Iter>::value_type T;
3978  typedef typename Packing<T>::buffer_type buffer_t;
3979 
3980  bool nonempty_range = (range_begin != range_end);
3981  this->max(nonempty_range);
3982 
3983  while (nonempty_range)
3984  {
3985  // We will serialize variable size objects from *range_begin to
3986  // *range_end as a sequence of ints in this buffer
3987  std::vector<buffer_t> buffer;
3988 
3989  range_begin = pack_range
3990  (context, range_begin, range_end, buffer, approx_buffer_size);
3991 
3992  this->gather(root_id, buffer);
3993 
3994  unpack_range
3995  (buffer, context, out_iter, (T*)(nullptr));
3996 
3997  nonempty_range = (range_begin != range_end);
3998  this->max(nonempty_range);
3999  }
4000 }
void unpack_range(const std::vector< buffertype > &buffer, Context *context, OutputIter out_iter, const T *)
Helper function for range unpacking.
Definition: packing.h:1092
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
void max(const T &r, T &o, Request &req) const
Non-blocking maximum of the local value r into o with the request req.
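The chunking loop can be sketched serially; the pack/gather/unpack trio collapses to a plain copy here, and gather_in_chunks is an illustrative name, not TIMPI API:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Serial sketch of the gather_packed_range() loop: pack at most
// approx_buffer_size entries per pass, then keep looping until the
// whole input range has been consumed.
std::vector<int> gather_in_chunks(const std::vector<int> & range,
                                  std::size_t approx_buffer_size)
{
  std::vector<int> out;
  std::size_t pos = 0;
  bool nonempty_range = (pos != range.size());

  while (nonempty_range)
    {
      // "pack" up to approx_buffer_size entries into a buffer ...
      std::vector<int> buffer;
      while (pos != range.size() && buffer.size() < approx_buffer_size)
        buffer.push_back(range[pos++]);

      // ... a real implementation would gather() the buffer to root
      // here; serially we just "unpack" it into the output.
      out.insert(out.end(), buffer.begin(), buffer.end());

      nonempty_range = (pos != range.size());
    }
  return out;
}
```

In the parallel version the loop condition is the max() over all ranks, so every processor keeps participating in the collective until the longest range is exhausted.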

◆ get() [1/2]

communicator& TIMPI::Communicator::get ( )
inline

Definition at line 165 of file communicator.h.

References _communicator.

Referenced by TIMPI::PostWaitUnpackNestedBuffer< Container >::run(), and testMPIULongMin().

165 { return _communicator; }

◆ get() [2/2]

const communicator& TIMPI::Communicator::get ( ) const
inline

Definition at line 167 of file communicator.h.

References _communicator.

167 { return _communicator; }

◆ get_unique_tag()

MessageTag TIMPI::Communicator::get_unique_tag ( int  tagvalue = MessageTag::invalid_tag) const

Get a tag that is unique to this Communicator.

A requested tag value may be provided. If no request is made then an automatic unique tag value will be generated; such usage of get_unique_tag() must be done on every processor in a consistent order.

Note
If people are also using magic numbers or copying raw communicators around then we can't guarantee the tag is unique to this MPI_Comm.
Leaving tagvalue unspecified is recommended in most cases. Manually selecting tag values is dangerous, as tag values may be freed and reselected earlier than expected in asynchronous communication algorithms.

Definition at line 251 of file communicator.C.

References _max_tag, _next_tag, TIMPI::MessageTag::invalid_tag, max(), and used_tag_values.

Referenced by TIMPI::pull_parallel_vector_data(), TIMPI::detail::push_parallel_alltoall_helper(), TIMPI::detail::push_parallel_nbx_helper(), TIMPI::detail::push_parallel_roundrobin_helper(), testGetUniqueTagAuto(), testGetUniqueTagManual(), and testGetUniqueTagOverlap().

252 {
253  if (tagvalue == MessageTag::invalid_tag)
254  {
255 #ifndef NDEBUG
256  // Automatic tag values have to be requested in sync
257  int maxval = _next_tag;
258  this->max(maxval);
259  timpi_assert_equal_to(_next_tag, maxval);
260 #endif
261  tagvalue = _next_tag++;
262  }
263 
264  if (used_tag_values.count(tagvalue))
265  {
266  // Get the largest value in the used values, and pick one
267  // larger
268  tagvalue = used_tag_values.rbegin()->first+1;
269  timpi_assert(!used_tag_values.count(tagvalue));
270  }
271  if (tagvalue >= _next_tag)
272  _next_tag = tagvalue+1;
273 
274  if (_next_tag >= _max_tag)
275  _next_tag = _max_tag/2;
276 
277  used_tag_values[tagvalue] = 1;
278 
279  return MessageTag(tagvalue, this);
280 }
static const int invalid_tag
Invalid tag, to allow for default construction.
Definition: message_tag.h:53
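The tag-selection arithmetic can be sketched serially; acquire_tag and its parameters are illustrative stand-ins for the member function and the _next_tag/_max_tag members:

```cpp
#include <cassert>
#include <map>

// Serial sketch of get_unique_tag(): bump past any value already in
// use, advance next_tag, and wrap it back to max_tag/2 before it
// leaves the valid tag range.
int acquire_tag(std::map<int, unsigned int> & used,
                int & next_tag, const int max_tag, int tagvalue)
{
  const int invalid_tag = -1;  // stand-in for MessageTag::invalid_tag
  if (tagvalue == invalid_tag)
    tagvalue = next_tag++;     // automatic tags are handed out in order

  if (used.count(tagvalue))    // requested value taken: pick one past
    tagvalue = used.rbegin()->first + 1;  // the largest used value

  if (tagvalue >= next_tag)
    next_tag = tagvalue + 1;
  if (next_tag >= max_tag)     // recycle the lower half of the range
    next_tag = max_tag / 2;

  used[tagvalue] = 1;          // one outstanding reference
  return tagvalue;
}
```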

◆ map_broadcast() [1/4]

template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type = 0>
void TIMPI::Communicator::map_broadcast ( Map &  data,
const unsigned int  root_id,
const bool  identical_sizes 
) const
private

Private implementation function called by the map-based broadcast() specializations.

This is_fixed_type variant saves a communication by broadcasting pairs

◆ map_broadcast() [2/4]

template<typename Map >
void TIMPI::Communicator::map_broadcast ( Map &  data,
const unsigned int  root_id,
const bool  identical_sizes 
) const
private

Private implementation function called by the map-based broadcast() specializations.

This !is_fixed_type variant makes two broadcasts, which is slower but gives more control over each broadcast (e.g. we may need to specialize for either key_type or mapped_type)

◆ map_broadcast() [3/4]

template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type >
void TIMPI::Communicator::map_broadcast ( Map &  timpi_mpi_vardata,
const unsigned int  root_id,
const bool   timpi_mpi_varidentical_sizes 
) const
inline

Definition at line 3795 of file parallel_implementation.h.

References broadcast(), TIMPI::ignore(), rank(), size(), and verify().

3798 {
3799  ignore(root_id); // Only needed for MPI and/or dbg/devel
3800  if (this->size() == 1)
3801  {
3802  timpi_assert (!this->rank());
3803  timpi_assert (!root_id);
3804  return;
3805  }
3806 
3807 #ifdef TIMPI_HAVE_MPI
3808  timpi_assert_less (root_id, this->size());
3809  timpi_assert (this->verify(identical_sizes));
3810 
3811  TIMPI_LOG_SCOPE("broadcast(map)", "Parallel");
3812 
3813  std::size_t data_size=data.size();
3814  if (identical_sizes)
3815  timpi_assert(this->verify(data_size));
3816  else
3817  this->broadcast(data_size, root_id);
3818 
3819  std::vector<std::pair<typename Map::key_type,
3820  typename Map::mapped_type>> comm_data;
3821 
3822  if (root_id == this->rank())
3823  comm_data.assign(data.begin(), data.end());
3824  else
3825  comm_data.resize(data_size);
3826 
3827  this->broadcast(comm_data, root_id, true);
3828 
3829  if (this->rank() != root_id)
3830  {
3831  data.clear();
3832  data.insert(comm_data.begin(), comm_data.end());
3833  }
3834 #endif
3835 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
void ignore(const Args &...)
Definition: timpi_assert.h:54
void broadcast(T &data, const unsigned int root_id=0, const bool identical_sizes=false) const
Take a local value and broadcast it to all processors.

◆ map_broadcast() [4/4]

template<typename Map >
void TIMPI::Communicator::map_broadcast ( Map &  timpi_mpi_vardata,
const unsigned int  root_id,
const bool   timpi_mpi_varidentical_sizes 
) const
inline

Definition at line 3841 of file parallel_implementation.h.

References TIMPI::ignore().

3844 {
3845  ignore(root_id); // Only needed for MPI and/or dbg/devel
3846  if (this->size() == 1)
3847  {
3848  timpi_assert (!this->rank());
3849  timpi_assert (!root_id);
3850  return;
3851  }
3852 
3853 #ifdef TIMPI_HAVE_MPI
3854  timpi_assert_less (root_id, this->size());
3855  timpi_assert (this->verify(identical_sizes));
3856 
3857  TIMPI_LOG_SCOPE("broadcast()", "Parallel");
3858 
3859  std::size_t data_size=data.size();
3860  if (identical_sizes)
3861  timpi_assert(this->verify(data_size));
3862  else
3863  this->broadcast(data_size, root_id);
3864 
3865  std::vector<typename Map::key_type> pair_first; pair_first.reserve(data_size);
3866  std::vector<typename Map::mapped_type> pair_second; pair_second.reserve(data_size);
3867 
3868  if (root_id == this->rank())
3869  {
3870  for (const auto & pr : data)
3871  {
3872  pair_first.push_back(pr.first);
3873  pair_second.push_back(pr.second);
3874  }
3875  }
3876  else
3877  {
3878  pair_first.resize(data_size);
3879  pair_second.resize(data_size);
3880  }
3881 
3882  this->broadcast
3883  (pair_first, root_id,
3884  StandardType<typename Map::key_type>::is_fixed_type);
3885  this->broadcast
3886  (pair_second, root_id,
3887  StandardType<typename Map::mapped_type>::is_fixed_type);
3888 
3889  timpi_assert(pair_first.size() == pair_second.size());
3890 
3891  if (this->rank() != root_id)
3892  {
3893  data.clear();
3894  for (std::size_t i=0; i<pair_first.size(); ++i)
3895  data[pair_first[i]] = pair_second[i];
3896  }
3897 #endif
3898 }
static const bool is_fixed_type
Definition: data_type.h:130

◆ map_max() [1/2]

template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type >
void TIMPI::Communicator::map_max ( Map &  data) const
private

Private implementation function called by the map-based max() specializations.

This is_fixed_type variant saves a communication by broadcasting pairs

Definition at line 2407 of file parallel_implementation.h.

References allgather(), and size().

2408 {
2409  if (this->size() > 1)
2410  {
2411  TIMPI_LOG_SCOPE("max(map)", "Parallel");
2412 
2413  // Since the input map may have different keys on different
2414  // processors, we first gather all the keys and values, then for
2415  // each key we choose the max value over all procs. We
2416  // initialize the max with the first value we encounter rather
2417  // than some "global" minimum, since the latter is difficult to
2418  // do generically.
2419  std::vector<std::pair<typename Map::key_type, typename Map::mapped_type>>
2420  vecdata(data.begin(), data.end());
2421 
2422  this->allgather(vecdata, /*identical_buffer_sizes=*/false);
2423 
2424  data.clear();
2425 
2426  for (const auto & pr : vecdata)
2427  {
2428  // Attempt to insert this value. If it works, then the value didn't
2429  // already exist and we can go on. If it fails, compute the std::max
2430  // between the current and existing values.
2431  auto result = data.insert(pr);
2432 
2433  bool inserted = result.second;
2434 
2435  if (!inserted)
2436  {
2437  auto it = result.first;
2438  it->second = std::max(it->second, pr.second);
2439  }
2440  }
2441  }
2442 }
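The duplicate-key resolution step above can be sketched serially; merge_max is an illustrative helper operating on (key, value) pairs as allgather() would deliver them:

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Serial sketch of the merge step in map_max(): insert each gathered
// pair and resolve duplicate keys with std::max, initializing each
// max with the first value encountered.
std::map<int, double>
merge_max(const std::vector<std::pair<int, double>> & gathered)
{
  std::map<int, double> data;
  for (const auto & pr : gathered)
    {
      auto result = data.insert(pr);
      if (!result.second)  // key already present: keep the larger value
        result.first->second = std::max(result.first->second, pr.second);
    }
  return data;
}
```

Initializing from the first inserted value avoids needing a generic "global minimum" sentinel for arbitrary mapped types.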

◆ map_max() [2/2]

template<typename Map >
void TIMPI::Communicator::map_max ( Map &  data) const
private

Private implementation function called by the map-based max() specializations.

This !is_fixed_type variant calls allgather twice: once for the keys and once for the values.

Definition at line 2450 of file parallel_implementation.h.

2451 {
2452  if (this->size() > 1)
2453  {
2454  TIMPI_LOG_SCOPE("max(map)", "Parallel");
2455 
2456  // Since the input map may have different keys on different
2457  // processors, we first gather all the keys and values, then for
2458  // each key we choose the max value over all procs. We
2459  // initialize the max with the first value we encounter rather
2460  // than some "global" minimum, since the latter is difficult to
2461  // do generically.
2462  std::vector<typename Map::key_type> keys;
2463  std::vector<typename Map::mapped_type> vals;
2464 
2465  auto data_size = data.size();
2466  keys.reserve(data_size);
2467  vals.reserve(data_size);
2468 
2469  for (const auto & pr : data)
2470  {
2471  keys.push_back(pr.first);
2472  vals.push_back(pr.second);
2473  }
2474 
2475  this->allgather(keys, /*identical_buffer_sizes=*/false);
2476  this->allgather(vals, /*identical_buffer_sizes=*/false);
2477 
2478  data.clear();
2479 
2480  for (std::size_t i=0; i<keys.size(); ++i)
2481  {
2482  // Attempt to emplace this value. If it works, then the value didn't
2483  // already exist and we can go on. If it fails, compute the std::max
2484  // between the current and existing values.
2485  auto pr = data.emplace(keys[i], vals[i]);
2486 
2487  bool emplaced = pr.second;
2488 
2489  if (!emplaced)
2490  {
2491  auto it = pr.first;
2492  it->second = std::max(it->second, vals[i]);
2493  }
2494  }
2495  }
2496 }

◆ map_sum() [1/2]

template<typename Map , typename std::enable_if< std::is_base_of< DataType, StandardType< typename Map::key_type >>::value &&std::is_base_of< DataType, StandardType< typename Map::mapped_type >>::value, int >::type >
void TIMPI::Communicator::map_sum ( Map &  data) const
inlineprivate

Private implementation function called by the map-based sum() specializations.

This is_fixed_type variant saves a communication by broadcasting pairs

Definition at line 2720 of file parallel_implementation.h.

References allgather(), and size().

2721 {
2722  if (this->size() > 1)
2723  {
2724  TIMPI_LOG_SCOPE("sum(map)", "Parallel");
2725 
2726  // There may be different keys on different processors, so we
2727  // first gather all the (key, value) pairs and then insert
2728  // them, summing repeated keys, back into the map.
2729  //
2730  // Note: We don't simply use Map::value_type here because the
2731  // key type is const in that case and we don't have the proper
2732  // StandardType overloads for communicating const types.
2733  std::vector<std::pair<typename Map::key_type, typename Map::mapped_type>>
2734  vecdata(data.begin(), data.end());
2735 
2736  this->allgather(vecdata, /*identical_buffer_sizes=*/false);
2737 
2738  data.clear();
2739  for (const auto & pr : vecdata)
2740  data[pr.first] += pr.second;
2741  }
2742 }
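The map_sum() merge differs from map_max() only in how duplicate keys are combined; a serial sketch (merge_sum is an illustrative helper, not TIMPI API):

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

// Serial sketch of the merge step in map_sum(): duplicate keys from
// different ranks are combined with operator+=; operator[] value-
// initializes a missing entry to zero first.
std::map<int, int>
merge_sum(const std::vector<std::pair<int, int>> & gathered)
{
  std::map<int, int> data;
  for (const auto & pr : gathered)
    data[pr.first] += pr.second;
  return data;
}
```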

◆ map_sum() [2/2]

template<typename Map >
void TIMPI::Communicator::map_sum ( Map &  data) const
inlineprivate

Private implementation function called by the map-based sum() specializations.

This !is_fixed_type variant calls allgather twice: once for the keys and once for the values.

Definition at line 2752 of file parallel_implementation.h.

2753 {
2754  if (this->size() > 1)
2755  {
2756  TIMPI_LOG_SCOPE("sum(map)", "Parallel");
2757 
2758  // There may be different keys on different processors, so we
2759  // first gather all the (key, value) pairs and then insert
2760  // them, summing repeated keys, back into the map.
2761  std::vector<typename Map::key_type> keys;
2762  std::vector<typename Map::mapped_type> vals;
2763 
2764  auto data_size = data.size();
2765  keys.reserve(data_size);
2766  vals.reserve(data_size);
2767 
2768  for (const auto & pr : data)
2769  {
2770  keys.push_back(pr.first);
2771  vals.push_back(pr.second);
2772  }
2773 
2774  this->allgather(keys, /*identical_buffer_sizes=*/false);
2775  this->allgather(vals, /*identical_buffer_sizes=*/false);
2776 
2777  data.clear();
2778 
2779  for (std::size_t i=0; i<keys.size(); ++i)
2780  data[keys[i]] += vals[i];
2781  }
2782 }

◆ max() [1/4]

template<typename T >
void TIMPI::Communicator::max ( const T &  r,
T &  o,
Request req 
) const
inline

Non-blocking maximum of the local value r into o with the request req.

Definition at line 2322 of file parallel_implementation.h.

References TIMPI::Request::get(), TIMPI::Request::null_request, and size().

Referenced by allgather_packed_range(), TIMPI::detail::empty_send_assertion(), gather_packed_range(), get_unique_tag(), TIMPI::detail::push_parallel_roundrobin_helper(), semiverify(), testInfinityMax(), testMapMax(), testMax(), testMaxVecBool(), testNonblockingMax(), testNonFixedTypeMapMax(), and verify().

2325 {
2326  if (this->size() > 1)
2327  {
2328  TIMPI_LOG_SCOPE("max()", "Parallel");
2329 
2330  timpi_call_mpi
2331  (MPI_Iallreduce (&r, &o, 1,
2332  StandardType<T>(&r),
2333  OpFunction<T>::max(),
2334  this->get(),
2335  req.get()));
2336  }
2337  else
2338  {
2339  o = r;
2340  req = Request::null_request;
2341  }
2342 }
static const request null_request
Definition: request.h:111

◆ max() [2/4]

template<typename T >
void TIMPI::Communicator::max ( T &  r) const
inline

Take a local variable and replace it with the maximum of its values on all processors.

Containers are replaced element-wise.

◆ max() [3/4]

template<typename T >
void TIMPI::Communicator::max ( T &  timpi_mpi_varr) const
inline

Definition at line 2346 of file parallel_implementation.h.

References size().

2347 {
2348  if (this->size() > 1)
2349  {
2350  TIMPI_LOG_SCOPE("max(scalar)", "Parallel");
2351 
2352  timpi_call_mpi
2353  (MPI_Allreduce (MPI_IN_PLACE, &r, 1,
2354  StandardType<T>(&r),
2355  OpFunction<T>::max(),
2356  this->get()));
2357  }
2358 }

◆ max() [4/4]

template<typename A >
void TIMPI::Communicator::max ( std::vector< bool, A > &  r) const
inline

Definition at line 2381 of file parallel_implementation.h.

References size(), and verify().

2382 {
2383  if (this->size() > 1 && !r.empty())
2384  {
2385  TIMPI_LOG_SCOPE("max(vector<bool>)", "Parallel");
2386 
2387  timpi_assert(this->verify(r.size()));
2388 
2389  std::vector<unsigned int> ruint;
2390  pack_vector_bool(r, ruint);
2391  std::vector<unsigned int> temp(ruint.size());
2392  timpi_call_mpi
2393  (MPI_Allreduce (ruint.data(), temp.data(),
2394  cast_int<int>(ruint.size()),
2395  StandardType<unsigned int>(), MPI_BOR,
2396  this->get()));
2397  unpack_vector_bool(temp, r);
2398  }
2399 }
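Why a bitwise OR implements an elementwise bool maximum can be seen in a serial sketch of the pack/reduce/unpack steps (max_bools is illustrative, not TIMPI API; the same packing with MPI_BAND gives the min() overload):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sketch of max(vector<bool>): pack bools into unsigned-int words as
// pack_vector_bool() does, OR the words (the MPI_BOR Allreduce step),
// and unpack. OR of bits is exactly the elementwise maximum of bools.
std::vector<bool> max_bools(const std::vector<bool> & a,
                            const std::vector<bool> & b)
{
  const unsigned int bits = 8 * sizeof(unsigned int);
  auto pack = [bits](const std::vector<bool> & v)
    {
      std::vector<unsigned int> words((v.size() + bits - 1) / bits, 0u);
      for (std::size_t i = 0; i != v.size(); ++i)
        if (v[i])
          words[i / bits] |= (1u << (i % bits));
      return words;
    };

  std::vector<unsigned int> wa = pack(a), wb = pack(b);
  for (std::size_t i = 0; i != wa.size(); ++i)
    wa[i] |= wb[i];  // the MPI_BOR step of the Allreduce

  std::vector<bool> out(a.size());
  for (std::size_t i = 0; i != out.size(); ++i)
    out[i] = (wa[i / bits] >> (i % bits)) & 1u;
  return out;
}
```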

◆ maxloc() [1/3]

template<typename T >
void TIMPI::Communicator::maxloc ( T &  r,
unsigned int &  max_id 
) const
inline

Take a local variable and replace it with the maximum of its values on all processors, returning the minimum rank of a processor which originally held the maximum value.

Definition at line 2519 of file parallel_implementation.h.

References TIMPI::ignore(), TIMPI::DataPlusInt< T >::rank, rank(), size(), and TIMPI::DataPlusInt< T >::val.

Referenced by testMaxloc(), testMaxlocBool(), and testMaxlocDouble().

2521 {
2522  if (this->size() > 1)
2523  {
2524  TIMPI_LOG_SCOPE("maxloc(scalar)", "Parallel");
2525 
2526  DataPlusInt<T> data_in;
2527  ignore(data_in); // unused ifndef TIMPI_HAVE_MPI
2528  data_in.val = r;
2529  data_in.rank = this->rank();
2530 
2531  timpi_call_mpi
2532  (MPI_Allreduce (MPI_IN_PLACE, &data_in, 1,
2533  dataplusint_type_acquire<T>().first,
2534  OpFunction<T>::max_location(), this->get()));
2535  r = data_in.val;
2536  max_id = data_in.rank;
2537  }
2538  else
2539  max_id = this->rank();
2540 }

◆ maxloc() [2/3]

template<typename T , typename A1 , typename A2 >
void TIMPI::Communicator::maxloc ( std::vector< T, A1 > &  r,
std::vector< unsigned int, A2 > &  max_id 
) const
inline

Take a vector of local variables and replace each entry with the maximum of its values on all processors.

Set each max_id entry to the minimum rank where a corresponding maximum was found.

Definition at line 2544 of file parallel_implementation.h.

References rank(), size(), and verify().

2546 {
2547  if (this->size() > 1 && !r.empty())
2548  {
2549  TIMPI_LOG_SCOPE("maxloc(vector)", "Parallel");
2550 
2551  timpi_assert(this->verify(r.size()));
2552 
2553  std::vector<DataPlusInt<T>> data_in(r.size());
2554  for (std::size_t i=0; i != r.size(); ++i)
2555  {
2556  data_in[i].val = r[i];
2557  data_in[i].rank = this->rank();
2558  }
2559  std::vector<DataPlusInt<T>> data_out(r.size());
2560 
2561  timpi_call_mpi
2562  (MPI_Allreduce (data_in.data(), data_out.data(),
2563  cast_int<int>(r.size()),
2564  dataplusint_type_acquire<T>().first,
2565  OpFunction<T>::max_location(),
2566  this->get()));
2567  for (std::size_t i=0; i != r.size(); ++i)
2568  {
2569  r[i] = data_out[i].val;
2570  max_id[i] = data_out[i].rank;
2571  }
2572  }
2573  else if (!r.empty())
2574  {
2575  for (std::size_t i=0; i != r.size(); ++i)
2576  max_id[i] = this->rank();
2577  }
2578 }

◆ maxloc() [3/3]

template<typename A1 , typename A2 >
void TIMPI::Communicator::maxloc ( std::vector< bool, A1 > &  r,
std::vector< unsigned int, A2 > &  max_id 
) const
inline

Definition at line 2582 of file parallel_implementation.h.

References rank(), size(), and verify().

2584 {
2585  if (this->size() > 1 && !r.empty())
2586  {
2587  TIMPI_LOG_SCOPE("maxloc(vector<bool>)", "Parallel");
2588 
2589  timpi_assert(this->verify(r.size()));
2590 
2591  std::vector<DataPlusInt<int>> data_in(r.size());
2592  for (std::size_t i=0; i != r.size(); ++i)
2593  {
2594  data_in[i].val = r[i];
2595  data_in[i].rank = this->rank();
2596  }
2597  std::vector<DataPlusInt<int>> data_out(r.size());
2598  timpi_call_mpi
2599  (MPI_Allreduce (data_in.data(), data_out.data(),
2600  cast_int<int>(r.size()),
2601  StandardType<int>(),
2602  OpFunction<int>::max_location(),
2603  this->get()));
2604  for (std::size_t i=0; i != r.size(); ++i)
2605  {
2606  r[i] = data_out[i].val;
2607  max_id[i] = data_out[i].rank;
2608  }
2609  }
2610  else if (!r.empty())
2611  {
2612  for (std::size_t i=0; i != r.size(); ++i)
2613  max_id[i] = this->rank();
2614  }
2615 }

◆ min() [1/4]

template<typename T >
void TIMPI::Communicator::min ( const T &  r,
T &  o,
Request req 
) const
inline

Non-blocking minimum of the local value r into o with the request req.

Definition at line 2142 of file parallel_implementation.h.

References TIMPI::Request::get(), TIMPI::Request::null_request, and size().

Referenced by semiverify(), testInfinityMin(), testMin(), testMinLarge(), testMinVecBool(), testNonblockingMin(), testNonblockingTest(), testNonblockingWaitany(), and verify().

2145 {
2146  if (this->size() > 1)
2147  {
2148  TIMPI_LOG_SCOPE("min()", "Parallel");
2149 
2150  timpi_call_mpi
2151  (MPI_Iallreduce (&r, &o, 1,
2152  StandardType<T>(&r),
2153  OpFunction<T>::min(),
2154  this->get(),
2155  req.get()));
2156  }
2157  else
2158  {
2159  o = r;
2160  req = Request::null_request;
2161  }
2162 }
static const request null_request
Definition: request.h:111
processor_id_type size() const
Definition: communicator.h:208

◆ min() [2/4]

template<typename T >
void TIMPI::Communicator::min ( T &  r) const
inline

Take a local variable and replace it with the minimum of its values on all processors.

Containers are replaced element-wise.

◆ min() [3/4]

template<typename T >
void TIMPI::Communicator::min ( T &  timpi_mpi_varr) const
inline

Definition at line 2167 of file parallel_implementation.h.

References size().

2168 {
2169  if (this->size() > 1)
2170  {
2171  TIMPI_LOG_SCOPE("min(scalar)", "Parallel");
2172 
2173  timpi_call_mpi
2174  (MPI_Allreduce (MPI_IN_PLACE, &r, 1,
2175  StandardType<T>(&r), OpFunction<T>::min(),
2176  this->get()));
2177  }
2178 }
processor_id_type size() const
Definition: communicator.h:208

◆ min() [4/4]

template<typename A >
void TIMPI::Communicator::min ( std::vector< bool, A > &  r) const
inline

Definition at line 2202 of file parallel_implementation.h.

References size(), and verify().

2203 {
2204  if (this->size() > 1 && !r.empty())
2205  {
2206  TIMPI_LOG_SCOPE("min(vector<bool>)", "Parallel");
2207 
2208  timpi_assert(this->verify(r.size()));
2209 
2210  std::vector<unsigned int> ruint;
2211  pack_vector_bool(r, ruint);
2212  std::vector<unsigned int> temp(ruint.size());
2213  timpi_call_mpi
2214  (MPI_Allreduce (ruint.data(), temp.data(),
2215  cast_int<int>(ruint.size()),
2216  StandardType<unsigned int>(), MPI_BAND,
2217  this->get()));
2218  unpack_vector_bool(temp, r);
2219  }
2220 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
processor_id_type size() const
Definition: communicator.h:208

◆ minloc() [1/3]

template<typename T >
void TIMPI::Communicator::minloc ( T &  r,
unsigned int &  min_id 
) const
inline

Take a local variable and replace it with the minimum of its values on all processors, returning the minimum rank of a processor which originally held the minimum value.

Definition at line 2224 of file parallel_implementation.h.

References TIMPI::ignore(), TIMPI::DataPlusInt< T >::rank, rank(), size(), and TIMPI::DataPlusInt< T >::val.

Referenced by testMinloc(), testMinlocBool(), and testMinlocDouble().

2226 {
2227  if (this->size() > 1)
2228  {
2229  TIMPI_LOG_SCOPE("minloc(scalar)", "Parallel");
2230 
2231  DataPlusInt<T> data_in;
2232  ignore(data_in); // unused ifndef TIMPI_HAVE_MPI
2233  data_in.val = r;
2234  data_in.rank = this->rank();
2235 
2236  timpi_call_mpi
2237  (MPI_Allreduce (MPI_IN_PLACE, &data_in, 1,
2238  dataplusint_type_acquire<T>().first,
2239  OpFunction<T>::min_location(), this->get()));
2240  r = data_in.val;
2241  min_id = data_in.rank;
2242  }
2243  else
2244  min_id = this->rank();
2245 }
processor_id_type rank() const
Definition: communicator.h:206
void ignore(const Args &...)
Definition: timpi_assert.h:54
processor_id_type size() const
Definition: communicator.h:208

◆ minloc() [2/3]

template<typename T , typename A1 , typename A2 >
void TIMPI::Communicator::minloc ( std::vector< T, A1 > &  r,
std::vector< unsigned int, A2 > &  min_id 
) const
inline

Take a vector of local variables and replace each entry with the minimum of its values on all processors.

Set each min_id entry to the minimum rank where a corresponding minimum was found.

Definition at line 2249 of file parallel_implementation.h.

References rank(), size(), and verify().

2251 {
2252  if (this->size() > 1 && !r.empty())
2253  {
2254  TIMPI_LOG_SCOPE("minloc(vector)", "Parallel");
2255 
2256  timpi_assert(this->verify(r.size()));
2257 
2258  std::vector<DataPlusInt<T>> data_in(r.size());
2259  for (std::size_t i=0; i != r.size(); ++i)
2260  {
2261  data_in[i].val = r[i];
2262  data_in[i].rank = this->rank();
2263  }
2264  std::vector<DataPlusInt<T>> data_out(r.size());
2265 
2266  timpi_call_mpi
2267  (MPI_Allreduce (data_in.data(), data_out.data(),
2268  cast_int<int>(r.size()),
2269  dataplusint_type_acquire<T>().first,
2270  OpFunction<T>::min_location(), this->get()));
2271  for (std::size_t i=0; i != r.size(); ++i)
2272  {
2273  r[i] = data_out[i].val;
2274  min_id[i] = data_out[i].rank;
2275  }
2276  }
2277  else if (!r.empty())
2278  {
2279  for (std::size_t i=0; i != r.size(); ++i)
2280  min_id[i] = this->rank();
2281  }
2282 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
processor_id_type rank() const
Definition: communicator.h:206
processor_id_type size() const
Definition: communicator.h:208

◆ minloc() [3/3]

template<typename A1 , typename A2 >
void TIMPI::Communicator::minloc ( std::vector< bool, A1 > &  r,
std::vector< unsigned int, A2 > &  min_id 
) const
inline

Definition at line 2286 of file parallel_implementation.h.

References rank(), size(), and verify().

2288 {
2289  if (this->size() > 1 && !r.empty())
2290  {
2291  TIMPI_LOG_SCOPE("minloc(vector<bool>)", "Parallel");
2292 
2293  timpi_assert(this->verify(r.size()));
2294 
2295  std::vector<DataPlusInt<int>> data_in(r.size());
2296  for (std::size_t i=0; i != r.size(); ++i)
2297  {
2298  data_in[i].val = r[i];
2299  data_in[i].rank = this->rank();
2300  }
2301  std::vector<DataPlusInt<int>> data_out(r.size());
2302  timpi_call_mpi
2303  (MPI_Allreduce (data_in.data(), data_out.data(),
2304  cast_int<int>(r.size()),
2305  StandardType<int>(),
2306  OpFunction<int>::min_location(), this->get()));
2307  for (std::size_t i=0; i != r.size(); ++i)
2308  {
2309  r[i] = data_out[i].val;
2310  min_id[i] = data_out[i].rank;
2311  }
2312  }
2313  else if (!r.empty())
2314  {
2315  for (std::size_t i=0; i != r.size(); ++i)
2316  min_id[i] = this->rank();
2317  }
2318 }
bool verify(const T &r) const
Verify that a local variable has the same value on all processors.
processor_id_type rank() const
Definition: communicator.h:206
processor_id_type size() const
Definition: communicator.h:208

◆ nonblocking_barrier()

void TIMPI::Communicator::nonblocking_barrier ( Request req) const

Start a barrier that doesn't block.

Definition at line 238 of file communicator.C.

References TIMPI::Request::get(), and size().

Referenced by TIMPI::detail::push_parallel_nbx_helper().

239 {
240  if (this->size() > 1)
241  {
242  TIMPI_LOG_SCOPE("nonblocking_barrier()", "Communicator");
243  timpi_call_mpi(MPI_Ibarrier (this->get(), req.get()));
244  }
245 }
processor_id_type size() const
Definition: communicator.h:208

◆ nonblocking_receive_packed_range() [1/3]

template<typename Context , typename OutputIter , typename T >
void TIMPI::Communicator::nonblocking_receive_packed_range ( const unsigned int  src_processor_id,
Context *  context,
OutputIter  out,
const T *  output_type,
Request req,
Status stat,
const MessageTag tag = any_tag 
) const
inline

Non-blocking receive of a range of pointers from one processor.


This is meant to receive messages from nonblocking_send_packed_range.

Similar in design to the above receive_packed_range. However, this version requires a Request and a Status.

The Status must be a positively tested Status for a message of this type (i.e. a message does exist). It should most likely be generated by Communicator::packed_range_probe.

Definition at line 1255 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), receive(), and TIMPI::Status::size().

Referenced by possibly_receive_packed_range(), and TIMPI::push_parallel_packed_range().

1262 {
1263  typedef typename Packing<T>::buffer_type buffer_t;
1264 
1265  // Receive serialized variable size objects as a sequence of
1266  // buffer_t.
1267  // Allocate a buffer on the heap so we don't have to free it until
1268  // after the Request::wait()
1269  std::vector<buffer_t> * buffer = new std::vector<buffer_t>(stat.size());
1270  this->receive(src_processor_id, *buffer, req, tag);
1271 
1272  // Make the Request::wait() handle unpacking the buffer
1273  req.add_post_wait_work
1274  (new PostWaitUnpackBuffer<std::vector<buffer_t>, Context, OutputIter, T>(*buffer, context, out));
1275 
1276  // Make the Request::wait() then handle deleting the buffer
1277  req.add_post_wait_work
1278  (new PostWaitDeleteBuffer<std::vector<buffer_t>>(buffer));
1279 
1280  // The MessageTag should stay registered for the Request lifetime
1281  req.add_post_wait_work
1282  (new PostWaitDereferenceTag(tag));
1283 }
Status receive(const unsigned int dest_processor_id, T &buf, const MessageTag &tag=any_tag) const
Blocking-receive from one processor with data-defined type.

◆ nonblocking_receive_packed_range() [2/3]

template<typename Context , typename OutputIter , typename T >
void TIMPI::Communicator::nonblocking_receive_packed_range ( const unsigned int  src_processor_id,
Context *  context,
OutputIter  out,
const T *  output_type,
Request req,
Status stat,
std::shared_ptr< std::vector< typename TIMPI::Packing< T >::buffer_type >> &  buffer,
const MessageTag tag = any_tag 
) const
inline

Non-blocking receive of a range of pointers from one processor.

This is meant to receive messages from nonblocking_send_packed_range.

Similar in design to the above receive_packed_range. However, this version requires a Request and a Status.

The Status must be a positively tested Status for a message of this type (i.e. a message does exist). It should most likely be generated by Communicator::packed_range_probe.

◆ nonblocking_receive_packed_range() [3/3]

template<typename Context , typename OutputIter , typename T >
void TIMPI::Communicator::nonblocking_receive_packed_range ( const unsigned int  src_processor_id,
Context *  context,
OutputIter  out,
const T *  ,
Request req,
Status stat,
std::shared_ptr< std::vector< typename Packing< T >::buffer_type >> &  buffer,
const MessageTag tag 
) const
inline

Definition at line 1893 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), receive(), and TIMPI::Status::size().

1901 {
1902  // If they didn't pass in a buffer - let's make one
1903  if (buffer == nullptr)
1904  buffer = std::make_shared<std::vector<typename Packing<T>::buffer_type>>();
1905  else
1906  buffer->clear();
1907 
1908  // Receive serialized variable size objects as a sequence of
1909  // buffer_t.
1910  // Allocate a buffer on the heap so we don't have to free it until
1911  // after the Request::wait()
1912  buffer->resize(stat.size());
1913  this->receive(src_processor_id, *buffer, req, tag);
1914 
1915  // Make the Request::wait() handle unpacking the buffer
1916  req.add_post_wait_work
1917  (new PostWaitUnpackBuffer<std::vector<typename Packing<T>::buffer_type>, Context, OutputIter, T>(*buffer, context, out));
1918 
1919  // Make it dereference the shared pointer (possibly freeing the buffer)
1920  req.add_post_wait_work
1921  (new PostWaitDereferenceSharedPtr<std::vector<typename Packing<T>::buffer_type>>(buffer));
1922 }
Status receive(const unsigned int dest_processor_id, T &buf, const MessageTag &tag=any_tag) const
Blocking-receive from one processor with data-defined type.

◆ nonblocking_send_packed_range() [1/3]

template<typename Context , typename Iter >
void TIMPI::Communicator::nonblocking_send_packed_range ( const unsigned int  dest_processor_id,
const Context *  context,
Iter  range_begin,
const Iter  range_end,
Request req,
const MessageTag tag = no_tag 
) const
inline

Similar to the non-blocking send_packed_range above, with a few important differences:

  1. The total size of the packed buffer MUST be less than std::numeric_limits<int>::max()
  2. Only one message is generated
  3. On the receiving end the message should be tested for using Communicator::packed_range_probe()
  4. The message must be received by Communicator::nonblocking_receive_packed_range()

Definition at line 750 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::pack_range(), and send().

Referenced by TIMPI::push_parallel_packed_range().

756 {
757  // Allocate a buffer on the heap so we don't have to free it until
758  // after the Request::wait()
759  typedef typename std::iterator_traits<Iter>::value_type T;
760  typedef typename Packing<T>::buffer_type buffer_t;
761 
762  if (range_begin != range_end)
763  {
764  std::vector<buffer_t> * buffer = new std::vector<buffer_t>();
765 
766  range_begin =
767  pack_range(context,
768  range_begin,
769  range_end,
770  *buffer,
771  // MPI-2 can only use integers for size
772  std::numeric_limits<int>::max());
773 
774  if (range_begin != range_end)
775  timpi_error_msg("Non-blocking packed range sends cannot exceed " << std::numeric_limits<int>::max() << "in size");
776 
777  // Make the Request::wait() handle deleting the buffer
778  req.add_post_wait_work
779  (new PostWaitDeleteBuffer<std::vector<buffer_t>>
780  (buffer));
781 
782  // Non-blocking send of the buffer
783  this->send(dest_processor_id, *buffer, req, tag);
784  }
785 }
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
void send(const unsigned int dest_processor_id, const T &buf, const MessageTag &tag=no_tag) const
Blocking-send to one processor with data-defined type.

◆ nonblocking_send_packed_range() [2/3]

template<typename Context , typename Iter >
void TIMPI::Communicator::nonblocking_send_packed_range ( const unsigned int  dest_processor_id,
const Context *  context,
Iter  range_begin,
const Iter  range_end,
Request req,
std::shared_ptr< std::vector< typename TIMPI::Packing< typename std::iterator_traits< Iter >::value_type >::buffer_type >> &  buffer,
const MessageTag tag = no_tag 
) const
inline

Similar to the non-blocking send_packed_range above, with a few important differences:

  1. The total size of the packed buffer MUST be less than std::numeric_limits<int>::max()
  2. Only one message is generated
  3. On the receiving end the message should be tested for using Communicator::packed_range_probe()
  4. The message must be received by Communicator::nonblocking_receive_packed_range()

◆ nonblocking_send_packed_range() [3/3]

template<typename Context , typename Iter >
void TIMPI::Communicator::nonblocking_send_packed_range ( const unsigned int  dest_processor_id,
const Context *  context,
Iter  range_begin,
const Iter  range_end,
Request req,
std::shared_ptr< std::vector< typename Packing< typename std::iterator_traits< Iter >::value_type >::buffer_type >> &  buffer,
const MessageTag tag 
) const
inline

Definition at line 1543 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::pack_range(), and send().

1550 {
1551  // Allocate a buffer on the heap so we don't have to free it until
1552  // after the Request::wait()
1553  typedef typename std::iterator_traits<Iter>::value_type T;
1554  typedef typename Packing<T>::buffer_type buffer_t;
1555 
1556  if (range_begin != range_end)
1557  {
1558  if (buffer == nullptr)
1559  buffer = std::make_shared<std::vector<buffer_t>>();
1560  else
1561  buffer->clear();
1562 
1563  range_begin =
1564  pack_range(context,
1565  range_begin,
1566  range_end,
1567  *buffer,
1568  // MPI-2 can only use integers for size
1569  std::numeric_limits<int>::max());
1570 
1571  if (range_begin != range_end)
1572  timpi_error_msg("Non-blocking packed range sends cannot exceed " << std::numeric_limits<int>::max() << "in size");
1573 
1574  // Make it dereference the shared pointer (possibly freeing the buffer)
1575  req.add_post_wait_work
1576  (new PostWaitDereferenceSharedPtr<std::vector<buffer_t>>(buffer));
1577 
1578  // Non-blocking send of the buffer
1579  this->send(dest_processor_id, *buffer, req, tag);
1580  }
1581 }
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
void send(const unsigned int dest_processor_id, const T &buf, const MessageTag &tag=no_tag) const
Blocking-send to one processor with data-defined type.

◆ operator=() [1/3]

Communicator& TIMPI::Communicator::operator= ( const Communicator )
delete

◆ operator=() [2/3]

Communicator& TIMPI::Communicator::operator= ( Communicator &&  )
default

◆ operator=() [3/3]

Communicator & TIMPI::Communicator::operator= ( const communicator comm)

Definition at line 177 of file communicator.C.

References assign(), and clear().

178 {
179  this->clear();
180  this->assign(comm);
181  return *this;
182 }
void clear()
Free and reset this communicator.
Definition: communicator.C:162
void assign(const communicator &comm)
Utility function for setting our member variables from an MPI communicator.
Definition: communicator.C:185

◆ packed_range_probe()

template<typename T >
Status TIMPI::Communicator::packed_range_probe ( const unsigned int  src_processor_id,
const MessageTag tag,
bool &  flag 
) const
inline

Non-Blocking message probe for a packed range message.

Allows information about a message to be examined before the message is actually received.

Template type must match the object type that will be in the packed range

Parameters
src_processor_id  The processor the message is expected from, or TIMPI::any_source
tag  The message tag, or TIMPI::any_tag
flag  Output. True if a message exists, false otherwise.

Definition at line 4040 of file parallel_implementation.h.

References TIMPI::any_source, TIMPI::Status::get(), TIMPI::ignore(), size(), and TIMPI::MessageTag::value().

Referenced by TIMPI::push_parallel_packed_range().

4043 {
4044  TIMPI_LOG_SCOPE("packed_range_probe()", "Parallel");
4045 
4046  ignore(src_processor_id, tag); // unused in opt mode w/o MPI
4047 
4048  Status stat((StandardType<typename Packing<T>::buffer_type>()));
4049 
4050  int int_flag = 0;
4051 
4052  timpi_assert(src_processor_id < this->size() ||
4053  src_processor_id == any_source);
4054 
4055  timpi_call_mpi(MPI_Iprobe(int(src_processor_id),
4056  tag.value(),
4057  this->get(),
4058  &int_flag,
4059  stat.get()));
4060 
4061  flag = int_flag;
4062 
4063  return stat;
4064 }
void ignore(const Args &...)
Definition: timpi_assert.h:54
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
processor_id_type size() const
Definition: communicator.h:208

◆ packed_size_of()

template<typename T , typename A1 , typename A2 >
int TIMPI::Communicator::packed_size_of ( const std::vector< std::vector< T, A1 >, A2 > &  buf,
const DataType type 
) const
private

Definition at line 222 of file parallel_implementation.h.

References size().

224 {
225  // figure out how many bytes we need to pack all the data
226  int packedsize=0;
227 
228  // The outer buffer size
229  timpi_call_mpi
230  (MPI_Pack_size (1,
231  StandardType<unsigned int>(),
232  this->get(),
233  &packedsize));
234 
235  int sendsize = packedsize;
236 
237  const std::size_t n_vecs = buf.size();
238 
239  for (std::size_t i = 0; i != n_vecs; ++i)
240  {
241  // The size of the ith inner buffer
242  timpi_call_mpi
243  (MPI_Pack_size (1,
244  StandardType<unsigned int>(),
245  this->get(),
246  &packedsize));
247 
248  sendsize += packedsize;
249 
250  // The data for each inner buffer
251  timpi_call_mpi
252  (MPI_Pack_size (cast_int<int>(buf[i].size()), type,
253  this->get(), &packedsize));
254 
255  sendsize += packedsize;
256  }
257 
258  timpi_assert (sendsize /* should at least be 1! */);
259  return sendsize;
260 }
processor_id_type size() const
Definition: communicator.h:208

◆ possibly_receive() [1/6]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
bool TIMPI::Communicator::possibly_receive ( unsigned int &  src_processor_id,
std::vector< T, A > &  buf,
Request req,
const MessageTag tag 
) const
inline

Nonblocking-receive from one processor with user-defined type.

Checks whether a message can be received from src_processor_id. If so, starts a non-blocking receive using the passed-in request and returns true; otherwise, if there is no message to receive, returns false.

Note: buf does NOT need to be properly sized before this call; the buffer is resized automatically.

Parameters
src_processor_id  The pid to receive from, or "any"; will be set to the actual source received from
buf  The buffer to receive into
req  The request to use
tag  The tag to use

Definition at line 4070 of file parallel_implementation.h.

Referenced by possibly_receive(), and TIMPI::push_parallel_vector_data().

4074 {
4075  T * dataptr = buf.empty() ? nullptr : buf.data();
4076 
4077  return this->possibly_receive(src_processor_id, buf, StandardType<T>(dataptr), req, tag);
4078 }
bool possibly_receive(unsigned int &src_processor_id, std::vector< T, A > &buf, Request &req, const MessageTag &tag) const
Nonblocking-receive from one processor with user-defined type.

◆ possibly_receive() [2/6]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type = 0>
bool TIMPI::Communicator::possibly_receive ( unsigned int &  src_processor_id,
std::vector< T, A > &  buf,
Request req,
const MessageTag tag 
) const
inline

Dispatches to possibly_receive_packed_range.

Parameters
src_processor_id  The pid to receive from, or "any"; will be set to the actual source received from
buf  The buffer to receive into
req  The request to use
tag  The tag to use

◆ possibly_receive() [3/6]

template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type >
bool TIMPI::Communicator::possibly_receive ( unsigned int &  src_processor_id,
std::vector< T, A > &  buf,
const DataType type,
Request req,
const MessageTag tag 
) const
inline

Nonblocking-receive from one processor with user-defined type.

As above, but with manually-specified data type.

Parameters
src_processor_id  The pid to receive from, or "any"; will be set to the actual source received from
buf  The buffer to receive into
type  The intrinsic datatype to receive
req  The request to use
tag  The tag to use

Definition at line 1927 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::any_source, TIMPI::Status::get(), TIMPI::Request::get(), TIMPI::Status::size(), size(), TIMPI::Status::source(), and TIMPI::MessageTag::value().

1932 {
1933  TIMPI_LOG_SCOPE("possibly_receive()", "Parallel");
1934 
1935  Status stat(type);
1936 
1937  int int_flag = 0;
1938 
1939  timpi_assert(src_processor_id < this->size() ||
1940  src_processor_id == any_source);
1941 
1942  timpi_call_mpi(MPI_Iprobe(int(src_processor_id),
1943  tag.value(),
1944  this->get(),
1945  &int_flag,
1946  stat.get()));
1947 
1948  if (int_flag)
1949  {
1950  buf.resize(stat.size());
1951 
1952  src_processor_id = stat.source();
1953 
1954  timpi_call_mpi
1955  (MPI_Irecv (buf.data(),
1956  cast_int<int>(buf.size()),
1957  type,
1958  src_processor_id,
1959  tag.value(),
1960  this->get(),
1961  req.get()));
1962 
1963  // The MessageTag should stay registered for the Request lifetime
1964  req.add_post_wait_work
1965  (new PostWaitDereferenceTag(tag));
1966  }
1967 
1968  return int_flag;
1969 }
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
processor_id_type size() const
Definition: communicator.h:208

◆ possibly_receive() [4/6]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
bool TIMPI::Communicator::possibly_receive ( unsigned int &  src_processor_id,
std::vector< T, A > &  buf,
const NotADataType type,
Request req,
const MessageTag tag 
) const
inline

Nonblocking-receive from one processor with user-defined type.

Dispatches to possibly_receive_packed_range

Parameters
src_processor_id  The pid to receive from, or "any"; will be set to the actual source received from
buf  The buffer to receive into
type  The packable type to receive
req  The request to use
tag  The tag to use

Definition at line 1972 of file parallel_implementation.h.

References possibly_receive_packed_range().

1977 {
1978  TIMPI_LOG_SCOPE("possibly_receive()", "Parallel");
1979 
1980  return this->possibly_receive_packed_range(src_processor_id,
1981  (void *)(nullptr),
1982  std::inserter(buf, buf.end()),
1983  (T *)(nullptr),
1984  req,
1985  tag);
1986 }
bool possibly_receive_packed_range(unsigned int &src_processor_id, Context *context, OutputIter out, const T *output_type, Request &req, const MessageTag &tag) const
Nonblocking packed range receive from one processor with user-defined type.

◆ possibly_receive() [5/6]

template<typename T , typename A1 , typename A2 >
bool TIMPI::Communicator::possibly_receive ( unsigned int &  src_processor_id,
std::vector< std::vector< T, A1 >, A2 > &  buf,
const DataType type,
Request req,
const MessageTag tag 
) const
inline

Definition at line 1991 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::any_source, TIMPI::Status::get(), receive(), TIMPI::Status::size(), size(), TIMPI::Status::source(), and TIMPI::MessageTag::value().

1996 {
1997  TIMPI_LOG_SCOPE("possibly_receive()", "Parallel");
1998 
1999  Status stat(type);
2000 
2001  int int_flag = 0;
2002 
2003  timpi_assert(src_processor_id < this->size() ||
2004  src_processor_id == any_source);
2005 
2006  timpi_call_mpi(MPI_Iprobe(int(src_processor_id),
2007  tag.value(),
2008  this->get(),
2009  &int_flag,
2010  stat.get()));
2011 
2012  if (int_flag)
2013  {
2014  src_processor_id = stat.source();
2015 
2016  std::vector<char> * recvbuf =
2017  new std::vector<char>(stat.size(StandardType<char>()));
2018 
2019  this->receive(src_processor_id, *recvbuf, MPI_PACKED, req, tag);
2020 
2021  // When we wait on the receive, we'll unpack the temporary buffer
2022  req.add_post_wait_work
2023  (new PostWaitUnpackNestedBuffer<std::vector<std::vector<T,A1>,A2>>
2024  (*recvbuf, buf, type, *this));
2025 
2026  // And then we'll free the temporary buffer
2027  req.add_post_wait_work
2028  (new PostWaitDeleteBuffer<std::vector<char>>(recvbuf));
2029 
2030  // The MessageTag should stay registered for the Request lifetime
2031  req.add_post_wait_work
2032  (new PostWaitDereferenceTag(tag));
2033  }
2034 
2035  return int_flag;
2036 }
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
processor_id_type size() const
Definition: communicator.h:208
Status receive(const unsigned int dest_processor_id, T &buf, const MessageTag &tag=any_tag) const
Blocking-receive from one processor with data-defined type.

◆ possibly_receive() [6/6]

template<typename T , typename A1 , typename A2 >
bool TIMPI::Communicator::possibly_receive ( unsigned int &  src_processor_id,
std::vector< std::vector< T, A1 >, A2 > &  buf,
Request req,
const MessageTag tag 
) const
inline

Definition at line 4098 of file parallel_implementation.h.

References possibly_receive().

4102 {
4103  T * dataptr = buf.empty() ? nullptr : (buf[0].empty() ? nullptr : buf[0].data());
4104 
4105  return this->possibly_receive(src_processor_id, buf, StandardType<T>(dataptr), req, tag);
4106 }
bool possibly_receive(unsigned int &src_processor_id, std::vector< T, A > &buf, Request &req, const MessageTag &tag) const
Nonblocking-receive from one processor with user-defined type.

◆ possibly_receive_packed_range()

template<typename Context , typename OutputIter , typename T >
bool TIMPI::Communicator::possibly_receive_packed_range ( unsigned int &  src_processor_id,
Context *  context,
OutputIter  out,
const T *  output_type,
Request req,
const MessageTag tag 
) const
inline

Nonblocking packed range receive from one processor with user-defined type.

Checks whether a message can be received from src_processor_id. If so, starts a non-blocking packed range receive using the passed-in request and returns true; otherwise, if there is no message to receive, returns false.

void Parallel::unpack(const T *, OutputIter data, const Context *) is used to unserialize type T.

Parameters
src_processor_id  The pid to receive from, or "any"; will be set to the actual source received from
context  Context pointer that will be passed into the unpack functions
out  The output iterator
output_type  The intrinsic datatype to receive
req  The request to use
tag  The tag to use

Definition at line 4110 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::any_source, nonblocking_receive_packed_range(), and size().

Referenced by possibly_receive(), and TIMPI::push_parallel_packed_range().

4116 {
4117  TIMPI_LOG_SCOPE("possibly_receive_packed_range()", "Parallel");
4118 
4119  bool int_flag = 0;
4120 
4121  auto stat = packed_range_probe<T>(src_processor_id, tag, int_flag);
4122 
4123  if (int_flag)
4124  {
4125  src_processor_id = stat.source();
4126 
4127  nonblocking_receive_packed_range(src_processor_id,
4128  context,
4129  out,
4130  type,
4131  req,
4132  stat,
4133  tag);
4134 
4135  // The MessageTag should stay registered for the Request lifetime
4136  req.add_post_wait_work
4137  (new PostWaitDereferenceTag(tag));
4138  }
4139 
4140  timpi_assert(!int_flag || (int_flag &&
4141  src_processor_id < this->size() &&
4142  src_processor_id != any_source));
4143 
4144  return int_flag;
4145 }
void nonblocking_receive_packed_range(const unsigned int src_processor_id, Context *context, OutputIter out, const T *output_type, Request &req, Status &stat, const MessageTag &tag=any_tag) const
Non-Blocking-receive range-of-pointers from one processor.
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
processor_id_type size() const
Definition: communicator.h:208

◆ probe()

status TIMPI::Communicator::probe ( const unsigned int  src_processor_id,
const MessageTag tag = any_tag 
) const

Blocking message probe.

Allows information about a message to be examined before the message is actually received.

Definition at line 283 of file communicator.C.

References TIMPI::any_source, TIMPI::ignore(), size(), and TIMPI::MessageTag::value().

Referenced by TIMPI::pull_parallel_vector_data(), TIMPI::detail::push_parallel_alltoall_helper(), and receive().

285 {
286  TIMPI_LOG_SCOPE("probe()", "Communicator");
287 
288 #ifndef TIMPI_HAVE_MPI
289  timpi_not_implemented();
290  ignore(src_processor_id, tag);
291 #endif
292 
293  status stat;
294 
295  timpi_assert(src_processor_id < this->size() ||
296  src_processor_id == any_source);
297 
298  timpi_call_mpi
299  (MPI_Probe (int(src_processor_id), tag.value(), this->get(), &stat));
300 
301  return stat;
302 }
void ignore(const Args &...)
Definition: timpi_assert.h:54
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
MPI_Status status
Status object for querying messages.
Definition: status.h:44
processor_id_type size() const
Definition: communicator.h:208

◆ rank()

processor_id_type TIMPI::Communicator::rank ( ) const
inline

Definition at line 206 of file communicator.h.

References _rank.

Referenced by broadcast(), broadcast_packed_range(), TIMPI::detail::empty_send_assertion(), fill_data(), fill_scalar_data(), fill_vector_data(), gather(), map_broadcast(), maxloc(), minloc(), TIMPI::pull_parallel_vector_data(), TIMPI::detail::push_parallel_alltoall_helper(), TIMPI::detail::push_parallel_nbx_helper(), TIMPI::detail::push_parallel_roundrobin_helper(), TIMPI::report_here(), scatter(), send_receive(), send_receive_packed_range(), testAllGather(), testAllGatherHalfEmptyVectorString(), testAllGatherString(), testAllGatherVectorString(), testAllGatherVectorVector(), testAllGatherVectorVectorPacked(), testArrayContainerAllGather(), testBigUnion(), testBroadcast(), testBroadcastArrayType(), testBroadcastNestedType(), testBroadcastString(), testContainerAllGather(), testContainerBroadcast(), testContainerSendReceive(), testEmptyEntry(), testGather(), testGatherString(), testGatherString2(), testGettableStringAllGather(), testIrecvSend(), testIsendRecv(), testMapContainerAllGather(), testMapMap(), testMapMax(), testMapSet(), testMax(), testMaxloc(), testMaxlocBool(), testMaxlocDouble(), testMaxVecBool(), testMin(), testMinLarge(), testMinloc(), testMinlocBool(), testMinlocDouble(), testMinVecBool(), testMPIULongMin(), testNestingAllGather(), testNonblockingMax(), testNonblockingMin(), testNonblockingSum(), testNonblockingTest(), testNonblockingWaitany(), testNonFixedTypeMapMax(), testNonFixedTypeSum(), testNullAllGather(), testNullSendReceive(), testPackedSetUnion(), testPairContainerAllGather(), testPush(), testPushImpl(), testPushMove(), testPushMultimapImpl(), testPushMultimapVecVecImpl(), testPushPackedImpl(), testPushPackedImplMove(), testPushPackedNestedImpl(), testPushVecVecImpl(), testRecvIsendSets(), testRecvIsendVecVecs(), testScatter(), testSemiVerifyInf(), testSemiVerifyString(), testSemiVerifyType(), testSemiVerifyVector(), testSendRecvVecVecs(), testSplit(), testSplitByType(), testSum(), testSumOpFunction(), testTupleContainerAllGather(), 
testTupleStringAllGather(), testUnion(), testVectorOfContainersAllGather(), and testVectorOfContainersBroadcast().

206 { return _rank; }
processor_id_type _rank
Definition: communicator.h:230

◆ receive() [1/6]

template<typename T >
Status TIMPI::Communicator::receive ( const unsigned int  src_processor_id,
T &  buf,
const MessageTag tag = any_tag 
) const
inline

Blocking-receive from one processor with data-defined type.

We do not currently support receives on one processor without MPI.

Definition at line 833 of file parallel_implementation.h.

References TIMPI::any_source, TIMPI::Status::get(), probe(), size(), and TIMPI::MessageTag::value().

Referenced by nonblocking_receive_packed_range(), possibly_receive(), TIMPI::pull_parallel_vector_data(), TIMPI::push_parallel_vector_data(), receive(), receive_packed_range(), testIrecvSend(), testIsendRecv(), testRecvIsendSets(), testRecvIsendVecVecs(), and testSendRecvVecVecs().

836 {
837  TIMPI_LOG_SCOPE("receive()", "Parallel");
838 
839  // Get the status of the message, explicitly provide the
840  // datatype so we can later query the size
841  Status stat(this->probe(src_processor_id, tag), StandardType<T>(&buf));
842 
843  timpi_assert(src_processor_id < this->size() ||
844  src_processor_id == any_source);
845 
846  timpi_call_mpi
847  (MPI_Recv (&buf, 1, StandardType<T>(&buf), src_processor_id,
848  tag.value(), this->get(), stat.get()));
849 
850  return stat;
851 }
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
processor_id_type size() const
Definition: communicator.h:208
status probe(const unsigned int src_processor_id, const MessageTag &tag=any_tag) const
Blocking message probe.
Definition: communicator.C:283

◆ receive() [2/6]

template<typename T >
void TIMPI::Communicator::receive ( const unsigned int  src_processor_id,
T &  buf,
Request req,
const MessageTag tag = any_tag 
) const
inline

Nonblocking-receive from one processor with data-defined type.

Definition at line 856 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::any_source, TIMPI::Request::get(), size(), and TIMPI::MessageTag::value().

860 {
861  TIMPI_LOG_SCOPE("receive()", "Parallel");
862 
863  timpi_assert(src_processor_id < this->size() ||
864  src_processor_id == any_source);
865 
866  timpi_call_mpi
867  (MPI_Irecv (&buf, 1, StandardType<T>(&buf), src_processor_id,
868  tag.value(), this->get(), req.get()));
869 
870  // The MessageTag should stay registered for the Request lifetime
871  req.add_post_wait_work
872  (new PostWaitDereferenceTag(tag));
873 }
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
processor_id_type size() const
Definition: communicator.h:208

◆ receive() [3/6]

template<typename T >
Status TIMPI::Communicator::receive ( const unsigned int  src_processor_id,
T &  buf,
const DataType type,
const MessageTag tag = any_tag 
) const
inline

Blocking-receive from one processor with user-defined type.

If T is a container, container-of-containers, etc., then type should be the DataType of the underlying fixed-size entries in the container(s).

Definition at line 155 of file serial_implementation.h.

159 { timpi_not_implemented(); return Status(); }

◆ receive() [4/6]

template<typename T >
void TIMPI::Communicator::receive ( const unsigned int  src_processor_id,
T &  buf,
const DataType type,
Request req,
const MessageTag tag = any_tag 
) const
inline

Nonblocking-receive from one processor with user-defined type.

If T is a container, container-of-containers, etc., then type should be the DataType of the underlying fixed-size entries in the container(s).

Definition at line 162 of file serial_implementation.h.

167 { timpi_not_implemented(); }

◆ receive() [5/6]

template<typename T >
Status TIMPI::Communicator::receive ( const unsigned int  src_processor_id,
std::basic_string< T > &  buf,
const MessageTag tag 
) const
inline

Definition at line 789 of file parallel_implementation.h.

References receive().

792 {
793  std::vector<T> tempbuf; // Officially C++ won't let us get a
794  // modifiable array from a string
795 
796  Status stat = this->receive(src_processor_id, tempbuf, tag);
797  buf.assign(tempbuf.begin(), tempbuf.end());
798  return stat;
799 }
Status receive(const unsigned int src_processor_id, T &buf, const MessageTag &tag=any_tag) const
Blocking-receive from one processor with data-defined type.
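The vector-then-assign indirection above can be viewed on its own: data arrives in a modifiable std::vector buffer and is then copied into the user's string. A minimal standalone sketch of just that copy step, with a hypothetical helper name and no MPI involved:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Stand-in for the post-receive step in receive() [5/6]: the data
// was received into a modifiable vector (a string does not expose a
// modifiable array for MPI to write into), then assigned to the
// caller's string.
std::string buffer_to_string(const std::vector<char> & tempbuf)
{
  std::string buf;
  buf.assign(tempbuf.begin(), tempbuf.end());
  return buf;
}
```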

◆ receive() [6/6]

template<typename T >
void TIMPI::Communicator::receive ( const unsigned int  src_processor_id,
std::basic_string< T > &  buf,
Request req,
const MessageTag tag 
) const
inline

Definition at line 804 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), and receive().

808 {
809  // Officially C++ won't let us get a modifiable array from a
810  // string, and we can't even put one on the stack for the
811  // non-blocking case.
812  std::vector<T> * tempbuf = new std::vector<T>();
813 
814  // We can clear the string, but the Request::wait() will need to
815  // handle copying our temporary buffer to it
816  buf.clear();
817 
818  req.add_post_wait_work
819  (new PostWaitCopyBuffer<std::vector<T>,
820  std::back_insert_iterator<std::basic_string<T>>>
821  (tempbuf, std::back_inserter(buf)));
822 
823  // Make the Request::wait() then handle deleting the buffer
824  req.add_post_wait_work
825  (new PostWaitDeleteBuffer<std::vector<T>>(tempbuf));
826 
827  this->receive(src_processor_id, tempbuf, req, tag);
828 }
Status receive(const unsigned int src_processor_id, T &buf, const MessageTag &tag=any_tag) const
Blocking-receive from one processor with data-defined type.

◆ receive_packed_range()

template<typename Context , typename OutputIter , typename T >
void TIMPI::Communicator::receive_packed_range ( const unsigned int  src_processor_id,
Context *  context,
OutputIter  out,
const T *  output_type,
const MessageTag tag = any_tag 
) const
inline

Blocking-receive range-of-pointers from one processor.

This function does not receive raw pointers, but rather constructs new objects whose contents match the objects pointed to by the sender.

The objects will be of type T = iterator_traits<OutputIter>::value_type.

Using std::back_inserter as the output iterator allows receive to fill any container type. Using some null_output_iterator allows the receive to be dealt with solely by TIMPI::unpack(), for objects whose unpack() is written so as to not leak memory when used in this fashion.

A future version of this method should be created to preallocate memory when receiving vectors...

void TIMPI::unpack(vector<int>::iterator in, T ** out, Context *) is used to unserialize type T, typically into a new heap-allocated object whose pointer is returned as *out.

unsigned int TIMPI::packed_size(const T *, vector<int>::const_iterator) is used to advance to the beginning of the next object's data.

Definition at line 1198 of file parallel_implementation.h.

References receive(), TIMPI::Status::source(), TIMPI::Status::tag(), and TIMPI::unpack_range().

Referenced by send_receive_packed_range().

1203 {
1204  typedef typename Packing<T>::buffer_type buffer_t;
1205 
1206  // Receive serialized variable size objects as sequences of buffer_t
1207  std::size_t total_buffer_size = 0;
1208  Status stat = this->receive(src_processor_id, total_buffer_size, tag);
1209 
1210  // Use stat.source() and stat.tag() in subsequent receives - if
1211  // src_processor_id or tag is "any" then we want to be sure that
1212  // the messages we receive all correspond to the same send.
1213 
1214  std::size_t received_buffer_size = 0;
1215  while (received_buffer_size < total_buffer_size)
1216  {
1217  std::vector<buffer_t> buffer;
1218  this->receive(stat.source(), buffer, MessageTag(stat.tag()));
1219  received_buffer_size += buffer.size();
1220  unpack_range
1221  (buffer, context, out_iter, output_type);
1222  }
1223 }
void unpack_range(const std::vector< buffertype > &buffer, Context *context, OutputIter out_iter, const T *)
Helper function for range unpacking.
Definition: packing.h:1092
Status receive(const unsigned int src_processor_id, T &buf, const MessageTag &tag=any_tag) const
Blocking-receive from one processor with data-defined type.
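The size-announcement-then-chunks protocol above can be modeled without MPI. In this sketch the hypothetical next_chunk() callback stands in for one blocking receive of a buffer chunk, and unpack() for unpack_range():

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of receive_packed_range()'s loop: keep receiving buffer
// chunks until their combined size reaches the announced total,
// unpacking each chunk as it arrives.
template <typename NextChunk, typename Unpack>
std::size_t drain_chunks(std::size_t total_buffer_size,
                         NextChunk next_chunk, Unpack unpack)
{
  std::size_t received_buffer_size = 0;
  while (received_buffer_size < total_buffer_size)
    {
      std::vector<int> buffer = next_chunk();
      received_buffer_size += buffer.size();
      unpack(buffer);
    }
  return received_buffer_size;
}
```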

◆ reference_unique_tag()

void TIMPI::Communicator::reference_unique_tag ( int  tagvalue) const

Reference an already-acquired tag, so that we know it will be dereferenced multiple times before we can re-release it.

Definition at line 37 of file communicator.C.

References used_tag_values.

Referenced by TIMPI::MessageTag::MessageTag(), and TIMPI::MessageTag::operator=().

38 {
39  // This had better be an already-acquired tag.
40  timpi_assert(used_tag_values.count(tagvalue));
41 
42  used_tag_values[tagvalue]++;
43 }
std::map< int, unsigned int > used_tag_values
Definition: communicator.h:236
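The implementation above is a plain reference-count increment on the tag map. A standalone sketch of the same bookkeeping, with a hypothetical free function in place of the member:

```cpp
#include <cassert>
#include <map>

// Mirrors reference_unique_tag(): bump the use count of an
// already-acquired tag so it is not re-released while a MessageTag
// copy still refers to it.
void reference_tag(std::map<int, unsigned int> & used_tag_values,
                   int tagvalue)
{
  // This had better be an already-acquired tag.
  assert(used_tag_values.count(tagvalue));

  used_tag_values[tagvalue]++;
}
```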

◆ scatter() [1/4]

template<typename T , typename A >
void TIMPI::Communicator::scatter ( const std::vector< T, A > &  data,
T &  recv,
const unsigned int  root_id = 0 
) const
inline

Take a vector of local variables and scatter the ith item to the ith processor in the communicator.

The result is saved into recv.

Definition at line 3443 of file parallel_implementation.h.

References TIMPI::ignore(), rank(), and size().

Referenced by scatter(), and testScatter().

3446 {
3447  ignore(root_id); // Only needed for MPI and/or dbg/devel
3448  timpi_assert_less (root_id, this->size());
3449 
3450  // Do not allow the root_id to scatter a nullptr vector.
3451  // That would leave recv in an indeterminate state.
3452  timpi_assert (this->rank() != root_id || this->size() == data.size());
3453 
3454  if (this->size() == 1)
3455  {
3456  timpi_assert (!this->rank());
3457  timpi_assert (!root_id);
3458  recv = data[0];
3459  return;
3460  }
3461 
3462  TIMPI_LOG_SCOPE("scatter()", "Parallel");
3463 
3464  T * data_ptr = const_cast<T*>(data.empty() ? nullptr : data.data());
3465  ignore(data_ptr); // unused ifndef TIMPI_HAVE_MPI
3466 
3467  timpi_assert_less(root_id, this->size());
3468 
3469  timpi_call_mpi
3470  (MPI_Scatter (data_ptr, 1, StandardType<T>(data_ptr),
3471  &recv, 1, StandardType<T>(&recv), root_id, this->get()));
3472 }
processor_id_type rank() const
Definition: communicator.h:206
void ignore(const Args &...)
Definition: timpi_assert.h:54
processor_id_type size() const
Definition: communicator.h:208

◆ scatter() [2/4]

template<typename T , typename A >
void TIMPI::Communicator::scatter ( const std::vector< T, A > &  data,
std::vector< T, A > &  recv,
const unsigned int  root_id = 0 
) const
inline

Take a vector of local variables and scatter the ith equal-sized chunk to the ith processor in the communicator.

The data size must be a multiple of the communicator size. The result is saved into the recv buffer, which does not have to be sized prior to this operation.

Definition at line 3477 of file parallel_implementation.h.

References broadcast(), TIMPI::ignore(), rank(), and size().

3480 {
3481  timpi_assert_less (root_id, this->size());
3482 
3483  if (this->size() == 1)
3484  {
3485  timpi_assert (!this->rank());
3486  timpi_assert (!root_id);
3487  recv.assign(data.begin(), data.end());
3488  return;
3489  }
3490 
3491  TIMPI_LOG_SCOPE("scatter()", "Parallel");
3492 
3493  int recv_buffer_size = 0;
3494  if (this->rank() == root_id)
3495  {
3496  timpi_assert(data.size() % this->size() == 0);
3497  recv_buffer_size = cast_int<int>(data.size() / this->size());
3498  }
3499 
3500  this->broadcast(recv_buffer_size);
3501  recv.resize(recv_buffer_size);
3502 
3503  T * data_ptr = const_cast<T*>(data.empty() ? nullptr : data.data());
3504  T * recv_ptr = recv.empty() ? nullptr : recv.data();
3505  ignore(data_ptr, recv_ptr); // unused ifndef TIMPI_HAVE_MPI
3506 
3507  timpi_assert_less(root_id, this->size());
3508 
3509  timpi_call_mpi
3510  (MPI_Scatter (data_ptr, recv_buffer_size, StandardType<T>(data_ptr),
3511  recv_ptr, recv_buffer_size, StandardType<T>(recv_ptr), root_id, this->get()));
3512 }
processor_id_type rank() const
Definition: communicator.h:206
void ignore(const Args &...)
Definition: timpi_assert.h:54
processor_id_type size() const
Definition: communicator.h:208
void broadcast(T &data, const unsigned int root_id=0, const bool identical_sizes=false) const
Take a local value and broadcast it to all processors.
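The root-side sizing logic above can be sketched in isolation. This hypothetical helper (no MPI) slices out the chunk a given rank would receive, assuming, as the assertion in the real code requires, that the data size divides evenly by the communicator size:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Models the equal-chunk scatter(): each rank gets a contiguous
// chunk of data.size() / comm_size entries, in rank order.
std::vector<int> chunk_for_rank(const std::vector<int> & data,
                                unsigned int comm_size,
                                unsigned int rank)
{
  assert(data.size() % comm_size == 0);
  const std::size_t recv_buffer_size = data.size() / comm_size;
  return std::vector<int>(data.begin() + rank * recv_buffer_size,
                          data.begin() + (rank + 1) * recv_buffer_size);
}
```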

◆ scatter() [3/4]

template<typename T , typename A1 , typename A2 >
void TIMPI::Communicator::scatter ( const std::vector< T, A1 > &  data,
const std::vector< int, A2 >  counts,
std::vector< T, A1 > &  recv,
const unsigned int  root_id = 0 
) const
inline

Take a vector of local variables and scatter the ith variable-sized chunk to the ith processor in the communicator.

The counts vector should contain the number of items for each processor. The result is saved into the recv buffer, which does not have to be sized prior to this operation.

Definition at line 3517 of file parallel_implementation.h.

References TIMPI::ignore(), rank(), scatter(), and size().

3521 {
3522  timpi_assert_less (root_id, this->size());
3523 
3524  if (this->size() == 1)
3525  {
3526  timpi_assert (!this->rank());
3527  timpi_assert (!root_id);
3528  timpi_assert (counts.size() == this->size());
3529  recv.assign(data.begin(), data.begin() + counts[0]);
3530  return;
3531  }
3532 
3533  std::vector<int,A2> displacements(this->size(), 0);
3534  if (root_id == this->rank())
3535  {
3536  timpi_assert(counts.size() == this->size());
3537 
3538  // Create a displacements vector from the incoming counts vector
3539  unsigned int globalsize = 0;
3540  for (unsigned int i=0; i < this->size(); ++i)
3541  {
3542  displacements[i] = globalsize;
3543  globalsize += counts[i];
3544  }
3545 
3546  timpi_assert(data.size() == globalsize);
3547  }
3548 
3549  TIMPI_LOG_SCOPE("scatter()", "Parallel");
3550 
3551  // Scatter the buffer sizes to size remote buffers
3552  int recv_buffer_size = 0;
3553  this->scatter(counts, recv_buffer_size, root_id);
3554  recv.resize(recv_buffer_size);
3555 
3556  T * data_ptr = const_cast<T*>(data.empty() ? nullptr : data.data());
3557  int * count_ptr = const_cast<int*>(counts.empty() ? nullptr : counts.data());
3558  T * recv_ptr = recv.empty() ? nullptr : recv.data();
3559  ignore(data_ptr, count_ptr, recv_ptr); // unused ifndef TIMPI_HAVE_MPI
3560 
3561  timpi_assert_less(root_id, this->size());
3562 
3563  // Scatter the non-uniform chunks
3564  timpi_call_mpi
3565  (MPI_Scatterv (data_ptr, count_ptr, displacements.data(), StandardType<T>(data_ptr),
3566  recv_ptr, recv_buffer_size, StandardType<T>(recv_ptr), root_id, this->get()));
3567 }
void scatter(const std::vector< T, A > &data, T &recv, const unsigned int root_id=0) const
Take a vector of local variables and scatter the ith item to the ith processor in the communicator...
processor_id_type rank() const
Definition: communicator.h:206
void ignore(const Args &...)
Definition: timpi_assert.h:54
processor_id_type size() const
Definition: communicator.h:208
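The displacement computation performed on the root above is a running prefix sum over the counts vector; displacement i tells MPI_Scatterv where rank i's chunk begins in the stacked buffer. A standalone sketch of just that step:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mirrors the root-side setup in the variable-chunk scatter():
// displacements[i] is the sum of all earlier counts.
std::vector<int> counts_to_displacements(const std::vector<int> & counts)
{
  std::vector<int> displacements(counts.size(), 0);
  int globalsize = 0;
  for (std::size_t i = 0; i < counts.size(); ++i)
    {
      displacements[i] = globalsize;
      globalsize += counts[i];
    }
  return displacements;
}
```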

◆ scatter() [4/4]

template<typename T , typename A1 , typename A2 >
void TIMPI::Communicator::scatter ( const std::vector< std::vector< T, A1 >, A2 > &  data,
std::vector< T, A1 > &  recv,
const unsigned int  root_id = 0,
const bool  identical_buffer_sizes = false 
) const
inline

Take a vector of vectors and scatter the ith inner vector to the ith processor in the communicator.

The result is saved into the recv buffer, which does not have to be sized prior to this operation.

Definition at line 3572 of file parallel_implementation.h.

References rank(), scatter(), and size().

3576 {
3577  timpi_assert_less (root_id, this->size());
3578 
3579  if (this->size() == 1)
3580  {
3581  timpi_assert (!this->rank());
3582  timpi_assert (!root_id);
3583  timpi_assert (data.size() == this->size());
3584  recv.assign(data[0].begin(), data[0].end());
3585  return;
3586  }
3587 
3588  std::vector<T,A1> stacked_data;
3589  std::vector<int> counts;
3590 
3591  if (root_id == this->rank())
3592  {
3593  timpi_assert (data.size() == this->size());
3594 
3595  if (!identical_buffer_sizes)
3596  counts.resize(this->size());
3597 
3598  for (std::size_t i=0; i < data.size(); ++i)
3599  {
3600  if (!identical_buffer_sizes)
3601  counts[i] = cast_int<int>(data[i].size());
3602 #ifndef NDEBUG
3603  else
3604  // Check that buffer sizes are indeed equal
3605  timpi_assert(!i || data[i-1].size() == data[i].size());
3606 #endif
3607  std::copy(data[i].begin(), data[i].end(), std::back_inserter(stacked_data));
3608  }
3609  }
3610 
3611  if (identical_buffer_sizes)
3612  this->scatter(stacked_data, recv, root_id);
3613  else
3614  this->scatter(stacked_data, counts, recv, root_id);
3615 }
void scatter(const std::vector< T, A > &data, T &recv, const unsigned int root_id=0) const
Take a vector of local variables and scatter the ith item to the ith processor in the communicator...
processor_id_type rank() const
Definition: communicator.h:206
processor_id_type size() const
Definition: communicator.h:208
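The root-side flattening above, which stacks the inner vectors end-to-end and records each one's length before delegating to the variable-chunk overload, can be sketched on its own with a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Mirrors the non-identical-buffer-sizes branch of scatter() [4/4]:
// stack the inner vectors and record their lengths as counts.
void flatten(const std::vector<std::vector<int>> & data,
             std::vector<int> & stacked_data,
             std::vector<int> & counts)
{
  counts.resize(data.size());
  for (std::size_t i = 0; i < data.size(); ++i)
    {
      counts[i] = static_cast<int>(data[i].size());
      std::copy(data[i].begin(), data[i].end(),
                std::back_inserter(stacked_data));
    }
}
```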

◆ semiverify()

template<typename T >
bool TIMPI::Communicator::semiverify ( const T *  r) const
inline

Verify that a local pointer points to the same value on all processors where it is not nullptr.

Containers must have the same value in every entry.

Definition at line 2071 of file parallel_implementation.h.

References max(), min(), TIMPI::Attributes< T >::set_highest(), TIMPI::Attributes< T >::set_lowest(), and size().

Referenced by testSemiVerifyInf(), testSemiVerifyString(), testSemiVerifyType(), and testSemiVerifyVector().

2072 {
2073  if (this->size() > 1 && Attributes<T>::has_min_max == true)
2074  {
2075  T tempmin, tempmax;
2076  if (r)
2077  tempmin = tempmax = *r;
2078  else
2079  {
2080  Attributes<T>::set_highest(tempmin);
2081  Attributes<T>::set_lowest(tempmax);
2082  }
2083  this->min(tempmin);
2084  this->max(tempmax);
2085  bool invalid = r && ((*r != tempmin) ||
2086  (*r != tempmax));
2087  this->max(invalid);
2088  return !invalid;
2089  }
2090 
2091  static_assert(Attributes<T>::has_min_max,
2092  "Tried to semiverify an unverifiable type");
2093 
2094  return true;
2095 }
static void set_lowest(T &)
Definition: attributes.h:49
processor_id_type size() const
Definition: communicator.h:208
void min(const T &r, T &o, Request &req) const
Non-blocking minimum of the local value r into o with the request req.
static const bool has_min_max
Definition: attributes.h:48
static void set_highest(T &)
Definition: attributes.h:50
void max(const T &r, T &o, Request &req) const
Non-blocking maximum of the local value r into o with the request req.
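The min/max reduction trick above can be modeled on a single node. This sketch simulates the per-rank contributions, with sentinels playing the role of set_highest()/set_lowest() for nullptr entries: the values semi-agree iff every non-null contribution equals both the global min and the global max.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

// Single-node model of semiverify(): each entry of rank_values is
// one "rank"'s pointer; nullptr ranks contribute sentinels that can
// never win the min/max reductions.
bool semiverify_model(const std::vector<const int *> & rank_values)
{
  int tempmin = std::numeric_limits<int>::max();  // like set_highest()
  int tempmax = std::numeric_limits<int>::min();  // like set_lowest()
  for (const int * r : rank_values)
    if (r)
      {
        tempmin = std::min(tempmin, *r);  // min() reduction
        tempmax = std::max(tempmax, *r);  // max() reduction
      }
  bool invalid = false;                   // max() over invalid flags
  for (const int * r : rank_values)
    if (r && (*r != tempmin || *r != tempmax))
      invalid = true;
  return !invalid;
}
```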

◆ send() [1/7]

template<typename T >
void TIMPI::Communicator::send ( const unsigned int  dest_processor_id,
const std::basic_string< T > &  buf,
const MessageTag tag 
) const
inline

Definition at line 264 of file parallel_implementation.h.

References send_mode(), size(), SYNCHRONOUS, and TIMPI::MessageTag::value().

267 {
268  TIMPI_LOG_SCOPE("send()", "Parallel");
269 
270  T * dataptr = buf.empty() ? nullptr : const_cast<T *>(buf.data());
271 
272  timpi_assert_less(dest_processor_id, this->size());
273 
274  timpi_call_mpi
275  (((this->send_mode() == SYNCHRONOUS) ?
276  MPI_Ssend : MPI_Send) (dataptr,
277  cast_int<int>(buf.size()),
278  StandardType<T>(dataptr),
279  dest_processor_id,
280  tag.value(),
281  this->get()));
282 }
processor_id_type size() const
Definition: communicator.h:208
SendMode send_mode() const
Gets the user-requested SendMode.
Definition: communicator.h:331

◆ send() [2/7]

template<typename T >
void TIMPI::Communicator::send ( const unsigned int  dest_processor_id,
const std::basic_string< T > &  buf,
Request req,
const MessageTag tag 
) const
inline

Definition at line 287 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::Request::get(), send_mode(), size(), SYNCHRONOUS, and TIMPI::MessageTag::value().

291 {
292  TIMPI_LOG_SCOPE("send()", "Parallel");
293 
294  T * dataptr = buf.empty() ? nullptr : const_cast<T *>(buf.data());
295 
296  timpi_assert_less(dest_processor_id, this->size());
297 
298  timpi_call_mpi
299  (((this->send_mode() == SYNCHRONOUS) ?
300  MPI_Issend : MPI_Isend) (dataptr,
301  cast_int<int>(buf.size()),
302  StandardType<T>(dataptr),
303  dest_processor_id,
304  tag.value(),
305  this->get(),
306  req.get()));
307 
308  // The MessageTag should stay registered for the Request lifetime
309  req.add_post_wait_work
310  (new PostWaitDereferenceTag(tag));
311 }
processor_id_type size() const
Definition: communicator.h:208
SendMode send_mode() const
Gets the user-requested SendMode.
Definition: communicator.h:331

◆ send() [3/7]

template<typename T >
void TIMPI::Communicator::send ( const unsigned int  dest_processor_id,
const T &  buf,
const MessageTag tag = no_tag 
) const
inline

Blocking-send to one processor with data-defined type.

We do not currently support sends on one processor without MPI.

Definition at line 316 of file parallel_implementation.h.

References send_mode(), size(), SYNCHRONOUS, and TIMPI::MessageTag::value().

Referenced by nonblocking_send_packed_range(), TIMPI::pull_parallel_vector_data(), TIMPI::push_parallel_vector_data(), send_packed_range(), send_receive(), testIrecvSend(), testIsendRecv(), testRecvIsendSets(), testRecvIsendVecVecs(), and testSendRecvVecVecs().

319 {
320  TIMPI_LOG_SCOPE("send()", "Parallel");
321 
322  T * dataptr = const_cast<T*> (&buf);
323 
324  timpi_assert_less(dest_processor_id, this->size());
325 
326  timpi_call_mpi
327  (((this->send_mode() == SYNCHRONOUS) ?
328  MPI_Ssend : MPI_Send) (dataptr,
329  1,
330  StandardType<T>(dataptr),
331  dest_processor_id,
332  tag.value(),
333  this->get()));
334 }
processor_id_type size() const
Definition: communicator.h:208
SendMode send_mode() const
Gets the user-requested SendMode.
Definition: communicator.h:331

◆ send() [4/7]

template<typename T >
void TIMPI::Communicator::send ( const unsigned int  dest_processor_id,
const T &  buf,
Request req,
const MessageTag tag = no_tag 
) const
inline

Nonblocking-send to one processor with data-defined type.

Definition at line 339 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::Request::get(), send_mode(), size(), SYNCHRONOUS, and TIMPI::MessageTag::value().

343 {
344  TIMPI_LOG_SCOPE("send()", "Parallel");
345 
346  T * dataptr = const_cast<T*>(&buf);
347 
348  timpi_assert_less(dest_processor_id, this->size());
349 
350  timpi_call_mpi
351  (((this->send_mode() == SYNCHRONOUS) ?
352  MPI_Issend : MPI_Isend) (dataptr,
353  1,
354  StandardType<T>(dataptr),
355  dest_processor_id,
356  tag.value(),
357  this->get(),
358  req.get()));
359 
360  // The MessageTag should stay registered for the Request lifetime
361  req.add_post_wait_work
362  (new PostWaitDereferenceTag(tag));
363 }
processor_id_type size() const
Definition: communicator.h:208
SendMode send_mode() const
Gets the user-requested SendMode.
Definition: communicator.h:331

◆ send() [5/7]

template<typename T >
void TIMPI::Communicator::send ( const unsigned int  dest_processor_id,
const T &  buf,
const DataType type,
const MessageTag tag = no_tag 
) const
inline

Blocking-send to one processor with user-defined type.

If T is a container, container-of-containers, etc., then type should be the DataType of the underlying fixed-size entries in the container(s).

Definition at line 78 of file serial_implementation.h.

82 { timpi_not_implemented(); }

◆ send() [6/7]

template<typename T >
void TIMPI::Communicator::send ( const unsigned int  dest_processor_id,
const T &  buf,
const DataType type,
Request req,
const MessageTag tag = no_tag 
) const
inline

Nonblocking-send to one processor with user-defined type.

If T is a container, container-of-containers, etc., then type should be the DataType of the underlying fixed-size entries in the container(s).

Definition at line 85 of file serial_implementation.h.

90 { timpi_not_implemented(); }

◆ send() [7/7]

template<typename T >
void TIMPI::Communicator::send ( const unsigned int  dest_processor_id,
const T &  buf,
const NotADataType type,
Request req,
const MessageTag tag = no_tag 
) const
inline

Nonblocking-send to one processor with user-defined packable type.

Packing<T> must be defined for T

Definition at line 93 of file serial_implementation.h.

98 { timpi_not_implemented(); }

◆ send_mode() [1/2]

void TIMPI::Communicator::send_mode ( const SendMode  sm)
inline

Explicitly sets the SendMode type used for send operations.

Definition at line 326 of file communicator.h.

References _send_mode.

Referenced by duplicate(), TIMPI::detail::push_parallel_nbx_helper(), split(), split_by_type(), testIrecvSend(), and testIsendRecv().

326 { _send_mode = sm; }

◆ send_mode() [2/2]

SendMode TIMPI::Communicator::send_mode ( ) const
inline

Gets the user-requested SendMode.

Definition at line 331 of file communicator.h.

References _send_mode.

Referenced by duplicate(), send(), split(), and split_by_type().

331 { return _send_mode; }

◆ send_packed_range() [1/2]

template<typename Context , typename Iter >
void TIMPI::Communicator::send_packed_range ( const unsigned int  dest_processor_id,
const Context *  context,
Iter  range_begin,
const Iter  range_end,
const MessageTag tag = no_tag,
std::size_t  approx_buffer_size = 1000000 
) const
inline

Blocking-send range-of-pointers to one processor.

This function does not send the raw pointers, but rather constructs new objects at the other end whose contents match the objects pointed to by the sender.

void TIMPI::pack(const T *, vector<int> & data, const Context *) is used to serialize type T onto the end of a data vector.

unsigned int TIMPI::packable_size(const T *, const Context *) is used to allow data vectors to reserve memory and to perform additional error checking.

The approximate maximum size (in entries; number of bytes will likely be 4x or 8x larger) to use in a single data vector buffer can be specified for performance or memory usage reasons; if the range cannot be packed into a single buffer of this size then multiple buffers and messages will be used.

Definition at line 625 of file parallel_implementation.h.

References TIMPI::pack_range(), TIMPI::packed_range_size(), and send().

Referenced by send_receive_packed_range().

631 {
632  // We will serialize variable size objects from *range_begin to
633  // *range_end as a sequence of plain data (e.g. ints) in this buffer
634  typedef typename std::iterator_traits<Iter>::value_type T;
635 
636  std::size_t total_buffer_size =
637  packed_range_size (context, range_begin, range_end);
638 
639  this->send(dest_processor_id, total_buffer_size, tag);
640 
641 #ifdef DEBUG
642  std::size_t used_buffer_size = 0;
643 #endif
644 
645  while (range_begin != range_end)
646  {
647  timpi_assert_greater (std::distance(range_begin, range_end), 0);
648 
649  std::vector<typename Packing<T>::buffer_type> buffer;
650 
651  const Iter next_range_begin = pack_range
652  (context, range_begin, range_end, buffer, approx_buffer_size);
653 
654  timpi_assert_greater (std::distance(range_begin, next_range_begin), 0);
655 
656  range_begin = next_range_begin;
657 
658 #ifdef DEBUG
659  used_buffer_size += buffer.size();
660 #endif
661 
662  // Blocking send of the buffer
663  this->send(dest_processor_id, buffer, tag);
664  }
665 
666 #ifdef DEBUG
667  timpi_assert_equal_to(used_buffer_size, total_buffer_size);
668 #endif
669 }
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
void send(const unsigned int dest_processor_id, const T &buf, const MessageTag &tag=no_tag) const
Blocking-send to one processor with data-defined type.
std::size_t packed_range_size(const Context *context, Iter range_begin, const Iter range_end)
Helper function for range packing.
Definition: packing.h:1016
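The buffer-capped loop above can be modeled without MPI. This sketch counts how many messages the loop would send given per-entry packed sizes and the approximate buffer cap; it is a simplified model of the loop's bookkeeping (the real pack_range() packing rule may differ in detail), but it preserves the invariant that at least one entry is packed per buffer and every entry is sent exactly once:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of send_packed_range(): one message per buffer, each buffer
// filled until the approximate cap would be exceeded.
std::size_t count_messages(const std::vector<std::size_t> & entry_sizes,
                           std::size_t approx_buffer_size)
{
  std::size_t messages = 0, i = 0;
  while (i < entry_sizes.size())
    {
      std::size_t buffer_size = 0;
      // Always pack at least one entry, then stop once the next
      // entry would push us past the cap.
      do
        buffer_size += entry_sizes[i++];
      while (i < entry_sizes.size() &&
             buffer_size + entry_sizes[i] <= approx_buffer_size);
      ++messages;
    }
  return messages;
}
```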

◆ send_packed_range() [2/2]

template<typename Context , typename Iter >
void TIMPI::Communicator::send_packed_range ( const unsigned int  dest_processor_id,
const Context *  context,
Iter  range_begin,
const Iter  range_end,
Request req,
const MessageTag tag = no_tag,
std::size_t  approx_buffer_size = 1000000 
) const
inline

Nonblocking-send range-of-pointers to one processor.

This function does not send the raw pointers, but rather constructs new objects at the other end whose contents match the objects pointed to by the sender.

void TIMPI::pack(const T *, vector<int> & data, const Context *) is used to serialize type T onto the end of a data vector.

unsigned int TIMPI::packable_size(const T *, const Context *) is used to allow data vectors to reserve memory and to perform additional error checking.

The approximate maximum size (in entries; number of bytes will likely be 4x or 8x larger) to use in a single data vector buffer can be specified for performance or memory usage reasons; if the range cannot be packed into a single buffer of this size then multiple buffers and messages will be used.

Definition at line 673 of file parallel_implementation.h.

References TIMPI::Request::add_post_wait_work(), TIMPI::Request::add_prior_request(), TIMPI::pack_range(), TIMPI::packed_range_size(), and send().

680 {
681  // Allocate a buffer on the heap so we don't have to free it until
682  // after the Request::wait()
683  typedef typename std::iterator_traits<Iter>::value_type T;
684  typedef typename Packing<T>::buffer_type buffer_t;
685 
686  std::size_t total_buffer_size =
687  packed_range_size (context, range_begin, range_end);
688 
689  // That local variable will be gone soon; we need a send buffer that
690  // will stick around. I heard you like buffering so I put a buffer
691  // for your buffer size so you can buffer the size of your buffer.
692  std::size_t * total_buffer_size_buffer = new std::size_t;
693  *total_buffer_size_buffer = total_buffer_size;
694 
695  // Delete the buffer size's buffer when we're done
696  Request intermediate_req = request();
697  intermediate_req.add_post_wait_work
698  (new PostWaitDeleteBuffer<std::size_t>(total_buffer_size_buffer));
699  this->send(dest_processor_id, *total_buffer_size_buffer, intermediate_req, tag);
700 
701  // And don't finish up the full request until we're done with its
702  // dependencies
703  req.add_prior_request(intermediate_req);
704 
705 #ifdef DEBUG
706  std::size_t used_buffer_size = 0;
707 #endif
708 
709  while (range_begin != range_end)
710  {
711  timpi_assert_greater (std::distance(range_begin, range_end), 0);
712 
713  std::vector<buffer_t> * buffer = new std::vector<buffer_t>();
714 
715  const Iter next_range_begin = pack_range
716  (context, range_begin, range_end, *buffer, approx_buffer_size);
717 
718  timpi_assert_greater (std::distance(range_begin, next_range_begin), 0);
719 
720  range_begin = next_range_begin;
721 
722 #ifdef DEBUG
723  used_buffer_size += buffer->size();
724 #endif
725 
726  Request next_intermediate_req;
727 
728  Request * my_req = (range_begin == range_end) ? &req : &next_intermediate_req;
729 
730  // Make the Request::wait() handle deleting the buffer
731  my_req->add_post_wait_work
732  (new PostWaitDeleteBuffer<std::vector<buffer_t>>
733  (buffer));
734 
735  // Non-blocking send of the buffer
736  this->send(dest_processor_id, *buffer, *my_req, tag);
737 
738  if (range_begin != range_end)
739  req.add_prior_request(*my_req);
740  }
741 }
MPI_Request request
Request object for non-blocking I/O.
Definition: request.h:41
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
void send(const unsigned int dest_processor_id, const T &buf, const MessageTag &tag=no_tag) const
Blocking-send to one processor with data-defined type.
std::size_t packed_range_size(const Context *context, Iter range_begin, const Iter range_end)
Helper function for range packing.
Definition: packing.h:1016
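The send loop above splits a range into successive buffers of roughly approx_buffer_size each, always packing at least one object per buffer so the loop makes progress. A minimal sketch of that greedy chunking policy in plain C++, with no MPI and a hypothetical helper name:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for pack_range's chunking behavior: greedily pack
// item sizes into buffers until each buffer reaches approx_buffer_size.
// Every buffer receives at least one item, so the loop always terminates.
std::vector<std::vector<std::size_t>>
chunk_by_size(const std::vector<std::size_t> & item_sizes,
              std::size_t approx_buffer_size)
{
  std::vector<std::vector<std::size_t>> buffers;
  std::size_t i = 0;
  while (i < item_sizes.size())
    {
      std::vector<std::size_t> buffer;
      std::size_t used = 0;
      // Mirror the send loop: stop filling once the buffer is "big enough"
      do
        {
          buffer.push_back(item_sizes[i]);
          used += item_sizes[i];
          ++i;
        }
      while (i < item_sizes.size() && used < approx_buffer_size);
      buffers.push_back(std::move(buffer));
    }
  return buffers;
}
```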

◆ send_receive() [1/7]

template<typename T1 , typename T2 , typename std::enable_if< std::is_base_of< DataType, StandardType< T1 >>::value &&std::is_base_of< DataType, StandardType< T2 >>::value, int >::type >
void TIMPI::Communicator::send_receive ( const unsigned int  timpi_dbg_var(send_tgt),
const T1 &  send_val,
const unsigned int  timpi_dbg_var(recv_source),
T2 &  recv_val,
const MessageTag &,
const MessageTag & 
) const
inline

Send-receive data from one processor.

Definition at line 189 of file serial_implementation.h.

195 {
196  timpi_assert_equal_to (send_tgt, 0);
197  timpi_assert_equal_to (recv_source, 0);
198  recv_val = send_val;
199 }

◆ send_receive() [2/7]

template<typename T , typename A , typename std::enable_if< Has_buffer_type< Packing< T >>::value, int >::type >
void TIMPI::Communicator::send_receive ( const unsigned int  timpi_dbg_var(dest_processor_id),
const std::vector< T, A > &  send,
const unsigned int  timpi_dbg_var(source_processor_id),
std::vector< T, A > &  recv,
const MessageTag &,
const MessageTag & 
) const
inline

Definition at line 238 of file serial_implementation.h.

References send().

244 {
245  timpi_assert_equal_to (dest_processor_id, 0);
246  timpi_assert_equal_to (source_processor_id, 0);
247  recv = send;
248 }
void send(const unsigned int dest_processor_id, const T &buf, const MessageTag &tag=no_tag) const
Blocking-send to one processor with data-defined type.

◆ send_receive() [3/7]

template<typename T , typename A1 , typename A2 >
void TIMPI::Communicator::send_receive ( const unsigned int  timpi_dbg_var(dest_processor_id),
const std::vector< std::vector< T, A1 >, A2 > &  send,
const unsigned int  timpi_dbg_var(source_processor_id),
std::vector< std::vector< T, A1 >, A2 > &  recv,
const MessageTag &,
const MessageTag & 
) const
inline

Definition at line 297 of file serial_implementation.h.

References send().

303 {
304  timpi_assert_equal_to (dest_processor_id, 0);
305  timpi_assert_equal_to (source_processor_id, 0);
306  recv = send;
307 }
void send(const unsigned int dest_processor_id, const T &buf, const MessageTag &tag=no_tag) const
Blocking-send to one processor with data-defined type.

◆ send_receive() [4/7]

template<typename T1 , typename T2 , typename std::enable_if< std::is_base_of< DataType, StandardType< T1 >>::value &&std::is_base_of< DataType, StandardType< T2 >>::value, int >::type >
void TIMPI::Communicator::send_receive ( const unsigned int  dest_processor_id,
const T1 &  send_data,
const unsigned int  source_processor_id,
T2 &  recv_data,
const MessageTag send_tag = no_tag,
const MessageTag recv_tag = any_tag 
) const
inline

Send data send to one processor while simultaneously receiving other data recv from a (potentially different) processor.

This overload is defined for fixed-size data; other overloads exist for many other categories.

Definition at line 1362 of file parallel_implementation.h.

References TIMPI::any_source, rank(), size(), and TIMPI::MessageTag::value().

Referenced by TIMPI::push_parallel_vector_data().

1368 {
1369  TIMPI_LOG_SCOPE("send_receive()", "Parallel");
1370 
1371  if (dest_processor_id == this->rank() &&
1372  source_processor_id == this->rank())
1373  {
1374  recv = sendvec;
1375  return;
1376  }
1377 
1378  timpi_assert_less(dest_processor_id, this->size());
1379  timpi_assert(source_processor_id < this->size() ||
1380  source_processor_id == any_source);
1381 
1382  // MPI_STATUS_IGNORE is from MPI-2; using it with some versions of
1383  // MPICH may cause a crash:
1384  // https://bugzilla.mcs.anl.gov/globus/show_bug.cgi?id=1798
1385  timpi_call_mpi
1386  (MPI_Sendrecv(const_cast<T1*>(&sendvec), 1, StandardType<T1>(&sendvec),
1387  dest_processor_id, send_tag.value(), &recv, 1,
1388  StandardType<T2>(&recv), source_processor_id,
1389  recv_tag.value(), this->get(), MPI_STATUS_IGNORE));
1390 }
processor_id_type rank() const
Definition: communicator.h:206
const unsigned int any_source
Processor id meaning "Accept from any source".
Definition: communicator.h:82
processor_id_type size() const
Definition: communicator.h:208
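A common use of send_receive is a ring exchange, where each rank sends to its right neighbor and receives from its left neighbor. The neighbor arithmetic is sketched below; `ring_neighbors` is a hypothetical helper, and the TIMPI call appears only as a comment:

```cpp
#include <utility>

// Hypothetical helper: compute (dest, source) ranks for a ring exchange.
// With these neighbors, comm.send_receive(dest, send_val, source, recv_val)
// would shift one value per rank around the ring.
std::pair<unsigned int, unsigned int>
ring_neighbors(unsigned int rank, unsigned int size)
{
  const unsigned int dest   = (rank + 1) % size;        // right neighbor
  const unsigned int source = (rank + size - 1) % size; // left neighbor
  return {dest, source};
}
```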

◆ send_receive() [5/7]

template<typename T1 , typename T2 >
void TIMPI::Communicator::send_receive ( const unsigned int  dest_processor_id,
const T1 &  send_data,
const DataType type1,
const unsigned int  source_processor_id,
T2 &  recv_data,
const DataType type2,
const MessageTag send_tag = no_tag,
const MessageTag recv_tag = any_tag 
) const
inline

Send data send to one processor while simultaneously receiving other data recv from a (potentially different) processor, using a user-specified MPI Dataype.

◆ send_receive() [6/7]

template<typename T , typename A , typename std::enable_if< std::is_base_of< DataType, StandardType< T >>::value, int >::type >
void TIMPI::Communicator::send_receive ( const unsigned int  dest_processor_id,
const std::vector< T, A > &  send_data,
const unsigned int  source_processor_id,
std::vector< T, A > &  recv_data,
const MessageTag send_tag,
const MessageTag recv_tag 
) const
inline

Definition at line 1341 of file parallel_implementation.h.

References send_receive_packed_range().

1347 {
1348  this->send_receive_packed_range(dest_processor_id, (void *)(nullptr),
1349  send_data.begin(), send_data.end(),
1350  source_processor_id, (void *)(nullptr),
1351  std::back_inserter(recv_data),
1352  (const T *)(nullptr),
1353  send_tag, recv_tag);
1354 }
void send_receive_packed_range(const unsigned int dest_processor_id, const Context1 *context1, RangeIter send_begin, const RangeIter send_end, const unsigned int source_processor_id, Context2 *context2, OutputIter out, const T *output_type, const MessageTag &send_tag=no_tag, const MessageTag &recv_tag=any_tag, std::size_t approx_buffer_size=1000000) const
Send a range-of-pointers to one processor while simultaneously receiving another range from a (potent...

◆ send_receive() [7/7]

template<typename T , typename A1 , typename A2 >
void TIMPI::Communicator::send_receive ( const unsigned int  dest_processor_id,
const std::vector< std::vector< T, A1 >, A2 > &  sendvec,
const unsigned int  source_processor_id,
std::vector< std::vector< T, A1 >, A2 > &  recv,
const MessageTag send_tag,
const MessageTag recv_tag 
) const
inline

Definition at line 1474 of file parallel_implementation.h.

1480 {
1481  send_receive_vec_of_vec
1482  (dest_processor_id, sendvec, source_processor_id, recv,
1483  send_tag, recv_tag, *this);
1484 }

◆ send_receive_packed_range() [1/2]

template<typename Context1 , typename RangeIter , typename Context2 , typename OutputIter , typename T >
void TIMPI::Communicator::send_receive_packed_range ( const unsigned int  timpi_dbg_var(dest_processor_id),
const Context1 *  context1,
RangeIter  send_begin,
const RangeIter  send_end,
const unsigned int  timpi_dbg_var(source_processor_id),
Context2 *  context2,
OutputIter  out_iter,
const T *  output_type,
const MessageTag &,
const MessageTag &,
std::size_t  
) const
inline

Send-receive range-of-pointers from one processor.

If you call this without MPI you might be making a mistake, but we'll support it.

Definition at line 322 of file serial_implementation.h.

References TIMPI::pack_range(), and TIMPI::unpack_range().

333 {
334  // This makes no sense on one processor unless we're deliberately
335  // sending to ourself.
336  timpi_assert_equal_to(dest_processor_id, 0);
337  timpi_assert_equal_to(source_processor_id, 0);
338 
339  // On one processor, we just need to pack the range and then unpack
340  // it again.
341  typedef typename std::iterator_traits<RangeIter>::value_type T1;
342  typedef typename Packing<T1>::buffer_type buffer_t;
343 
344  while (send_begin != send_end)
345  {
346  timpi_assert_greater (std::distance(send_begin, send_end), 0);
347 
348  // We will serialize variable size objects from *range_begin to
349  // *range_end as a sequence of ints in this buffer
350  std::vector<buffer_t> buffer;
351 
352  const RangeIter next_send_begin = pack_range
353  (context1, send_begin, send_end, buffer);
354 
355  timpi_assert_greater (std::distance(send_begin, next_send_begin), 0);
356 
357  send_begin = next_send_begin;
358 
359  unpack_range
360  (buffer, context2, out_iter, output_type);
361  }
362 }
void unpack_range(const std::vector< buffertype > &buffer, Context *context, OutputIter out_iter, const T *)
Helper function for range unpacking.
Definition: packing.h:1092
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037

◆ send_receive_packed_range() [2/2]

template<typename Context1 , typename RangeIter , typename Context2 , typename OutputIter , typename T >
void TIMPI::Communicator::send_receive_packed_range ( const unsigned int  dest_processor_id,
const Context1 *  context1,
RangeIter  send_begin,
const RangeIter  send_end,
const unsigned int  source_processor_id,
Context2 *  context2,
OutputIter  out,
const T *  output_type,
const MessageTag send_tag = no_tag,
const MessageTag recv_tag = any_tag,
std::size_t  approx_buffer_size = 1000000 
) const
inline

Send a range-of-pointers to one processor while simultaneously receiving another range from a (potentially different) processor.

This function does not send or receive raw pointers, but rather constructs new objects at each receiver whose contents match the objects pointed to by the sender.

The objects being sent will be of type T1 = iterator_traits<RangeIter>::value_type, and the objects being received will be of type T2 = iterator_traits<OutputIter>::value_type

void TIMPI::pack(const T1*, vector<int> & data, const Context1*) is used to serialize type T1 onto the end of a data vector.

Using std::back_inserter as the output iterator allows send_receive to fill any container type. Using some null_output_iterator allows the receive to be dealt with solely by TIMPI::unpack(), for objects whose unpack() is written so as to not leak memory when used in this fashion.

A future version of this method should be created to preallocate memory when receiving vectors...

void TIMPI::unpack(vector<int>::iterator in, T2** out, Context *) is used to unserialize type T2, typically into a new heap-allocated object whose pointer is returned as *out.

unsigned int TIMPI::packable_size(const T1*, const Context1*) is used to allow data vectors to reserve memory, and for additional error checking.

unsigned int TIMPI::packed_size(const T2*, vector<int>::const_iterator) is used to advance to the beginning of the next object's data.

Definition at line 1492 of file parallel_implementation.h.

References TIMPI::pack_range(), rank(), receive_packed_range(), send_packed_range(), TIMPI::unpack_range(), and TIMPI::Request::wait().

Referenced by TIMPI::push_parallel_packed_range(), send_receive(), testContainerSendReceive(), and testNullSendReceive().

1503 {
1504  TIMPI_LOG_SCOPE("send_receive()", "Parallel");
1505 
1506  timpi_assert_equal_to
1507  ((dest_processor_id == this->rank()),
1508  (source_processor_id == this->rank()));
1509 
1510  if (dest_processor_id == this->rank() &&
1511  source_processor_id == this->rank())
1512  {
1513  // We need to pack and unpack, even if we don't need to
1514  // communicate the buffer, just in case user Packing
1515  // specializations have side effects
1516 
1517  typedef typename Packing<T>::buffer_type buffer_t;
1518  while (send_begin != send_end)
1519  {
1520  std::vector<buffer_t> buffer;
1521  send_begin = pack_range
1522  (context1, send_begin, send_end, buffer, approx_buffer_size);
1523  unpack_range
1524  (buffer, context2, out_iter, output_type);
1525  }
1526  return;
1527  }
1528 
1529  Request req;
1530 
1531  this->send_packed_range (dest_processor_id, context1, send_begin, send_end,
1532  req, send_tag, approx_buffer_size);
1533 
1534  this->receive_packed_range (source_processor_id, context2, out_iter,
1535  output_type, recv_tag);
1536 
1537  req.wait();
1538 }
processor_id_type rank() const
Definition: communicator.h:206
void unpack_range(const std::vector< buffertype > &buffer, Context *context, OutputIter out_iter, const T *)
Helper function for range unpacking.
Definition: packing.h:1092
void send_packed_range(const unsigned int dest_processor_id, const Context *context, Iter range_begin, const Iter range_end, const MessageTag &tag=no_tag, std::size_t approx_buffer_size=1000000) const
Blocking-send range-of-pointers to one processor.
void receive_packed_range(const unsigned int dest_processor_id, Context *context, OutputIter out, const T *output_type, const MessageTag &tag=any_tag) const
Blocking-receive range-of-pointers from one processor.
Iter pack_range(const Context *context, Iter range_begin, const Iter range_end, std::vector< buffertype > &buffer, std::size_t approx_buffer_size)
Helper function for range packing.
Definition: packing.h:1037
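The pack/unpack contract described above (serialize variable-size objects into a flat buffer, then reconstruct equivalent objects on the receiving side) can be sketched for std::string with a hand-rolled length-prefixed buffer. The function names here are illustrative, not TIMPI's Packing API:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative length-prefixed packing: each string is stored as
// [length, c0, c1, ...] in a flat int buffer, mimicking the roles of
// TIMPI::pack / TIMPI::unpack for variable-size objects.
std::vector<int> pack_strings(const std::vector<std::string> & in)
{
  std::vector<int> buffer;
  for (const auto & s : in)
    {
      buffer.push_back(static_cast<int>(s.size())); // packable_size analogue
      for (char c : s)
        buffer.push_back(static_cast<int>(c));
    }
  return buffer;
}

std::vector<std::string> unpack_strings(const std::vector<int> & buffer)
{
  std::vector<std::string> out;
  std::size_t i = 0;
  while (i < buffer.size())
    {
      // The stored length plays the role of packed_size, advancing the
      // cursor to the beginning of the next object's data.
      const std::size_t len = static_cast<std::size_t>(buffer[i++]);
      std::string s;
      for (std::size_t j = 0; j != len; ++j)
        s.push_back(static_cast<char>(buffer[i++]));
      out.push_back(std::move(s));
    }
  return out;
}
```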

◆ set_union() [1/2]

template<typename T >
void TIMPI::Communicator::set_union ( T &  data,
const unsigned int  root_id 
) const
inline

Take a container (set, map, unordered_set, multimap, etc) of local variables on each processor, and collect their union over all processors, replacing the original on processor 0.

If the data is a map or unordered_map and entries exist on different processors with the same key and different values, then the value with the lowest processor id takes precedence.

Referenced by testBigUnion(), testMapMap(), testMapSet(), testPackedSetUnion(), and testUnion().

◆ set_union() [2/2]

template<typename T >
void TIMPI::Communicator::set_union ( T &  data) const
inline

Take a container of local variables on each processor, and replace it with their union over all processors, replacing the original on all processors.
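The "lowest processor id takes precedence" rule for map unions can be reproduced locally with std::map::insert, which never overwrites an existing key. A minimal serial sketch that merges per-rank maps in rank order (no MPI involved; names are illustrative):

```cpp
#include <map>
#include <string>
#include <vector>

// Simulate set_union's precedence rule: merge per-rank maps in rank order.
// std::map::insert ignores duplicate keys, so for a key present on several
// ranks the lowest rank's value wins, matching the documented behavior.
std::map<int, std::string>
union_in_rank_order(const std::vector<std::map<int, std::string>> & per_rank)
{
  std::map<int, std::string> result;
  for (const auto & m : per_rank)        // rank 0 first, then rank 1, ...
    result.insert(m.begin(), m.end());   // existing keys are not overwritten
  return result;
}
```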

◆ size()

processor_id_type TIMPI::Communicator::size ( ) const
inline

Definition at line 208 of file communicator.h.

References _size.

Referenced by allgather(), alltoall(), barrier(), broadcast(), broadcast_packed_range(), gather(), map_broadcast(), map_max(), map_sum(), max(), maxloc(), min(), minloc(), nonblocking_barrier(), packed_range_probe(), packed_size_of(), possibly_receive(), possibly_receive_packed_range(), probe(), TIMPI::pull_parallel_vector_data(), TIMPI::detail::push_parallel_alltoall_helper(), TIMPI::detail::push_parallel_nbx_helper(), TIMPI::detail::push_parallel_roundrobin_helper(), receive(), scatter(), semiverify(), send(), send_receive(), sum(), testAllGatherString(), testArrayContainerAllGather(), testBigUnion(), testContainerAllGather(), testContainerSendReceive(), testEmptyEntry(), testGettableStringAllGather(), testIrecvSend(), testIsendRecv(), testMapContainerAllGather(), testMapMax(), testMax(), testMaxloc(), testMaxlocBool(), testMaxlocDouble(), testMaxVecBool(), testMinloc(), testMinlocBool(), testMinlocDouble(), testMinVecBool(), testNestingAllGather(), testNonblockingMax(), testNonblockingSum(), testNonFixedTypeMapMax(), testNonFixedTypeSum(), testNullSendReceive(), testPackedSetUnion(), testPairContainerAllGather(), testPull(), testPullImpl(), testPullOversized(), testPullPacked(), testPullVecVec(), testPullVecVecImpl(), testPullVecVecOversized(), testPush(), testPushImpl(), testPushMove(), testPushMultimap(), testPushMultimapImpl(), testPushMultimapOversized(), testPushMultimapVecVec(), testPushMultimapVecVecImpl(), testPushMultimapVecVecOversized(), testPushOversized(), testPushPacked(), testPushPackedImpl(), testPushPackedImplMove(), testPushPackedMove(), testPushPackedMoveOversized(), testPushPackedNestedImpl(), testPushPackedOversized(), testPushVecVec(), testPushVecVecImpl(), testPushVecVecOversized(), testRecvIsendSets(), testRecvIsendVecVecs(), testScatter(), testSendRecvVecVecs(), testSplit(), testSplitByType(), testSum(), testSumOpFunction(), testTupleContainerAllGather(), testTupleStringAllGather(), testUnion(), testVectorOfContainersAllGather(), 
testVectorOfContainersBroadcast(), and verify().

208 { return _size; }
processor_id_type _size
Definition: communicator.h:230

◆ split()

void TIMPI::Communicator::split ( int  color,
int  key,
Communicator target 
) const

Definition at line 97 of file communicator.C.

References _I_duped_it, assign(), clear(), send_mode(), and sync_type().

Referenced by testSplit().

98 {
99  target.clear();
100  MPI_Comm newcomm;
101  timpi_call_mpi
102  (MPI_Comm_split(this->get(), color, key, &newcomm));
103 
104  target.assign(newcomm);
105  target._I_duped_it = (color != MPI_UNDEFINED);
106  target.send_mode(this->send_mode());
107  target.sync_type(this->sync_type());
108 }
SyncType sync_type() const
Gets the user-requested SyncType.
Definition: communicator.h:348
SendMode send_mode() const
Gets the user-requested SendMode.
Definition: communicator.h:331
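MPI_Comm_split groups ranks by color and orders each group's new ranks by key, with ties broken by old rank. A serial sketch of that rank-assignment rule (plain C++, no MPI; the helper is hypothetical):

```cpp
#include <algorithm>
#include <map>
#include <utility>
#include <vector>

// Compute each rank's new rank within its color group, following
// MPI_Comm_split semantics: members of a group are ordered by (key, rank).
std::map<int, int>
split_new_ranks(const std::vector<std::pair<int, int>> & color_key_by_rank)
{
  std::map<int, std::vector<std::pair<int, int>>> groups; // color -> (key, rank)
  for (int rank = 0; rank != (int)color_key_by_rank.size(); ++rank)
    groups[color_key_by_rank[rank].first]
        .push_back({color_key_by_rank[rank].second, rank});

  std::map<int, int> new_rank; // old rank -> rank in its new communicator
  for (auto & entry : groups)
    {
      auto & members = entry.second;
      std::sort(members.begin(), members.end()); // sort by (key, old rank)
      for (int i = 0; i != (int)members.size(); ++i)
        new_rank[members[i].second] = i;
    }
  return new_rank;
}
```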

◆ split_by_type()

void TIMPI::Communicator::split_by_type ( int  split_type,
int  key,
info  i,
Communicator target 
) const

Definition at line 111 of file communicator.C.

References _I_duped_it, assign(), clear(), send_mode(), and sync_type().

Referenced by testSplitByType().

112 {
113  target.clear();
114  MPI_Comm newcomm;
115  timpi_call_mpi
116  (MPI_Comm_split_type(this->get(), split_type, key, i, &newcomm));
117 
118  target.assign(newcomm);
119  target._I_duped_it = (split_type != MPI_UNDEFINED);
120  target.send_mode(this->send_mode());
121  target.sync_type(this->sync_type());
122 }
SyncType sync_type() const
Gets the user-requested SyncType.
Definition: communicator.h:348
SendMode send_mode() const
Gets the user-requested SendMode.
Definition: communicator.h:331

◆ sum() [1/4]

template<typename T >
void TIMPI::Communicator::sum ( T &  r) const
inline

Take a local variable and replace it with the sum of its values on all processors.

Containers are replaced element-wise.

Referenced by testNonblockingSum(), testNonFixedTypeSum(), testSum(), and testSumOpFunction().
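"Containers are replaced element-wise" means a vector argument ends up holding per-index sums across all ranks. A serial simulation of that semantics (hypothetical helper, no MPI; local vectors are assumed equal length):

```cpp
#include <cstddef>
#include <vector>

// Simulate sum() on a std::vector argument: element i of the result is
// the sum of element i across every rank's local vector.
std::vector<double>
elementwise_sum(const std::vector<std::vector<double>> & per_rank)
{
  std::vector<double> total(per_rank.empty() ? 0 : per_rank[0].size(), 0.0);
  for (const auto & local : per_rank)
    for (std::size_t i = 0; i != total.size(); ++i)
      total[i] += local[i];
  return total;
}
```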

◆ sum() [2/4]

template<typename T >
void TIMPI::Communicator::sum ( const T &  r,
T &  o,
Request req 
) const
inline

Non-blocking sum of the local value r into o with the request req.

Definition at line 2619 of file parallel_implementation.h.

References TIMPI::Request::get(), TIMPI::Request::null_request, and size().

2622 {
2623  if (this->size() > 1)
2624  {
2625  TIMPI_LOG_SCOPE("sum()", "Parallel");
2626 
2627  timpi_call_mpi
2628  (MPI_Iallreduce (&r, &o, 1,
2629  StandardType<T>(&r),
2630  OpFunction<T>::sum(),
2631  this->get(),
2632  req.get()));
2633  }
2634  else
2635  {
2636  o = r;
2637  req = Request::null_request;
2638  }
2639 }
static const request null_request
Definition: request.h:111
processor_id_type size() const
Definition: communicator.h:208

◆ sum() [3/4]

template<typename T >
void TIMPI::Communicator::sum ( T &  timpi_mpi_var(r)) const
inline

Definition at line 2643 of file parallel_implementation.h.

References size().

2644 {
2645  if (this->size() > 1)
2646  {
2647  TIMPI_LOG_SCOPE("sum()", "Parallel");
2648 
2649  timpi_call_mpi
2650  (MPI_Allreduce (MPI_IN_PLACE, &r, 1,
2651  StandardType<T>(&r),
2652  OpFunction<T>::sum(),
2653  this->get()));
2654  }
2655 }
processor_id_type size() const
Definition: communicator.h:208

◆ sum() [4/4]

template<typename T >
void TIMPI::Communicator::sum ( std::complex< T > &  timpi_mpi_var(r)) const
inline

Definition at line 2680 of file parallel_implementation.h.

References size().

2681 {
2682  if (this->size() > 1)
2683  {
2684  TIMPI_LOG_SCOPE("sum()", "Parallel");
2685 
2686  timpi_call_mpi
2687  (MPI_Allreduce (MPI_IN_PLACE, &r, 2,
2688  StandardType<T>(),
2689  OpFunction<T>::sum(),
2690  this->get()));
2691  }
2692 }
processor_id_type size() const
Definition: communicator.h:208

◆ sync_type() [1/3]

void TIMPI::Communicator::sync_type ( const SyncType  st)
inline

◆ sync_type() [2/3]

void TIMPI::Communicator::sync_type ( const std::string &  st)

Sets the sync type used for sync operations via a string.

Useful for changing the sync type via a CLI arg or parameter.

Definition at line 455 of file communicator.C.

References ALLTOALL_COUNTS, NBX, SENDRECEIVE, and sync_type().

456 {
457  SyncType type = NBX;
458  if (st == "sendreceive")
459  type = SENDRECEIVE;
460  else if (st == "alltoall")
461  type = ALLTOALL_COUNTS;
462  else if (st != "nbx")
463  timpi_error_msg("Unrecognized TIMPI sync type " << st);
464  this->sync_type(type);
465 }
SyncType sync_type() const
Gets the user-requested SyncType.
Definition: communicator.h:348
SyncType
What algorithm to use for parallel synchronization?
Definition: communicator.h:218
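The string-to-SyncType mapping in the listing above is easy to exercise on its own. This standalone sketch mirrors the listed logic using a local enum (a stand-in, not TIMPI's) and throws instead of calling timpi_error_msg:

```cpp
#include <stdexcept>
#include <string>

// Local mirror of TIMPI::Communicator::SyncType, for illustration only.
enum SyncType { NBX, ALLTOALL_COUNTS, SENDRECEIVE };

// Mirrors sync_type(const std::string &): unknown names are an error.
SyncType parse_sync_type(const std::string & st)
{
  if (st == "sendreceive")
    return SENDRECEIVE;
  if (st == "alltoall")
    return ALLTOALL_COUNTS;
  if (st == "nbx")
    return NBX;
  throw std::invalid_argument("Unrecognized TIMPI sync type " + st);
}
```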

◆ sync_type() [3/3]

SyncType TIMPI::Communicator::sync_type ( ) const
inline

Gets the user-requested SyncType.

Definition at line 348 of file communicator.h.

References _sync_type.

Referenced by duplicate(), split(), split_by_type(), and sync_type().

348 { return _sync_type; }

◆ verify()

template<typename T >
bool TIMPI::Communicator::verify ( const T &  r) const
inline

Verify that a local variable has the same value on all processors.

Containers must have the same value in every entry.

Definition at line 2051 of file parallel_implementation.h.

References max(), min(), and size().

Referenced by allgather(), alltoall(), broadcast(), map_broadcast(), max(), maxloc(), min(), and minloc().

2052 {
2053  if (this->size() > 1 && Attributes<T>::has_min_max == true)
2054  {
2055  T tempmin = r, tempmax = r;
2056  this->min(tempmin);
2057  this->max(tempmax);
2058  bool verified = (r == tempmin) &&
2059  (r == tempmax);
2060  this->min(verified);
2061  return verified;
2062  }
2063 
2064  static_assert(Attributes<T>::has_min_max,
2065  "Tried to verify an unverifiable type");
2066 
2067  return true;
2068 }
processor_id_type size() const
Definition: communicator.h:208
void min(const T &r, T &o, Request &req) const
Non-blocking minimum of the local value r into o with the request req.
static const bool has_min_max
Definition: attributes.h:48
void max(const T &r, T &o, Request &req) const
Non-blocking maximum of the local value r into o with the request req.
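verify()'s min/max trick relies on a simple fact: a value is globally uniform if and only if it equals both the global minimum and the global maximum. That check can be reproduced serially over a vector of simulated per-rank values:

```cpp
#include <algorithm>
#include <vector>

// Serial analogue of verify(): the value is "verified" iff the minimum
// and maximum across all ranks' values coincide, i.e. all values agree.
bool verify_uniform(const std::vector<int> & per_rank_values)
{
  if (per_rank_values.empty())
    return true;
  const auto [mn, mx] = std::minmax_element(per_rank_values.begin(),
                                            per_rank_values.end());
  return *mn == *mx;
}
```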

Member Data Documentation

◆ _communicator

communicator TIMPI::Communicator::_communicator
private

Definition at line 229 of file communicator.h.

Referenced by assign(), clear(), duplicate(), and get().

◆ _I_duped_it

bool TIMPI::Communicator::_I_duped_it
private

Definition at line 242 of file communicator.h.

Referenced by clear(), duplicate(), split(), and split_by_type().

◆ _max_tag

int TIMPI::Communicator::_max_tag
private

Definition at line 239 of file communicator.h.

Referenced by assign(), and get_unique_tag().

◆ _next_tag

int TIMPI::Communicator::_next_tag
mutableprivate

Definition at line 237 of file communicator.h.

Referenced by assign(), and get_unique_tag().

◆ _rank

processor_id_type TIMPI::Communicator::_rank
private

Definition at line 230 of file communicator.h.

Referenced by assign(), and rank().

◆ _send_mode

SendMode TIMPI::Communicator::_send_mode
private

Definition at line 231 of file communicator.h.

Referenced by send_mode().

◆ _size

processor_id_type TIMPI::Communicator::_size
private

Definition at line 230 of file communicator.h.

Referenced by assign(), and size().

◆ _sync_type

SyncType TIMPI::Communicator::_sync_type
private

Definition at line 232 of file communicator.h.

Referenced by sync_type().

◆ used_tag_values

std::map<int, unsigned int> TIMPI::Communicator::used_tag_values
mutableprivate

Definition at line 236 of file communicator.h.

Referenced by dereference_unique_tag(), get_unique_tag(), and reference_unique_tag().


The documentation for this class was generated from the following files: