[mvapich-discuss] Creating an intra-communicator

Hari Subramoni subramoni.1 at osu.edu
Thu Sep 4 06:40:25 EDT 2014


Hi David,

Apologies for the delay in getting back to you on this. This appears to be
the expected behavior for this code; we tried it with MPICH and observed the
same result. Is it causing any problems for your application?
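
For what it's worth, the value printed with %x is the process-local
communicator handle, not the communicator's context id, and MPI does not
require handle values to match across processes. If you want to confirm that
every rank ended up with an equivalent communicator, MPI_Comm_compare is a
portable check. Below is a minimal sketch (file and variable names are only
illustrative), assuming the same construction as in your code:

/* check_comm.c -- verify that a communicator created from the full
 * group of MPI_COMM_WORLD is congruent with MPI_COMM_WORLD on every rank */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char *argv[])
{
    MPI_Comm new_comm;
    MPI_Group world_group;
    int rank, result;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* same construction as in the reported code */
    MPI_Comm_group(MPI_COMM_WORLD, &world_group);
    MPI_Comm_create(MPI_COMM_WORLD, world_group, &new_comm);

    /* MPI_CONGRUENT means the same group and rank order but a different
     * context, regardless of what the raw handle prints as */
    MPI_Comm_compare(MPI_COMM_WORLD, new_comm, &result);
    printf("rank %d: comparison with MPI_COMM_WORLD is %s\n", rank,
           result == MPI_CONGRUENT ? "MPI_CONGRUENT" : "not congruent");

    MPI_Group_free(&world_group);
    MPI_Comm_free(&new_comm);
    MPI_Finalize();
    return 0;
}

Every rank should report MPI_CONGRUENT here even though the raw handle values
differ, since the new communicator covers the same group in the same order
but has its own context.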

Thanks,
Hari.

On Thursday, August 28, 2014, David Winslow <
david.winslow at serendipitynow.com> wrote:

> Thanks. Below is the information requested.
>
> We are running on 20 servers with 2 processes per server (40 processes in
> total).
>
> hostfile looks like:
> hostname1:2
> hostname2:2
> ...
> hostname19:2
> hostname20:2
>
> The code in main:
>
> int main(int argc, char *argv[])
> {
>     int numprocs, my_rank = 0;
>
>     /* mpi_lib is set elsewhere to the name of the MPI library in use */
>     if (strcmp(mpi_lib, "impi") == 0 || strcmp(mpi_lib, "mvapich2") == 0)
>     {
>         int provided;
>         MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>         if (provided == MPI_THREAD_MULTIPLE)
>         {
>             printf("in process: %d, file: %s, function: %s, line: %d, we have "
>                    "initialized the mpi library: %s with multi-threaded support.\n",
>                    my_rank, __FILE__, __FUNCTION__, __LINE__, mpi_lib);
>         }
>     }
>
>     MPI_Comm_size(MPI_COMM_WORLD, &numprocs);
>     MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
>
>     /* create an intra-communicator over the full group of MPI_COMM_WORLD */
>     MPI_Comm intra_communicator = MPI_COMM_WORLD;
>     int component_id = 0;
>     MPI_Comm thread_intra_communicator;
>     MPI_Group thread_group;
>
>     MPI_Comm_group(intra_communicator, &thread_group);
>     MPI_Comm_create(intra_communicator, thread_group,
>                     &thread_intra_communicator);
>
>     if (thread_intra_communicator == MPI_COMM_NULL)
>     {
>         printf("in process: %d, file: %s, function: %s, line: %d, failed to "
>                "create a new thread_intra_communicator for component id: %d\n",
>                my_rank, __FILE__, __FUNCTION__, __LINE__, component_id);
>     }
>     else
>     {
>         printf("in process: %d, file: %s, function: %s, line: %d, successfully "
>                "created a new thread_intra_communicator: %x, for component id: %d, "
>                "group id: %x, with number of processes: %d\n",
>                my_rank, __FILE__, __FUNCTION__, __LINE__,
>                thread_intra_communicator, component_id, thread_group, numprocs);
>     }
>
> The second entity is the intra-communicator handle, thread_intra_communicator.
>
>
> 0 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 1 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 2 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 3 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 4 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 5 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 6 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 7 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 8 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 9 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 10 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 11 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 12 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 13 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 14 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 15 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 16 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 17 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 18 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 19 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 20 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 21 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 22 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 23 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 24 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 25 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 26 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 27 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 28 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 29 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 30 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 31 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 32 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 33 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 34 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 35 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 36 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 37 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
> 38 successfully created a new thread_intra_communicator 84000004 group id 88000000 with number of processes 40
> 39 successfully created a new thread_intra_communicator 84000002 group id 88000000 with number of processes 40
>
>
>
> I would expect all of the thread_intra_communicator values to have the same
> context id, but they don't. The first number on each line is the rank.
>
>
> Thank you for the assistance.
>
>
>
> On Thu, Aug 28, 2014 at 12:54 PM, Akshay Venkatesh <
> akshay.v.3.14 at gmail.com> wrote:
>
>> David,
>>
>> The following will help:
>> - How many processes are being run on each node?
>> - At which point in the MPI program are you creating the communicator?
>> - What does the second entity (c4000000/84000004) in the output
>> correspond to?
>> - A reproducer would be necessary.
>>
>>
>> On Mon, Aug 25, 2014 at 11:45 PM, David Winslow <
>> david.winslow at serendipitynow.com> wrote:
>>
>>>
>>> We have 18 processes in the cluster and we make a single call to create an
>>> intra-communicator for a thread. Should we expect to see two different ids
>>> for it across the ranks? See the output from the print below.
>>>
>>> rank 0, c4000000, for component id 1, group id 88000000
>>> rank 1, 84000004, for component id 1, group id 88000000
>>> rank 2, c4000000, for component id 1, group id 88000000
>>> rank 3, 84000004, for component id 1, group id 88000000
>>> rank 4, c4000000, for component id 1, group id 88000000
>>> rank 5, 84000004, for component id 1, group id 88000000
>>> rank 6, c4000000, for component id 1, group id 88000000
>>> rank 7, 84000004, for component id 1, group id 88000000
>>> rank 8, c4000000, for component id 1, group id 88000000
>>> rank 9, 84000004, for component id 1, group id 88000000
>>> rank 10, c4000000, for component id 1, group id 88000000
>>> rank 11, 84000004, for component id 1, group id 88000000
>>> rank 12, c4000000, for component id 1, group id 88000000
>>> rank 13, 84000004, for component id 1, group id 88000000
>>> rank 14, c4000000, for component id 1, group id 88000000
>>> rank 15, 84000004, for component id 1, group id 88000000
>>> rank 16, c4000000, for component id 1, group id 88000000
>>> rank 17, 84000004, for component id 1, group id 88000000
>>>
>>>
>>> Thanks
>>> David
>>>
>>>
>>>
>>>
>>>
>>>
>>
>>
>> --
>> -Akshay
>>
>
>