[mvapich-discuss] MPI_Bcast
Mayhall, Anthony J. (MSFC-ES53)[TBE-1]
anthony.j.mayhall at nasa.gov
Wed Nov 18 13:47:19 EST 2009
We run only one executable on each node under mpirun_rsh, so the
root broadcasts to exactly one process on each of the other nodes.
Other executables on each node communicate with that process via
shared memory, but they are not launched with mpirun_rsh. The numbers
we are seeing are not a comparison against other mvapich builds; we
simply need a faster way to do the broadcast. It looks like the mcast
option may work better for our app. We will try it and see.
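
The 46us and 175us figures quoted below are roughly what a tree of
point-to-point sends would predict. The following is a minimal sketch
in C, assuming a binomial-tree broadcast (an assumption; the thread
only says that point-to-point or shared-memory algorithms are used).
The function names bcast_rounds and bcast_estimate_us are hypothetical,
and the model ignores pipelining and contention.

```c
#include <assert.h>

/* Rounds needed by a binomial-tree broadcast to reach nprocs processes.
 * In each round every process that already holds the data forwards it
 * to one new process, so the number of holders doubles. The result is
 * ceil(log2(nprocs)). */
int bcast_rounds(int nprocs)
{
    assert(nprocs >= 1);
    int rounds = 0;
    int holders = 1;            /* only the root has the data at first */
    while (holders < nprocs) {
        holders *= 2;
        rounds++;
    }
    return rounds;
}

/* Rough latency estimate in microseconds: rounds times the cost of one
 * hop. hop_us is an assumed per-hop time, not a measured constant. */
double bcast_estimate_us(int nprocs, double hop_us)
{
    return bcast_rounds(nprocs) * hop_us;
}
```

Taking the 2-node time of 46us as the per-hop cost,
bcast_estimate_us(10, 46.0) corresponds to 4 rounds, or 184us, which is
in the same range as the 175us observed for 10 nodes.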
Thanks,
Anthony Mayhall
256-684-1094
On Nov 18, 2009, at 12:35 PM, "Krishna Chaitanya Kandalla" <kandalla at cse.ohio-state.edu> wrote:
> Anthony,
> Previously, we had explored the IB multicast option to
> implement a few of the collectives in MVAPICH. However, we found
> that it was not a scalable option. In MVAPICH2, we implement all
> collective operations using either point-to-point or shared-memory
> based algorithms.
> Getting back to your question, for the message sizes that you
> have mentioned (40K), we are currently using a shared-memory based
> algorithm to implement the MPI_Bcast operation. Which earlier
> version of MVAPICH/MVAPICH2 are you comparing these results with and
> what is the performance difference that you are observing? Also, you
> mentioned having to broadcast to one executable on each node. Does
> your job involve running different executables on each node, or do
> you mean that you have just one process running on each node?
>
> Thanks,
> Krishna
>
> Mayhall, Anthony J. (MSFC-ES53)[TBE-1] wrote:
>> How is MPI_Bcast implemented in mvapich 1.4? Can it use IB
>> multicast, and if so, how do you turn that on? Our apps currently
>> take much longer to broadcast 40K with MPI_Bcast to 10 nodes
>> (175us) than to 2 nodes (46us). We are only broadcasting to one
>> executable on each node. A multicast should take the same amount
>> of time regardless of the number of nodes, shouldn't it? I used the
>> multicast_test.c code to test IB multicast, and the timing does not
>> seem to depend much on how many nodes are used. Can MPI_Bcast
>> break the buffer into multiple 2048-byte transfers in order to use
>> IB multicast? I thought I saw in a white paper that mvapich takes
>> advantage of these methods.
>>
>> Or are we just doing something wrong?
>>
>> Thanks,
>>
>> Anthony Mayhall
>> Davidson Technologies, Inc.
>> (256)544-7620
>>
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>>
>>
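
The question above about breaking the Bcast buffer into 2048-byte
transfers comes down to simple chunking arithmetic. The following is a
minimal sketch in C of how such a split could be planned; it is not
how mvapich does it. The names chunk_count and chunk_span are
hypothetical, and whether "40K" means 40960 or 40000 bytes is an
assumption either way.

```c
#include <assert.h>
#include <stddef.h>

/* Number of fixed-size chunks needed to carry total_bytes
 * (integer ceiling of total_bytes / chunk_bytes). */
size_t chunk_count(size_t total_bytes, size_t chunk_bytes)
{
    assert(chunk_bytes > 0);
    return (total_bytes + chunk_bytes - 1) / chunk_bytes;
}

/* Offset and length of chunk number idx within the buffer.
 * Every chunk is chunk_bytes long except possibly the last one. */
void chunk_span(size_t total_bytes, size_t chunk_bytes, size_t idx,
                size_t *offset, size_t *length)
{
    assert(idx < chunk_count(total_bytes, chunk_bytes));
    *offset = idx * chunk_bytes;
    size_t remaining = total_bytes - *offset;
    *length = remaining < chunk_bytes ? remaining : chunk_bytes;
}
```

For a 40960-byte buffer this gives 20 full chunks of 2048 bytes; for
40000 bytes it also gives 20 chunks, the last one 1088 bytes long.
Since IB multicast is carried over unreliable datagrams, a real
implementation would also need to detect and resend lost chunks, which
this sketch does not attempt.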