[mvapich-discuss] MPI_Bcast

Mayhall, Anthony J. (MSFC-ES53)[TBE-1] anthony.j.mayhall at nasa.gov
Wed Nov 18 15:55:30 EST 2009


We have 12 cores per system.  No shared memory transfers occur until  
the broadcast completes.  I have tried some of the options and haven't  
seen any real effect in the timing.
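
For reference, the loop below is a minimal sketch of how the broadcast
latencies quoted further down (46us on 2 nodes vs. 175us on 10 nodes for
40K) could be measured.  It is not from the original posts; the iteration
count, hostfile name, and launch line are assumptions:

    /* bcast_time.c - time a 40 KB MPI_Bcast from rank 0, with one process
     * per node as described in this thread.  Build with the MVAPICH2 mpicc
     * and launch with something like:
     *     mpirun_rsh -np 10 -hostfile hosts ./bcast_time
     */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main(int argc, char **argv)
    {
        const int bytes = 40 * 1024;   /* the 40K message size discussed */
        const int iters = 1000;
        char *buf = malloc(bytes);
        double t0, t1;
        int rank, i;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        /* one untimed warm-up broadcast, then a synchronized timing loop */
        MPI_Bcast(buf, bytes, MPI_CHAR, 0, MPI_COMM_WORLD);
        MPI_Barrier(MPI_COMM_WORLD);

        t0 = MPI_Wtime();
        for (i = 0; i < iters; i++)
            MPI_Bcast(buf, bytes, MPI_CHAR, 0, MPI_COMM_WORLD);
        t1 = MPI_Wtime();

        if (rank == 0)
            printf("average MPI_Bcast time: %.2f us\n",
                   (t1 - t0) * 1e6 / iters);

        free(buf);
        MPI_Finalize();
        return 0;
    }

Averaging back-to-back broadcasts like this lets successive calls overlap
a little, so the numbers can come out somewhat lower than a
barrier-per-iteration measurement.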


Anthony Mayhall
256-684-1094

On Nov 18, 2009, at 2:49 PM, "Dhabaleswar Panda" <panda at cse.ohio-state.edu> wrote:

> Anthony - Since you are using additional executables which are also
> communicating via shared memory, I am wondering whether the performance
> is getting degraded by concurrent communication over shared memory. As
> Krishna indicated, the latest version uses a shared-memory broadcast.
> How many cores per node does your system have? Do you see this behavior
> with fewer than 10 nodes?
>
> Please note that in MVAPICH2 1.4, we have introduced multiple runtime
> parameters to select the broadcast scheme for a given environment - a
> pure pt-to-pt-based scheme vs. a shared-memory-based scheme. Details are
> available at the following location in the MVAPICH2 user guide. You can
> try some of these options and let us know whether the problem gets
> resolved on your set-up.
>
> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.4.html#x1-8600010.8
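
For example, a launch line toggling between the two schemes might look
like the lines below; this is only a sketch, and the exact MV2_* variable
name should be checked against the user guide section linked above:

    # shared-memory based broadcast vs. pt-to-pt based broadcast
    mpirun_rsh -np 10 -hostfile hosts MV2_USE_SHMEM_BCAST=1 ./app
    mpirun_rsh -np 10 -hostfile hosts MV2_USE_SHMEM_BCAST=0 ./app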
>
> Thanks,
>
> DK
>
>
>
> On Wed, 18 Nov 2009, Mayhall, Anthony J. (MSFC-ES53)[TBE-1] wrote:
>
>> We have only one executable on each node that is executed by
>> mpirun_rsh.  The root broadcasts to only one executable on each of the
>> other nodes.  We have other executables that communicate with those via
>> shared memory, but those are not run using mpirun_rsh.  The numbers we
>> are seeing are not in comparison to other MVAPICH builds.  We need a
>> faster method of doing the broadcast.  It looks like the mcast option
>> may work better for our app.  We will try it and see.
>>
>> Thanks,
>>
>>
>> Anthony Mayhall
>> 256-684-1094
>>
>> On Nov 18, 2009, at 12:35 PM, "Krishna Chaitanya Kandalla" <kandalla at cse.ohio-state.edu> wrote:
>>
>>> Anthony,
>>>       Previously, we had explored the IB multicast option to
>>> implement a few of the collectives in MVAPICH. However, we found
>>> that it was not a scalable option. In MVAPICH2, we implement all
>>> collective operations using either point-to-point or shared-memory
>>> based algorithms.
>>>       Getting back to your question, for the message size that you
>>> have mentioned (40K), we currently use a shared-memory based
>>> algorithm to implement the MPI_Bcast operation. Which earlier
>>> version of MVAPICH/MVAPICH2 are you comparing these results with,
>>> and what performance difference are you observing? Also, you
>>> mentioned having to broadcast to one executable on each node; does
>>> your job involve running different executables on each node, or do
>>> you mean that just one process is running on each node?
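
As a purely generic illustration of what a hierarchical, shared-memory
friendly broadcast looks like (this is not the MVAPICH2 source and it
glosses over the library's actual algorithms): the data goes first to one
leader process per node over the network, and each leader then
re-broadcasts it to the ranks on its own node, the stage that can be
served from shared memory.  The hostname-hash node grouping below is an
assumption made for the sketch:

    /* two_level_bcast.c - generic two-level broadcast sketch */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* crude hash of the host name so ranks on the same node share a color
     * (hash collisions between different hosts are ignored here) */
    static int node_color(void)
    {
        char name[256];
        unsigned int h = 0;
        int i;
        gethostname(name, sizeof(name));
        name[sizeof(name) - 1] = '\0';
        for (i = 0; name[i] != '\0'; i++)
            h = 31u * h + (unsigned char)name[i];
        return (int)(h & 0x7fffffff);
    }

    int main(int argc, char **argv)
    {
        char buf[40 * 1024];               /* the 40K payload discussed */
        MPI_Comm node_comm, leader_comm;
        int world_rank, node_rank, is_leader;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

        /* group ranks by node; world rank 0 ends up as its node's leader
         * and as rank 0 of the leader communicator */
        MPI_Comm_split(MPI_COMM_WORLD, node_color(), world_rank, &node_comm);
        MPI_Comm_rank(node_comm, &node_rank);
        is_leader = (node_rank == 0);
        MPI_Comm_split(MPI_COMM_WORLD, is_leader ? 0 : 1, world_rank,
                       &leader_comm);

        if (world_rank == 0)
            memset(buf, 'x', sizeof(buf));

        /* stage 1: broadcast across node leaders (inter-node network) */
        if (is_leader)
            MPI_Bcast(buf, (int)sizeof(buf), MPI_CHAR, 0, leader_comm);

        /* stage 2: broadcast within each node; this is the part an MPI
         * library can carry out through shared memory */
        MPI_Bcast(buf, (int)sizeof(buf), MPI_CHAR, 0, node_comm);

        if (world_rank != 0 && buf[0] == 'x' && buf[sizeof(buf) - 1] == 'x')
            printf("rank %d received the payload\n", world_rank);

        MPI_Comm_free(&node_comm);
        MPI_Comm_free(&leader_comm);
        MPI_Finalize();
        return 0;
    }

With only one process per node, as in this thread, the intra-node stage is
empty and the whole cost is the inter-node stage.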
>>>
>>> Thanks,
>>> Krishna
>>>
>>>
>>>
>>>
>>> Mayhall, Anthony J. (MSFC-ES53)[TBE-1] wrote:
>>>> How is MPI_Bcast implemented in MVAPICH 1.4?  Can it use IB
>>>> multicast?  If so, how do you turn that on?  MPI_Bcast is currently
>>>> taking our apps a lot longer when broadcasting to 10 nodes (175us
>>>> for 40K) than to 2 nodes (46us for 40K).  We are only broadcasting
>>>> to one executable on each node.  A multicast should take the same
>>>> amount of time regardless of the number of nodes, wouldn't it?  I
>>>> used the multicast_test.c code to test IB multicast, and the timing
>>>> seems largely independent of how many nodes are used.  Can the Bcast
>>>> buffer be broken into multiple 2048-byte transfers in MPI_Bcast so
>>>> that it can use IB multicast?  I thought I saw in a white paper that
>>>> these methods were being taken advantage of in MVAPICH.
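
The arithmetic behind that fragmentation question is simply 40*1024 / 2048
= 20 pieces.  A generic sketch of the splitting (this is not MVAPICH
internals; send_fragment() is a hypothetical stand-in for whatever
multicast transport would carry each piece) might be:

    #include <stdio.h>
    #include <stddef.h>

    #define MCAST_MTU 2048   /* per-packet payload limit being asked about */

    /* stand-in for a real multicast send; here it only counts fragments */
    static int fragments_sent = 0;
    static void send_fragment(const char *frag, size_t len)
    {
        (void)frag;
        (void)len;
        fragments_sent++;
    }

    /* split an nbytes buffer into MTU-sized pieces and hand each one to
     * the transport */
    static void bcast_in_fragments(const char *buf, size_t nbytes)
    {
        size_t off;
        for (off = 0; off < nbytes; off += MCAST_MTU) {
            size_t len = nbytes - off;
            if (len > MCAST_MTU)
                len = MCAST_MTU;
            send_fragment(buf + off, len);
        }
    }

    int main(void)
    {
        static char payload[40 * 1024];          /* the 40K message */
        bcast_in_fragments(payload, sizeof(payload));
        printf("%d fragments of up to %d bytes\n", fragments_sent, MCAST_MTU);
        return 0;   /* prints: 20 fragments of up to 2048 bytes */
    }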
>>>>
>>>> Or are we just doing something wrong?
>>>>
>>>> Thanks,
>>>>
>>>> Anthony Mayhall
>>>> Davidson Technologies, Inc.
>>>> (256)544-7620
>>>>
>>>>
>>>>
>>
>


