[mvapich-discuss] MPI_Comm_spawn always uses just 1 host

Rutger Hofman rutger at cs.vu.nl
Mon Feb 13 08:06:37 EST 2012


On 02/12/2012 05:42 PM, Jonathan Perkins wrote:
> On Sat, Feb 11, 2012 at 10:15 AM, Rutger Hofman <rutger at cs.vu.nl> wrote:
>> On 02/10/2012 11:26 PM, Jonathan Perkins wrote:
>>>
>>> On 02/10/2012 10:01 AM, Rutger Hofman wrote:
>>>>
>>>> Good afternoon list,
>>>>
>>>> I want to control the hosts on which processes are spawned by
>>>> MPI_Comm_spawn.
>>>>
>>>> But on our cluster (DAS4, http://www.das4.cs.vu.nl, RHEL6 with QDR
>>>> Mellanox InfiniBand), a call to MPI_Comm_spawn() does spawn processes,
>>>> but they all run on one single host (something like rank
>>>> (n+1)%size when I do -np n).
>>>>
>>>> Behaviour is the same whatever MPI_Info properties I specify to the
>>>> MPI_Comm_spawn() call:
>>>> "host" "node048"
>>>> "hostfile" "hostfile"
>>>> "hosts" "node048 node049 node050"
>>>> and so on. None of them generates an error; does that mean they are
>>>> actually valid?
>>>>
>>>> My mpirun_rsh incantation is something like
>>>> $ mpirun_rsh -ssh -np 3 -hostfile hostfile \
>>>> MV2_SUPPORT_DPM=1 ./spawn ./slave
>>>> and, as stated, the slave instances do run, but I have no control over where.
>>>>
>>>> I tried the mvapich2-1.6 packaged with our cluster, and a freshly
>>>> downloaded mvapich2-1.8a2. The behaviour is the same.
>>>>
>>>> In the source code I see a mention of -spawnfile. This option is
>>>> rejected by mpirun_rsh.
>>>>
>>>> So, my question: how can I specify the set of hosts that processes
>>>> should be spawned on?
>>>>
>>>
>>> Hello, in order to control which processes are spawned on which hosts
>>> you can modify your hostfile.
>>>
>>> In your mpirun_rsh invocation I see that you start out with 3 processes.
>>> If your hostfile contains
>>>
>>> nodeA
>>> nodeB
>>> nodeC
>>> nodeD
>>> nodeE
>>>
>>> then the first spawned process will be on nodeD and the next spawned
>>> process will be on nodeE. Let me know if this helps.
>>>
>>> --
>>> Jonathan Perkins
>>> http://www.cse.ohio-state.edu/~perkinjo
>>>
>>
>> Thanks, this is what I had figured out too.

Sorry, I must correct this. What I see is that *all* spawned processes
run on nodeD, even across multiple successive MPI_Comm_spawn() calls. So
what else can I do to create processes on the machines where I want them?
Would MPI_Comm_accept/connect work over InfiniBand?
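
To make the difference concrete, here is a minimal sketch in plain C (no
MPI calls) that puts the placement rule from your reply next to what I
actually observe. Both helper names are hypothetical; they only model
indexing into the hostfile and are not MVAPICH2 functions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the rule described in the quoted reply: with
 * `-np np`, the k-th spawned process (k = 0, 1, ...) goes to hostfile
 * entry np + k. Returns NULL when the hostfile has no such entry. */
const char *expected_spawn_host(const char *const hosts[], size_t nhosts,
                                size_t np, size_t k)
{
    assert(hosts != NULL);
    if (np >= nhosts || k >= nhosts - np)
        return NULL;
    return hosts[np + k];
}

/* Hypothetical model of what I observe: every spawned process lands on
 * the first hostfile entry after the initial np, whatever k is. */
const char *observed_spawn_host(const char *const hosts[], size_t nhosts,
                                size_t np, size_t k)
{
    assert(hosts != NULL);
    (void)k; /* the spawn index makes no difference in my runs */
    return np < nhosts ? hosts[np] : NULL;
}
```

With the five-entry hostfile from your example and -np 3, the first
function gives nodeD for k = 0 and nodeE for k = 1, whereas the second
gives nodeD for both.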

Rutger

