[mvapich-discuss] MPI_Comm_spawn always uses just 1 host

Rutger Hofman rutger at cs.vu.nl
Mon Feb 13 08:45:23 EST 2012


On 02/13/2012 02:06 PM, Rutger Hofman wrote:
> On 02/12/2012 05:42 PM, Jonathan Perkins wrote:
>> On Sat, Feb 11, 2012 at 10:15 AM, Rutger Hofman <rutger at cs.vu.nl> wrote:
>>> On 02/10/2012 11:26 PM, Jonathan Perkins wrote:
>>>>
>>>> On 02/10/2012 10:01 AM, Rutger Hofman wrote:
>>>>>
>>>>> Good afternoon list,
>>>>>
>>>>> I want to control the hosts on which processes are spawned by
>>>>> MPI_Comm_spawn.
>>>>>
>>>>> But in our cluster (DAS4, http://www.das4.cs.vu.nl, RHEL6 with QDR
>>>>> Mellanox InfiniBand), a call to MPI_Comm_spawn() does spawn processes,
>>>>> but they always all run on one single host (something like rank
>>>>> (n+1)%size when I do -np n).
>>>>>
>>>>> Behaviour is the same whatever MPI_Info properties I specify to the
>>>>> MPI_Comm_spawn() call:
>>>>> "host" "node048"
>>>>> "hostfile" "hostfile"
>>>>> "hosts" "node048 node049 node050"
>>>>> etc etc. None of them generates an error; does that mean they are
>>>>> actually valid?
>>>>>
>>>>> My mpirun_rsh incantation is something like
>>>>> $ mpirun_rsh -ssh -np 3 -hostfile hostfile \
>>>>> MV2_SUPPORT_DPM=1 ./spawn ./slave
>>>>> and, as stated, slave instances do run, only I have no control over where.
>>>>>
>>>>> I tried the mvapich2-1.6 packaged with our cluster, and a freshly
>>>>> downloaded mvapich2-1.8a2. The behaviour is the same.
>>>>>
>>>>> In the source code I see mention of -spawnfile. This option is
>>>>> rejected by mpirun_rsh.
>>>>>
>>>>> So, my question: how can I specify the set of hosts that processes
>>>>> should be spawned on?
>>>>>
>>>>
>>>> Hello, in order to control which processes are spawned on which hosts
>>>> you can modify your hostfile.
>>>>
>>>> In your mpirun_rsh invocation I see that you start out with 3
>>>> processes. If your hostfile contains
>>>>
>>>> nodeA
>>>> nodeB
>>>> nodeC
>>>> nodeD
>>>> nodeE
>>>>
>>>> then the first spawned process will be on nodeD and the next spawned
>>>> process will be on nodeE. Let me know if this helps.
>>>>
>>>> --
>>>> Jonathan Perkins
>>>> http://www.cse.ohio-state.edu/~perkinjo
>>>>
>>>
>>> Thanks, this is what I had figured out too.
>
> Sorry, I must correct this. What I see is that *all* spawned processes
> run on nodeD, even from multiple subsequent MPI_Comm_spawn() calls. So,
> what else can I do to create processes on machines where I want them?
> Would MPI_Comm_accept/MPI_Comm_connect work over InfiniBand?

One more note: I tried mpiexec.hydra, and with it spawned processes are
placed as you describe.
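
For reference, the placement rule described earlier in this thread (the
initial processes take the first hostfile lines, and each spawned process
takes the next unused line) can be sketched as a small model in C. This
is only an illustration under that assumption. spawn_host() is a
hypothetical helper, not part of MVAPICH2, mpirun_rsh, or MPI, and it
says nothing about what a launcher does once the hostfile is exhausted.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not an MVAPICH2 or MPI function): models the
 * placement rule described in this thread. The initial `initial_np`
 * processes occupy the first `initial_np` hostfile lines; the k-th
 * spawned process (k = 0, 1, ...) gets the next unused line.
 * Returns NULL when no hostfile line is left for that process.
 */
const char *spawn_host(const char *const hosts[], size_t nhosts,
                       size_t initial_np, size_t k)
{
    size_t idx = initial_np + k;

    /* Reject arithmetic wrap-around and indices past the hostfile. */
    if (idx < initial_np || idx >= nhosts)
        return NULL;
    return hosts[idx];
}
```

With the five-line hostfile nodeA..nodeE and -np 3, this model places
spawned process 0 on nodeD and spawned process 1 on nodeE. The
mpirun_rsh behaviour reported above differs from this model: there,
every spawned process ends up on nodeD.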

Rutger Hofman
VU Amsterdam


