[mvapich-discuss] Heterogeneous fabric, how to debug

Jeff Hanson jefhanson at gmail.com
Thu Nov 8 09:06:21 EST 2012


I have a test platform with two nodes, one with a dual-port card and one
with two single-port cards. With SGI MPT I can use all four ports, but with
MVAPICH2 I can't figure out how to use them. If I set export MV2_NUM_HCAS=2
and run

/data1/mvapich2/1.9a/bin/mpirun -np 4 -hosts admin,n013 -verbose
/data1/mvapich2/1.9a/osu-micro-benchmarks-3.7/libexec/osu-micro-benchmarks/mpi/collective/osu_allgather

This hangs at:
[mpiexec at admin] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec at admin] [pgid: 0] got PMI command: cmd=barrier_in
[mpiexec at admin] PMI response to fd 6 pid 5: cmd=barrier_out
[mpiexec at admin] PMI response to fd 7 pid 5: cmd=barrier_out
[proxy:0:0 at admin] PMI response: cmd=barrier_out
[proxy:0:0 at admin] PMI response: cmd=barrier_out
[proxy:0:1 at n013] got pmi command (from 4): barrier_in
[proxy:0:1 at n013] got pmi command (from 5): barrier_in
[proxy:0:1 at n013] forwarding command (cmd=barrier_in) upstream
[proxy:0:1 at n013] PMI response: cmd=barrier_out
[proxy:0:1 at n013] PMI response: cmd=barrier_out

Any suggestions on how to debug this further, or which variables to use?
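For reference, this is roughly what I've been doing to check that both
devices are visible before launching (ibv_devinfo comes from the libibverbs
utilities; the mlx4_0,mlx4_1 names below are placeholders for whatever the
actual device names turn out to be):

```shell
# List the local RDMA devices, if the verbs utilities are installed.
# (Tool availability and device names vary by system.)
if command -v ibv_devinfo >/dev/null 2>&1; then
  ibv_devinfo -l
else
  echo "ibv_devinfo not found; install libibverbs-utils"
fi

# Candidate MVAPICH2 multi-rail settings. MV2_IBA_HCA names the HCAs
# explicitly; the device names here are placeholders to substitute.
export MV2_NUM_HCAS=2
export MV2_IBA_HCA=mlx4_0,mlx4_1
echo "MV2_NUM_HCAS=$MV2_NUM_HCAS MV2_IBA_HCA=$MV2_IBA_HCA"
```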

Jeff Hanson