[mvapich-discuss] Heterogeneous fabric, how to debug

Devendar Bureddy bureddy at cse.ohio-state.edu
Fri Nov 9 13:48:50 EST 2012


Hi Jeff

Thanks for reporting this.  It seems there is an issue with running in
this heterogeneous mode.  We will fix this in a future release.
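
Until then, one way to narrow it down (a sketch only, using the MV2_IBA_HCA
parameter with an example device name, not a confirmed workaround for this
heterogeneous setup) is to pin the job to a single rail first and add rails
back one at a time:

# Example: restrict both nodes to one HCA; replace mlx4_0 with a device
# name reported by ibv_devinfo on your nodes
/data1/mvapich2/1.9a/bin/mpirun -np 4 -hosts admin,n013 \
    -genv MV2_NUM_HCAS 1 -genv MV2_IBA_HCA mlx4_0 \
    /data1/mvapich2/1.9a/osu-micro-benchmarks-3.7/libexec/osu-micro-benchmarks/mpi/collective/osu_allgather

If that runs cleanly, raising MV2_NUM_HCAS back to 2 isolates the hang to the
multi-rail path.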

Thanks
Devendar


On Thu, Nov 8, 2012 at 9:06 AM, Jeff Hanson <jefhanson at gmail.com> wrote:

> I have a test platform with two nodes: one with a dual-port card and one
> with two single-port cards. With SGI MPT I can use all four ports, but with
> MVAPICH2 I can't figure out how to use them.  If I set export MV2_NUM_HCAS=2
> and run
>
> /data1/mvapich2/1.9a/bin/mpirun -np 4 -hosts admin,n013 -verbose
> /data1/mvapich2/1.9a/osu-micro-benchmarks-3.7/libexec/osu-micro-benchmarks/mpi/collective/osu_allgather
>
> This hangs at
> [mpiexec at admin] [pgid: 0] got PMI command: cmd=barrier_in
> [mpiexec at admin] [pgid: 0] got PMI command: cmd=barrier_in
> [mpiexec at admin] PMI response to fd 6 pid 5: cmd=barrier_out
> [mpiexec at admin] PMI response to fd 7 pid 5: cmd=barrier_out
> [proxy:0:0 at admin] PMI response: cmd=barrier_out
> [proxy:0:0 at admin] PMI response: cmd=barrier_out
> [proxy:0:1 at n013] got pmi command (from 4): barrier_in
> [proxy:0:1 at n013] got pmi command (from 5): barrier_in
> [proxy:0:1 at n013] forwarding command (cmd=barrier_in) upstream
> [proxy:0:1 at n013] PMI response: cmd=barrier_out
> [proxy:0:1 at n013] PMI response: cmd=barrier_out
>
> Any suggestions on how to debug this further, or which variables to use?
>
> Jeff Hanson
>
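
Regarding the question about what to check next: a minimal sketch (the device
name mlx4_0 is a placeholder, and ibv_devinfo / ibv_rc_pingpong are generic
libibverbs tools rather than MVAPICH2-specific ones) is to verify each HCA and
port independently before involving MPI:

# On each node, list the HCAs and confirm every port is Active with a LID
ibv_devinfo | egrep 'hca_id|port:|state|lid'

# Exercise one HCA at a time with the libibverbs ping-pong test
ibv_rc_pingpong -d mlx4_0          # server side, run on n013
ibv_rc_pingpong -d mlx4_0 n013     # client side, run on admin

If every rail passes on its own, the hang points at the multi-rail setup path
rather than the fabric itself.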


-- 
Devendar