[mvapich-discuss] Re: Errors running mvapich2

Amit H Kumar AHKumar at odu.edu
Sat Sep 30 13:17:39 EDT 2006


Hi Abhinav,

Please see my inline comments below, and thank you for your feedback.


Abhinav Vishnu <vishnu at cse.ohio-state.edu> wrote on 09/30/2006 01:10:15 PM:

>
> > Using MVAPICH2-0-9.5
> > I am trying to run the osu_latency benchmark and end up with the following
> > error. I am running it as a user; could this be some kind of permission
> > problem?
> > I am using SilverStorm InfinIO 3000 switch and its VAPI library.
> >
> > #> mpiexec -machinefile machines -n 3 ./osu_latency.mvapich
> > [rdma_iba_priv.c:586] error(-246): cannot query HCA
> > rank 0 in job 3  compute-0-10.local_33860   caused collective abort of all ranks
> >   exit status of rank 0: killed by signal 9
> >
>
> I think it looks like a problem due to the InfiniBand drivers not being
> up. There is a verbs level utility perf_main which is used for
> communication below MPI layer. Are you able to communicate using
> perf_main? Please let us know the outcome.

I am not sure this output makes sense. I ran perf_main between two nodes and
got an implausibly high bandwidth figure.


RECEIVER OUTPUT
===============

 ./perf_main -a172.25.22.254 --ibp=2

********************************************
*********  perf_main version 9.6   *********
*********  CPU is: 2190.82 Mcps     *********
********************************************



************* RC BW Unidirection Test started for port 2 *********************


************* RC BW Unidirection Test Finished for port 2 *********************


SENDER OUTPUT
=============

./perf_main --send -trc -mbw -s1048576 -n1000 --ibp=2

********************************************
*********  perf_main version 9.6   *********
*********  CPU is: 2190.90 Mcps     *********
********************************************



************* RC BW Unidirection Test started for port 2 *********************

BW: 2190898944000.0 MBytes/sec [size: 1048576 bytes, iter: 1000, total 1048576000]

************* RC BW Unidirection Test Finished for port 2 *********************


Is this output plausible?
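
A rough sanity check on the sender's BW line, sketched in Python. It only
reuses the numbers printed above; how perf_main actually computes BW is not
known here. Two things are assumptions: that 1 MByte means 2**20 bytes, and
that the link is 4x SDR InfiniBand (8 Gbit/s of data), which is used purely as
a point of comparison.

```python
# Values copied from the sender output above.
reported_bw_mb_per_s = 2190898944000.0  # the "BW:" line
total_bytes = 1048576000                # the "total" field
cpu_mhz = 2190.90                       # the "CPU is: ... Mcps" line

# Assumption: 1 MByte = 2**20 bytes.
total_mb = total_bytes / 2**20

# Elapsed time that the reported bandwidth would imply for this transfer.
implied_seconds = total_mb / reported_bw_mb_per_s

# The same interval expressed in CPU clock cycles.
implied_cycles = implied_seconds * cpu_mhz * 1e6

# Assumption: 4x SDR link, 8 Gbit/s of payload after 8b/10b encoding.
link_limit_mb_per_s = 8e9 / 8 / 2**20

print(f"transfer size:       {total_mb:.0f} MBytes")
print(f"implied elapsed:     {implied_seconds:.3e} s")
print(f"implied CPU cycles:  {implied_cycles:.3f}")
print(f"assumed link limit:  {link_limit_mb_per_s:.1f} MBytes/sec")
print(f"reported / limit:    {reported_bw_mb_per_s / link_limit_mb_per_s:.3e}")
```

Under those assumptions the reported figure corresponds to an elapsed time of
roughly one CPU clock cycle for the entire transfer, which suggests the timed
interval did not cover any real data movement.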

Thank you,
Amit



>
> Thanks and regards,
>
> -- Abhinav
>
> >
> > Thank you for any feedback or help,
> > -Amit
> > =================================================
> >
> >
>


