[mvapich-discuss] troubles in running MPI job over RoCE, with mvapich2-1.6 shipped with OFED1.5.3.2

Devendar Bureddy bureddy at cse.ohio-state.edu
Thu Feb 21 10:08:02 EST 2013


Hi Devesh

Can you check the following things?

- In mvapich2-1.6, the run-time parameter for RoCE support is MV2_USE_RDMAOE.
It was only renamed to MV2_USE_RoCE later, in mvapich2-1.8, so the MV2_USE_RoCE
name will not be recognized by 1.6. See the example command after the next point.

 - I'm not sure if this is a copy/paste issue, but the way to specify run-time
parameters is "<param_name>=<param_value>", i.e. with "=" rather than "-":
   MV2_USE_RoCE-1     ===>  MV2_USE_RoCE=1
   MV2_USE_RDMA_CM-1  ===>  MV2_USE_RDMA_CM=1
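
For reference, a corrected invocation for your setup might look like this
(just a sketch, assuming the same hostfile and paths as in your command below;
with mpirun_rsh the VAR=value settings go after the options and before the
executable):

   /usr/mpi/gcc/mvapich2-1.6/bin/mpirun_rsh -ssh -np 2 \
       -hostfile /opt/Work/hostfile \
       MV2_USE_RDMAOE=1 MV2_USE_RDMA_CM=1 \
       /usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2/IMB-MPI1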


-Devendar

On Thu, Feb 21, 2013 at 9:35 AM, Devesh Sharma <devesh28 at gmail.com> wrote:

> Hi list,
>
> I am trying to run a simple mpi job over a 2 node cluster with RoCE
> adapter and OFED-1.5.3.2. I am facing following error. Please help
>
> [root at neo01 IMB-3.2]# /usr/mpi/gcc/mvapich2-1.6/bin/mpirun_rsh -ssh
> -debug -np 2 MV2_USE_RoCE-1 MV2_USE_RDMA_CM-1 -hostfile /opt/Work/hostfile
> /usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2/IMB-MPI1
> execv: No such file or directory
> /usr/bin/xterm -e /usr/bin/ssh -q MV2_USE_RoCE-1 cd
> /usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2; /usr/bin/env
> MPISPAWN_MPIRUN_MPD=0 USE_LINEAR_SSH=1 MPISPAWN_MPIRUN_HOST=neo01
> MPIRUN_RSH_LAUNCH=1 MPISPAWN_CHECKIN_PORT=53250 MPISPAWN_MPIRUN_PORT=53250
> MPISPAWN_NNODES=2 MPISPAWN_GLOBAL_NPROCS=2 MPISPAWN_MPIRUN_ID=23270
> MPISPAWN_ARGC=3 MPISPAWN_ARGV_0=/usr/bin/gdb
> MPDMAN_KVS_TEMPLATE=kvs_255_neo01_23270 MPISPAWN_LOCAL_NPROCS=1
> MPISPAWN_ARGV_1=-hostfile MPISPAWN_ARGV_2=/opt/Work/hostfile
> MPISPAWN_ARGV_3=/usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2/IMB-MPI1
> MPISPAWN_GENERIC_ENV_COUNT=0  MPISPAWN_ID=0
> MPISPAWN_WORKING_DIR=/usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2
> MPISPAWN_MPIRUN_RANK_0=0 MPISPAWN_VIADEV_DEFAULT_PORT_0=-1
> /usr/mpi/gcc/mvapich2-1.6/bin/mpispawn 0 execv: No such file or directory
> (null) I��H��|5 (null)
> /usr/bin/xterm -e /usr/bin/ssh -q MV2_USE_RDMA_CM-1 cd
> /usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2; /usr/bin/env
> MPISPAWN_MPIRUN_MPD=0 USE_LINEAR_SSH=1 MPISPAWN_MPIRUN_HOST=neo01
> MPIRUN_RSH_LAUNCH=1 MPISPAWN_CHECKIN_PORT=53250 MPISPAWN_MPIRUN_PORT=53250
> MPISPAWN_NNODES=2 MPISPAWN_GLOBAL_NPROCS=2 MPISPAWN_MPIRUN_ID=23270
> MPISPAWN_ARGC=3 MPISPAWN_ARGV_0=/usr/bin/gdb
> MPDMAN_KVS_TEMPLATE=kvs_255_neo01_23270 MPISPAWN_LOCAL_NPROCS=1
> MPISPAWN_ARGV_1=-hostfile MPISPAWN_ARGV_2=/opt/Work/hostfile
> MPISPAWN_ARGV_3=/usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2/IMB-MPI1
> MPISPAWN_GENERIC_ENV_COUNT=0  MPISPAWN_ID=1
> MPISPAWN_WORKING_DIR=/usr/mpi/gcc/mvapich2-1.6/tests/IMB-3.2
> MPISPAWN_MPIRUN_RANK_0=1 MPISPAWN_VIADEV_DEFAULT_PORT_0=-1
> /usr/mpi/gcc/mvapich2-1.6/bin/mpispawn 0 (null) I��H��|5 (null)
> child_handler: Error in init phase...wait for cleanup! (0/2mpispawn
> connections)
> child_handler: Error in init phase...wait for cleanup! (0/2mpispawn
> connections)
>
> -Best Regards
>  Devesh
>
> --
> Please don't print this E-mail unless you really need to - this will
> preserve trees on planet earth.

