[mvapich-discuss] MVAPICH2 over IB: help with configuration needed

Hari Subramoni subramon at cse.ohio-state.edu
Sat Mar 13 22:20:28 EST 2010


Hi Lukasz,

Could you please give us some more information about the kind of system
you have (platform, OS, IB HCAs)?

I assume you have OFED installed on your system. If so, could you please
execute the 'ibstat' command on all the hosts you're trying to run
MVAPICH2 on and let us know what the output is? This will tell us the
state of each port (Active, Down, etc.).
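
If you are checking many hosts, the 'ibstat' output can be condensed to
one line per port. The following is only a sketch: the function name
summarize_ports is made up for illustration, and it assumes the usual
ibstat layout (a "CA '<name>'" line followed by "Port N:" and "State:"
lines). It reads ibstat output on stdin, so it needs no IB hardware itself.

```shell
# Hypothetical helper: condense `ibstat` output (read from stdin) to
# one line per port, in the form "<ca> port <n> <state>".
# Assumes the usual ibstat layout; adjust the patterns if yours differs.
summarize_ports() {
    awk -v q="'" '
        # "CA <quoted name>" starts a new adapter; strip the quotes.
        /^CA / { name = $2; gsub(q, "", name); ca = name }
        # "Port 1:" starts a new port; strip the trailing colon.
        /^[[:space:]]*Port [0-9]+:/ { port = $2; sub(/:/, "", port) }
        # "State: Active" is the logical port state. The separate
        # "Physical state:" line does not match this pattern.
        /^[[:space:]]*State:/ { print ca, "port", port, $2 }
    '
}
```

For example, run "ssh <hostname> ibstat | summarize_ports" once for each
host. Any port that does not report Active is worth a closer look.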

Also (if you have OFED installed), could you please run a native IB-level
latency test between any two of the nodes on which you're trying to run
MVAPICH2? The commands are as follows.

To start the server : ib_send_lat -a
To start the client : ib_send_lat -a <server hostname>

Thx,
Hari.

On Sat, 13 Mar 2010, Lukasz Majewski wrote:

> Dear list,
> I am stuck, unable to get MVAPICH2 to run over IB. I am out of ideas
> and can't find a good troubleshooting guide. Here is the error:
>
> Fatal error in MPI_Init:
> Other MPI error, error stack:
> MPIR_Init_thread(311)....: Initialization failed
> MPID_Init(191)...........: channel initialization failed
> MPIDI_CH3_Init(163)......:
> MPIDI_CH3I_RDMA_init(225):
> rdma_iba_hca_init(693)...: cannot create cq
> MPI process (rank: 0) terminated unexpectedly on ....
>
> Some output from troubleshooting suggestions found on this mailing list:
>
>  /sbin/lsmod | grep ib
> ib_iser                42240  0
> libiscsi               44928  1 ib_iser
> scsi_transport_iscsi    45088  2 ib_iser,libiscsi
> rdma_cm                40052  2 ib_iser,rdma_ucm
> ib_addr                15880  1 rdma_cm
> ib_ucm                 22696  0
> ib_uverbs              42416  2 rdma_ucm,ib_ucm
> ib_umad                23080  0
> ib_ipoib               95584  0
> ib_cm                  47784  3 rdma_cm,ib_ucm,ib_ipoib
> ib_sa                  32256  3 rdma_cm,ib_ipoib,ib_cm
> inet_lro               15872  1 ib_ipoib
> ib_mthca              147300  0
> ib_mad                 49960  4 ib_umad,ib_cm,ib_sa,ib_mthca
> ib_core                76928  12 ib_iser,rdma_ucm,rdma_cm,iw_cm,ib_ucm,ib_uverbs,ib_umad,ib_ipoib,ib_cm,ib_sa,ib_mthca,ib_mad
>
> ulimit -l
> 64
>
>
> Please help me configure the system correctly.
>
> Lucas Majewski
> Graduate Student at Illinois Institute of Technology
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>


