[mvapich-discuss] MVAPICH2 with CUDA build problems

Devendar Bureddy bureddy at cse.ohio-state.edu
Tue Mar 20 14:22:56 EDT 2012


Hi Jens

CUDA features in MVAPICH2 are not supported with the QLogic PSM interface.
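In other words, the CUDA transfer path is tied to the OFA (ch3:mrail) channel, so a PSM build and a CUDA build have to be configured separately. A rough sketch of the two distinct configurations (install prefixes and CUDA paths here are illustrative, adjust for your system):

```shell
# Option 1: default OFA (ch3:mrail) channel with CUDA support.
# This is the build that works on Mellanox HCAs, not on QLogic cards.
./configure --prefix=$HOME/mvapich2-cuda \
    --enable-cuda \
    --with-cuda-include=/usr/local/cuda/include \
    --with-cuda-libpath=/usr/local/cuda/lib64 \
    --enable-shared

# Option 2: QLogic PSM channel, without the CUDA options.
./configure --prefix=$HOME/mvapich2-psm \
    --with-device=ch3:psm \
    --enable-shared
```

Combining --with-device=ch3:psm with --enable-cuda is what leads to the compile errors below, since the CUDA symbols only exist in the mrail code path.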

-Devendar

On Tue, Mar 20, 2012 at 12:08 PM, Jens Glaser <jglaser at umn.edu> wrote:
> Hi,
>
> I am trying to use MVAPICH2 with CUDA RDMA support on our linux cluster.
> When I do
>
> ./configure --prefix=/home/it1/glaser/mpich2-install --enable-cuda --with-cuda-include=/usr/local/cuda --with-cuda-libpath=/usr/local/cuda/lib64/ --enable-shared --with-ib-libpath=/usr/lib64
>
> and then run "make" and "make install", I obtain an MPICH2 installation; however, it crashes when initializing MPI in my program, with the error message:
>
> [cas002:mpi_rank_0][rdma_find_network_type] QLogic IB card detected in system
> [cas002:mpi_rank_0][rdma_find_network_type] Please re-configure the library with the '--with-device=ch3:psm' configure option for best performance
> [cas002:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
> Segmentation fault (core dumped)
>
> If I follow the advice, and configure with
>
> ./configure --prefix=/home/it1/glaser/mpich2-install --enable-cuda --with-cuda-include=/usr/local/cuda --with-cuda-libpath=/usr/local/cuda/lib64/ --enable-shared --with-ib-libpath=/usr/lib64 --with-device=ch3:psm
>
> and run "make", I get the following compile error:
>
> make[4]: Entering directory `/home/it1/glaser/mvapich2-1.8a2/src/mpid/ch3/src'
>  CC              ch3u_buffer.c
>  CC              ch3u_comm_spawn_multiple.c
>  CC              ch3u_handle_connection.c
>  CC              ch3u_handle_recv_pkt.c
> ch3u_handle_recv_pkt.c: In function 'MPIDI_CH3U_Receive_data_found':
> ch3u_handle_recv_pkt.c:292: error: 'rdma_enable_cuda' undeclared (first use in this function)
> ch3u_handle_recv_pkt.c:292: error: (Each undeclared identifier is reported only once
> ch3u_handle_recv_pkt.c:292: error: for each function it appears in.)
> ch3u_handle_recv_pkt.c:346: error: 'MPID_Request' has no member named 'mrail'
> ch3u_handle_recv_pkt.c:346: error: 'DEVICE_TO_DEVICE' undeclared (first use in this function)
>
> Does anyone have an idea how to make it work?
> Just as a side note, on a different cluster (without Mellanox IB adapters) I have successfully been able to use MVAPICH2 with CUDA support.
>
> thanks
> Jens
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss



-- 
Devendar


