[mvapich-discuss] segmentation fault with GPUDirect

Jiri Kraus jkraus at nvidia.com
Mon Apr 11 13:18:40 EDT 2016


Hi Enrico,

Are you enabling the CUDA-aware features by setting the environment variable MV2_USE_CUDA=1?
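
In case it helps, here is a minimal sketch of what that looks like, assuming a bash-like shell. The launcher lines in the comments are only illustrations: host1 and host2 are hypothetical host names, and whether you launch through srun or mpirun_rsh depends on your setup.

```shell
# Enable CUDA-aware communication in MVAPICH2 for this shell session.
export MV2_USE_CUDA=1

# Sanity check: confirm the variable is exported to child processes.
env | grep '^MV2_USE_CUDA='

# Then rerun the device-to-device test, for example (assumed launch lines):
#   srun -n 2 ./osu_bibw -d cuda D D
#   mpirun_rsh -np 2 host1 host2 MV2_USE_CUDA=1 ./osu_bibw -d cuda D D
```

If the variable is not set in the environment of the MPI processes, device buffers passed to MPI calls are treated as host pointers, which would be consistent with H H working and D D crashing.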

Jiri

> -----Original Message-----
> From: mvapich-discuss [mailto:mvapich-discuss-bounces at cse.ohio-state.edu]
> On Behalf Of Enrico Calore
> Sent: Monday, 11 April 2016 10:12
> To: mvapich-discuss at cse.ohio-state.edu
> Subject: [mvapich-discuss] segmentation fault with GPUDirect
> 
> Hi all,
> we are configuring a small cluster at our university, equipped with SLURM,
> Mellanox IB cards and NVIDIA GPUs.
> The OS is CentOS 7 and we want to use MVAPICH2 with SLURM;
> therefore we installed the specific MVAPICH2-GDR 2.2b rpm.
> 
> We noticed that when we compile and run some of our test programs,
> they all fail with a segmentation fault when trying to use GPUDirect RDMA.
> To debug the problem we ran the osu-micro-benchmarks provided in the rpm
> package and they seem to work smoothly, e.g. running:
> 
> /opt/mvapich2/gdr/2.2/cuda7.5/gnu/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_bibw -d cuda D D
> 
> Despite this, if we recompile the benchmarks on our own, they
> behave the same as all of our codes; e.g. this works smoothly:
> 
> osu_bibw -d cuda H H
> 
> while this fails with a Segmentation Fault:
> 
> osu_bibw -d cuda D D
> 
> To configure the benchmarks we used the following command line:
> ./configure CC=/opt/mvapich2/gdr/2.2/cuda7.5/gnu/bin/mpicc
> CXX=/opt/mvapich2/gdr/2.2/cuda7.5/gnu/bin/mpicxx --enable-cuda
> --with-cuda-libpath=/opt/nvidia/cuda-7.5/lib64
> --with-cuda-include=/opt/nvidia/cuda-7.5/include/
> 
> Do you have any hints about what could be causing this problem?
> Or do you have any hints about how we could debug it?
> 
> As a side question that may help us to understand what we are doing
> wrong: are the options used to configure/compile the rpms available
> somewhere?
> 
> 
> Thanks in Advance and
> Best Regards,
> 
> Enrico
> 
> 
> 
> 




