[EXTERNAL] Re: [mvapich-discuss] MV2_USE_CUDA=1 gets ignored?

Christian Trott crtrott at sandia.gov
Thu Feb 7 17:43:28 EST 2013


Nope, same behaviour.

Btw, I currently have an ugly, unsafe workaround by adding
     if (buffer > 0x2000000000 && buffer < 0x3000000000) return 1;
to the is_device_buffer(void* buffer) function.
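
For reference, in isolation the hack looks roughly like this (a sketch only:
the _hack name is mine, the casts are added to make the comparison
well-defined, and the address window is just an empirical guess for where
cudaMalloc hands out device addresses on our nodes):

     #include <stdint.h>   /* for uintptr_t */

     /* Ugly workaround: treat anything in this (site-specific) address
      * window as device memory instead of asking the driver. */
     static int is_device_buffer_hack(void *buffer)
     {
         uintptr_t addr = (uintptr_t) buffer;
         if (addr > 0x2000000000ULL && addr < 0x3000000000ULL)
             return 1;   /* looks like a device address */
         return 0;       /* in the real patch, the original
                            cuPointerGetAttribute-based detection
                            follows here instead */
     }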

Now, while this is certainly not a good fix, it is at least a strong sign 
that nothing else is wrong. Both bandwidth and latency are as expected 
between nodes (18us, 3.7GB/s). Between two cards on the same node the 
latency is a bit disappointing, though (52us). Is that expected? It's a 
dual-socket Sandy Bridge system with two K20Xm per node and FDR 
InfiniBand; the GPUs hang off different sockets.

I also had to add --disable-mcast to the configure line to get it to work.

Thanks
Christian

On 02/06/2013 11:17 AM, Devendar Bureddy wrote:
> By default osu_bw will use only one GPU on the system.  Can you try
> with the get_local_rank script shipped with osu_benchmarks to use 2
> processes with two different GPUs and see if that makes any
> difference?
>
> mpirun -np 2 env MV2_USE_CUDA=1 MV2_DEBUG_SHOW_BACKTRACE=1
> ./get_local_rank ./osu_bw D D
>
> -Devendar
>
> On Wed, Feb 6, 2013 at 12:54 PM, Christian Trott<crtrott at sandia.gov>  wrote:
>> The test code works. I modified it slightly to be able to run 2 processes on
>> two different GPUs and added the same debug output to it as in mvapich; this
>> is what I get:
>>
>> memory type detected correctly
>> Test: 0 0x2700720000 0 0 2 2
>> memory type detected correctly
>> Test: 1 0x2700720000 0 0 2 2
>>
>>
>> And this was what I got for the same line with osu_bw:
>> IsDevicePointer2: 0x2700720000 1 0 0 2
>>
>> The difference is that in the mvapich code, cuPointerGetAttribute returns an
>> error for what is actually the same address!
>>
>> Christian
>>
>>
>>
>> On 02/06/2013 10:26 AM, Devendar Bureddy wrote:
>>> Hi Christian
>>>
>>> Can you please try the attached small test program to see if this
>>> (detecting GPU memory correctly) is the reason for the issue?
>>>
>>> $ mpicc -o test ./test.c
>>>
>>> $ ./test
>>> memory type detected correctly
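>>>
>>> (In case the attachment does not make it to the archive: the test is
>>> essentially a cuPointerGetAttribute memory-type check on a cudaMalloc'ed
>>> buffer. A rough sketch of that kind of check, not the exact attached file
>>> and with made-up variable names, looks like this:)
>>>
>>>     #include <stdio.h>
>>>     #include <stdint.h>
>>>     #include <cuda.h>
>>>     #include <cuda_runtime.h>
>>>     #include <mpi.h>
>>>
>>>     int main(int argc, char **argv)
>>>     {
>>>         MPI_Init(&argc, &argv);
>>>
>>>         /* Allocate a device buffer with the runtime API. */
>>>         void *buf = NULL;
>>>         cudaMalloc(&buf, 1 << 20);
>>>
>>>         /* Ask the driver API what kind of memory the pointer is;
>>>          * this is the attribute query MVAPICH2 relies on to decide
>>>          * whether a buffer lives on the GPU. */
>>>         CUmemorytype mem_type = (CUmemorytype) 0;
>>>         CUresult err = cuPointerGetAttribute(&mem_type,
>>>                            CU_POINTER_ATTRIBUTE_MEMORY_TYPE,
>>>                            (CUdeviceptr)(uintptr_t) buf);
>>>
>>>         if (err == CUDA_SUCCESS && mem_type == CU_MEMORYTYPE_DEVICE)
>>>             printf("memory type detected correctly\n");
>>>         else
>>>             printf("detection failed: err=%d type=%d\n",
>>>                    (int) err, (int) mem_type);
>>>
>>>         cudaFree(buf);
>>>         MPI_Finalize();
>>>         return 0;
>>>     }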
>>>
>>> -Devendar
>>>
>>> On Wed, Feb 6, 2013 at 12:11 PM, Christian Trott<crtrott at sandia.gov>
>>> wrote:
>>>> Hi
>>>>
>>>> You mean you compiled mvapich on the compute node, linking against local
>>>> files? I am already compiling on the compute nodes, but the filesystem is
>>>> NFS, if I am not mistaken.
>>>> Here is one more piece of info:
>>>>
>>>> I added some printouts to the file
>>>> src/mpid/ch3/channels/mrail/src/rdma/ch3_smp_progress.c
>>>> around line 2858:
>>>>
>>>> #if defined(_ENABLE_CUDA_)
>>>>           if (rdma_enable_cuda) {
>>>>               printf("Test\n");
>>>>               iov_isdev = is_device_buffer((void *) iov[i].MPID_IOV_BUF);
>>>>               printf("Test %i %p\n",iov_isdev,(void *)
>>>> iov[i].MPID_IOV_BUF);
>>>>           }
>>>>
>>>> And this is my output:
>>>>
>>>> Test
>>>> Test 0 0x7fefff4b0
>>>>
>>>> # OSU MPI-CUDA Bandwidth Test
>>>> # Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
>>>> # Size        Bandwidth (MB/s)
>>>> Test
>>>> Test 0 0x7feffe950
>>>> Test
>>>> Test 0 0x2d00300000
>>>> ==61548== Invalid read of size 1
>>>> ==61548==    at 0x4A08020: memcpy (mc_replace_strmem.c:628)
>>>> ==61548==    by 0x445462: MPIUI_Memcpy (mpiimpl.h:146)
>>>> ==61548==    by 0x44D5DE: MPIDI_CH3I_SMP_writev (ch3_smp_progress.c:2897)
>>>> ==61548==    by 0x5DAA44: MPIDI_CH3_SMP_iSendv (ch3_isendv.c:108)
>>>> ==61548==    by 0x5DADF9: MPIDI_CH3_iSendv (ch3_isendv.c:187)
>>>> ==61548==    by 0x5D1D7A: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:632)
>>>> ==61548==    by 0x42E22A: MPID_Isend (mpid_isend.c:220)
>>>> ==61548==    by 0x40C33F: PMPI_Isend (isend.c:122)
>>>> ==61548==    by 0x407001: main (osu_bw.c:243)
>>>> ==61548==  Address 0x2d00300000 is not stack'd, malloc'd or (recently)
>>>> free'd
>>>> ==61548==
>>>>
>>>> [k20-0001:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
>>>> (signal 11)
>>>> [k20-0001:mpi_rank_0][print_backtrace]   0: ./out() [0x4b6762]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   1: ./out() [0x4b689e]
>>>>
>>>> [k20-0001:mpi_rank_0][print_backtrace]   2: /lib64/libpthread.so.0()
>>>> [0x38b7a0f4a0]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   3:
>>>>
>>>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so(_vgrZU_libcZdsoZa_memcpy+0x160)
>>>> [0x4a08020]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   4: ./out() [0x445463]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   5: ./out() [0x44d5df]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   6: ./out() [0x5daa45]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   7: ./out() [0x5dadfa]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   8: ./out() [0x5d1d7b]
>>>> [k20-0001:mpi_rank_0][print_backtrace]   9: ./out() [0x42e22b]
>>>> [k20-0001:mpi_rank_0][print_backtrace]  10: ./out() [0x40c340]
>>>> [k20-0001:mpi_rank_0][print_backtrace]  11: ./out() [0x407002]
>>>>
>>>> [k20-0001:mpi_rank_0][print_backtrace]  12:
>>>> /lib64/libc.so.6(__libc_start_main+0xfd) [0x38b6e1ecdd]
>>>> [k20-0001:mpi_rank_0][print_backtrace]  13: ./out() [0x406829]
>>>>
>>>> My guess is that the address 0x2d00300000 should be on the GPU, so the
>>>> is_device_buffer test seems to fail. Maybe that is connected to the rather
>>>> interesting configuration of our machine: we have 128GB of RAM per node,
>>>> of which apparently 64GB are configured as a RAMDISK for /tmp.
>>>>
>>>> Cheers
>>>> Christian
>>>>
>>>>
On 02/06/2013 09:58 AM, Joshua Anderson wrote:
>>>>> Hi Christian,
>>>>>
>>>>> I'm not sure if this is related but I get similar behavior on our
>>>>> cluster
>>>>> when I link mvapich to the libcuda.so the admins provide on an NFS
>>>>> share.
>>>>> They do this because the head nodes don't have GPUs and thus don't have
>>>>> libcuda.so. When I instead compile on the compute node and link against
>>>>> the
>>>>> libcuda.so on the local file system, the problem goes away. This is very
>>>>> strange because the two files are identical.
>>>>>
>>>>> - Josh
>>>>>
>>>>> On Feb 6, 2013, at 11:44 AM, Christian Trott wrote:
>>>>>
>>>>>> Hi
>>>>>>
>>>>>> I am trying to use GPU-to-GPU MPI communication on a new cluster of
>>>>>> ours,
>>>>>> and it always fails with segfaults. The funny thing is I get the same
>>>>>> valgrind output whether I use MV2_USE_CUDA=1 or not (output comes
>>>>>> further
>>>>>> down). I downloaded the most recent 1.9a2 version and this is my
>>>>>> current
>>>>>> config line:
>>>>>>
>>>>>> ./configure --enable-cuda --with-cuda=/home/crtrott/lib/cuda-5.0/
>>>>>> --prefix=/home/crtrott/mpi/mvapich2-1.9/gcc/cuda50a --disable-rdmacm
>>>>>> --disable-mcast --enable-g=dbg --disable-fast
>>>>>>
>>>>>> This is my run command:
>>>>>>
>>>>>> mpirun -np 2 env MV2_USE_CUDA=1 MV2_DEBUG_SHOW_BACKTRACE=1 valgrind
>>>>>> ./osu_bw D D
>>>>>>
>>>>>> And this is the relevant valgrind output:
>>>>>>
>>>>>> ==58800== Warning: set address range perms: large range [0x3d00000000,
>>>>>> 0x5e00000000) (noaccess)
>>>>>> ==58801== Warning: set address range perms: large range [0x3d00000000,
>>>>>> 0x5e00000000) (noaccess)
>>>>>> ==58800== Warning: set address range perms: large range [0x2d00000000,
>>>>>> 0x3100000000) (noaccess)
>>>>>> ==58801== Warning: set address range perms: large range [0x2d00000000,
>>>>>> 0x3100000000) (noaccess)
>>>>>> # OSU MPI-CUDA Bandwidth Test
>>>>>> # Send Buffer on DEVICE (D) and Receive Buffer on DEVICE (D)
>>>>>> # Size        Bandwidth (MB/s)
>>>>>> ==58800== Invalid read of size 1
>>>>>> ==58800==    at 0x4A08020: memcpy (mc_replace_strmem.c:628)
>>>>>> ==58800==    by 0x4452D6: MPIUI_Memcpy (mpiimpl.h:146)
>>>>>> ==58800==    by 0x44D41E: MPIDI_CH3I_SMP_writev
>>>>>> (ch3_smp_progress.c:2895)
>>>>>> ==58800==    by 0x5DA884: MPIDI_CH3_SMP_iSendv (ch3_isendv.c:108)
>>>>>> ==58800==    by 0x5DAC39: MPIDI_CH3_iSendv (ch3_isendv.c:187)
>>>>>> ==58800==    by 0x5D1BBA: MPIDI_CH3_EagerContigIsend (ch3u_eager.c:632)
>>>>>> ==58800==    by 0x42E09E: MPID_Isend (mpid_isend.c:220)
>>>>>> ==58800==    by 0x40C1B3: PMPI_Isend (isend.c:122)
>>>>>> ==58800==    by 0x406E85: main (osu_bw.c:242)
>>>>>> ==58800==  Address 0x2d00200000 is not stack'd, malloc'd or (recently)
>>>>>> free'd
>>>>>> ==58800==
>>>>>> [k20-0001:mpi_rank_0][error_sighandler] Caught error: Segmentation
>>>>>> fault
>>>>>> (signal 11)
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   0: ./osu_bw() [0x4b65a2]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   1: ./osu_bw() [0x4b66de]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   2: /lib64/libpthread.so.0()
>>>>>> [0x38b7a0f4a0]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   3:
>>>>>>
>>>>>> /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so(_vgrZU_libcZdsoZa_memcpy+0x160)
>>>>>> [0x4a08020]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   4: ./osu_bw() [0x4452d7]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   5: ./osu_bw() [0x44d41f]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   6: ./osu_bw() [0x5da885]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   7: ./osu_bw() [0x5dac3a]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   8: ./osu_bw() [0x5d1bbb]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]   9: ./osu_bw() [0x42e09f]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]  10: ./osu_bw() [0x40c1b4]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]  11: ./osu_bw() [0x406e86]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]  12:
>>>>>> /lib64/libc.so.6(__libc_start_main+0xfd) [0x38b6e1ecdd]
>>>>>> [k20-0001:mpi_rank_0][print_backtrace]  13: ./osu_bw() [0x4066a9]
>>>>>>
>>>>>> Any suggestions would be greatly appreciated.
>>>>>>
>>>>>> Christian
>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> mvapich-discuss mailing list
>>>>>> mvapich-discuss at cse.ohio-state.edu
>>>>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss



