[mvapich-discuss] Some problem with osu_bw when I want to use GPU device buffer

khaled hamidouche hamidouc at cse.ohio-state.edu
Tue Sep 23 06:25:56 EDT 2014


Hi Jacob,

Can you please add MV2_USE_CUDA=1 to your command line and retry?
For more details on how to run and tune your application using MV2-GDR,
please refer to the README:

http://mvapich.cse.ohio-state.edu/static/media/mvapich/MV2-GDR-README.txt
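
As a rough sketch of what the modified command could look like: the hosts, benchmark path and D H arguments below are copied from the report quoted further down, and the placement of MV2_USE_CUDA=1 assumes the usual mpirun_rsh convention of passing NAME=VALUE pairs after the host list and before the executable. The HOSTS and CMD variable names are only illustrative.

```shell
# Hosts taken from the original report; adjust to your cluster.
HOSTS="linux-dell RCA61"

# Assumption: mpirun_rsh accepts environment variables as NAME=VALUE
# pairs placed after the host list and before the executable.
CMD="mpirun_rsh -np 2 $HOSTS MV2_USE_CUDA=1 ./osu_bw -d cuda D H"

# Print the command so it can be checked before launching.
echo "$CMD"

# To launch it on a node where MVAPICH2-GDR is installed, uncomment:
# $CMD
```

If the segmentation fault persists with the variable set, the README above lists the remaining GDR-related runtime parameters to check.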

Please let us know if this works for you or if you face any issues.

Thanks

On Tue, Sep 23, 2014 at 5:07 AM, Zhuangliang <zhuangliang at huawei.com> wrote:

>  To whom it may concern,
>
>
>
> I’m trying to use the “mvapich2-gdr” library.
>
>
>
> When I run the osu_bw example with “Send Buffer on HOST (H) and Receive
> Buffer on HOST (H)”, everything is OK.
>
> (Command line :  mpirun_rsh -np 2 linux-dell RCA61 ./osu_bw -d 'cuda' H H)
>
> linux-dell and RCA61 are the hosts.
>
>
>
> But if I try to allocate one of the send/recv buffers on the GPU, errors
> occur.
>
> (e.g. Command line :  mpirun_rsh -np 2 linux-dell RCA61 ./osu_bw -d 'cuda'
> D H)
>
>
>
> The error output is as follows:
>
> # OSU MPI-CUDA Bandwidth Test
>
> # Send Buffer on DEVICE (D) and Receive Buffer on HOST (H)
>
> # Size        Bandwidth (MB/s)
>
> [linux-dell:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
> (signal 11)
>
> [linux-dell:mpispawn_0][readline] Unexpected End-Of-File on file
> descriptor 6. MPI process died?
>
> [linux-dell:mpispawn_0][mtpmi_processops] Error while reading PMI socket.
> MPI process died?
>
> [linux-dell:mpispawn_0][child_handler] MPI process (rank: 0, pid: 14635)
> terminated with signal 11 -> abort job
>
> [linux-dell:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node
> RCA61 aborted: Error while reading a PMI socket (4)
>
> [RCA61:mpispawn_1][read_size] Unexpected End-Of-File on file descriptor 6.
> MPI process died?
>
> [RCA61:mpispawn_1][read_size] Unexpected End-Of-File on file descriptor 6.
> MPI process died?
>
> [RCA61:mpispawn_1][handle_mt_peer] Error while reading PMI socket. MPI
> process died?
>
> [RCA61:mpispawn_1][report_error] connect() failed: Connection refused (111)
>
>
>
> I would appreciate any support you can give.
>
>
>
> Thank you very much!
>
>
>
> Jacob
>
>
>