[mvapich-discuss] application failing with process_vm_readv fail [was: Re: osu_mbw_mr does not recognize option -d]

Marius Brehler marius.brehler at tu-dortmund.de
Wed Jun 27 08:33:32 EDT 2018


Hi Hari,

On 06/27/2018 01:43 PM, Subramoni, Hari wrote:
> Hi, Marius.
>
> I apologize for the error. We will correct that ASAP. I am sure that osu_mbw_mr does not support device to device transfers. Could you please try osu_bw, osu_bibw or osu_latency instead?

no problem. The intention was just to let you know :)
osu_bw and osu_latency seem to work fine and also recognize '-d cuda'.


> Could you please let me know what sort of errors you're facing with running your application with our stacks? Were you using MVAPICH2 or MVAPICH2-GDR? Did you use the LD_PRELOAD option?

We have a GPU-accelerated implementation of the Runge-Kutta in the
Interaction Picture (RK4IP) algorithm, which now additionally uses MPI
to support multiple GPUs. If you are interested in more details about
the original implementation, you might take a look at [1].
As a first step, I would like to test GPU-to-GPU communication with
MVAPICH2-GDR 2.3a on a single node (equipped with two GPUs).
Only in a second step will a second machine be involved (both nodes
are connected with ConnectX-4 HCAs). Currently, the application fails
even on the single node.

Our prior implementation uses MPI_Bcast and fails with:

[cli_1]: aborting job:
Fatal error in PMPI_Bcast:
Other MPI error, error stack:
PMPI_Bcast(1635)......................: MPI_Bcast(buf=0xb04760000,
count=8192, MPI_DOUBLE, root=0, MPI_COMM_WORLD) failed
MPIR_Bcast_impl(1471).................:
MPIR_Bcast_MV2(3928)..................:
MPIR_Bcast_index_tuned_intra_MV2(3573):
MPIR_Bcast_binomial_MV2(163)..........:
MPIC_Recv(439)........................:
MPIC_Wait(323)........................:
MPIDI_CH3I_Progress(215)..............:
MPIDI_CH3I_SMP_read_progress(1247)....:
MPIDI_CH3I_SMP_readv_rndv(4882).......: CMA: (MPIDI_CH3I_SMP_readv_rndv)
process_vm_readv fail
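
For reference, the pattern behind the failing MPI_Bcast boils down to
the following minimal sketch (not our actual code; the buffer name is a
placeholder, the count matches the failing call in the trace above):

#include <mpi.h>
#include <cuda_runtime.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* device buffer; with MV2_USE_CUDA=1, MVAPICH2-GDR should accept
       the device pointer directly */
    double *d_buf;
    cudaMalloc((void **)&d_buf, 8192 * sizeof(double));

    if (rank == 0) {
        /* fill d_buf on the root, e.g. via a kernel or cudaMemcpy
           (omitted here) */
    }

    /* broadcast the device buffer from rank 0 to all ranks */
    MPI_Bcast(d_buf, 8192, MPI_DOUBLE, 0, MPI_COMM_WORLD);

    cudaFree(d_buf);
    MPI_Finalize();
    return 0;
}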



Our new implementation uses MPI_I{send,recv} and fails with:

[cli_0]: aborting job:
Fatal error in PMPI_Test:
Other MPI error, error stack:
PMPI_Test(168)....................: MPI_Test(request=0x7f4468d3d310,
flag=0x7f4468d3d330, status=0x1) failed
MPIDI_CH3I_Progress_test(503).....:
MPIDI_CH3I_SMP_read_progress(1247):
MPIDI_CH3I_SMP_readv_rndv(4882)...: CMA: (MPIDI_CH3I_SMP_readv_rndv)
process_vm_readv fail

[cli_1]: aborting job:
Fatal error in PMPI_Test:
Other MPI error, error stack:
PMPI_Test(168)....................: MPI_Test(request=0x7fdb60138d90,
flag=0x7fdb60138db0, status=0x1) failed
MPIDI_CH3I_Progress_test(503).....:
MPIDI_CH3I_SMP_read_progress(1247):
MPIDI_CH3I_SMP_readv_rndv(4882)...: CMA: (MPIDI_CH3I_SMP_readv_rndv)
process_vm_readv fail
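
The non-blocking variant is essentially the following sketch
(simplified; the exchange size, peer rank and buffer names are
placeholders, and the real code overlaps the transfer with kernel
launches):

#include <mpi.h>
#include <cuda_runtime.h>

/* one exchange step between this rank and 'peer'; d_send and d_recv
   are device buffers allocated with cudaMalloc elsewhere */
static void exchange(double *d_send, double *d_recv, int n, int peer)
{
    MPI_Request req[2];
    int flag = 0;

    MPI_Irecv(d_recv, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req[0]);
    MPI_Isend(d_send, n, MPI_DOUBLE, peer, 0, MPI_COMM_WORLD, &req[1]);

    /* ... launch kernels that do not touch d_send/d_recv ... */

    do {
        /* this is the MPI_Test that aborts in the traces above */
        MPI_Test(&req[0], &flag, MPI_STATUS_IGNORE);
    } while (!flag);

    MPI_Wait(&req[1], MPI_STATUS_IGNORE);
}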


I use environment modules on the CentOS machine; the environment
variables are set as follows:

[brehler at fermi tests]$ env|grep MV2
MV2_PATH=/opt/mvapich2/gdr/2.3a
MV2_USE_GPUDIRECT_GDRCOPY=1
MV2_USE_CUDA=1
MV2_GPUDIRECT_GDRCOPY_LIB=/opt/gdrcopy/lib64/libgdrapi.so
MV2_USE_GPUDIRECT=1

> http://mvapich.cse.ohio-state.edu/userguide/gdr/2.3a/#_example_use_of_ld_preload

Setting LD_PRELOAD fails with

[proxy:0:0 at fermi.hft.e-technik.tu-dortmund.de] HYDU_create_process
(utils/launch/launch.c:75): execvp error on file
LD_PRELOAD=/opt/mvapich2/gdr/2.3a/lib64/libmpi.so.12.0.5 (No such file
or directory)

Actually, the lib is present at the location; the execvp error rather
suggests that the Hydra proxy treats the 'LD_PRELOAD=...' string itself
as the executable to launch.

[brehler at fermi ~]$ ls -la /opt/mvapich2/gdr/2.3a/lib64/libmpi.so.12.0.5
-rwxr-xr-x. 1 root root 6592880 Jun  5 20:24
/opt/mvapich2/gdr/2.3a/lib64/libmpi.so.12.0.5


Our issue might not be directly related to MVAPICH2-GDR and could be a
configuration issue; therefore, I would appreciate any hints.
Best Regards

Marius



[1]
https://www.nvidia.com/content/dam/en-zz/Solutions/gtc-europe/posters/high-performance-computing/gtc-eu-europe-research-posters-11.jpg

> Best Regards,
> Hari.
>
> -----Original Message-----
> From: mvapich-discuss-bounces at cse.ohio-state.edu On Behalf Of Marius Brehler
> Sent: Wednesday, June 27, 2018 1:15 PM
> To: Subramoni, Hari <subramoni.1 at osu.edu>
> Cc: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
> Subject: Re: [mvapich-discuss] osu_mbw_mr does not recognize option -d
>
> Hello Hari,
>
> actually I chose this benchmark rather randomly to verify our system setup, since our application is failing miserably whereas Open MPI works.
> Both processes were executed on the same node; I just picked osu_mbw_mr since the example is given in the docs:
>
> http://mvapich.cse.ohio-state.edu/userguide/gdr/2.3a/#_examples_using_osu_micro_benchmarks_with_multi_rail_support
>
> In this example the '-d cuda' option is passed. osu_mbw_mr reports
> 'pairs: 1', and nvidia-smi shows some load on both GPUs, i.e. a
> process running on each of them.
>
> Regards
>
> Marius
>
> On 06/27/2018 12:51 PM, Subramoni, Hari wrote:
>> Hello, Marius.
>>
>> The multi-pair bandwidth/message rate benchmark does not support
>> transfer from/to device buffers. This is a known limitation. We have
>> plans to add this support in the future.
>>
>> Could you please take a look at the numbers reported with
>> '--accelerator cuda' and compare them with the numbers seen for host
>> to host? It's likely that it is just running host to host and not
>> throwing the error message correctly.
>>
>> Regards,
>> Hari.
>>
>> Sent from my phone
>>
>>
>> On Jun 27, 2018 12:34 PM, Marius Brehler
>> <marius.brehler at tu-dortmund.de>
>> wrote:
>>
>>     Hi,
>>
>>     testing mvapich2-gdr on our machine, I noticed that passing '-d cuda' to
>>     the osu_mbw_mr benchmark (shipped with the RPM) fails:
>>
>>
>>     [brehler at fermi tests]$ mpirun -np 2
>>     /opt/mvapich2/gdr/2.3a/libexec/osu-micro-benchmarks/get_local_rank
>>     /opt/mvapich2/gdr/2.3a/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr
>>     -d cuda
>>     /opt/mvapich2/gdr/2.3a/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr:
>>     invalid option -- 'd'
>>     Invalid Option [-d]
>>
>>     /opt/mvapich2/gdr/2.3a/libexec/osu-micro-benchmarks/mpi/pt2pt/osu_mbw_mr:
>>     invalid option -- 'd'
>>     Usage: (null) [options]
>>
>>
>>     Passing the option '--accelerator cuda' works as expected.
>>     Regards
>>
>>     Marius
>>     [..]


