[mvapich-discuss] RMA data corruption in 2.0, 2.1a
Hajime Fujita
hfujita at uchicago.edu
Mon Oct 6 14:28:29 EDT 2014
Thanks Hari,
Some additional information that might help:
- On the same machine (Midway), OpenMPI 1.8.1 worked correctly.
- On NERSC Edison (Cray XC30 with Cray MPI 7.0), my test program also
worked correctly.
This is why I suspect this could be an issue in MVAPICH.
Thanks,
Hajime
Hari Subramoni wrote:
> Dear Dr. Fujita,
>
> Thank you for the report. We will take a look at this issue and get back
> to you soon.
>
> Regards,
> Hari.
>
> On Fri, Oct 3, 2014 at 5:22 PM, Hajime Fujita <hfujita at uchicago.edu> wrote:
>
> Dear MVAPICH2 team,
>
> We found a potential bug in MVAPICH2 2.0 and 2.1a regarding RMA.
>
> When we run the attached program on two nodes (1 process/node), it
> produces the wrong result. In this configuration, inter-process
> communication goes over InfiniBand.
>
> # Requesting interactive job with 2 nodes
> $ sinteractive -N 2
> $ mpiexec ./rma_putget_test # launches 2 processes
> loc_buff[1024]=-1 != 1024
>
> If it worked correctly, there would be no output.
>
> If we run this on a single machine with multiple processes, it runs
> correctly.
> If I'm using MPI RMA functions incorrectly, please let me know.
>
>
> Hardware platform:
> UChicago RCC Midway
> http://rcc.uchicago.edu/resources/midway_specs.html
>
> MVAPICH versions and configurations:
>
> [hfujita at midway-login1 ~]$ mpichversion
> MVAPICH2 Version: 2.1a
> MVAPICH2 Release date: Sun Sep 21 12:00:00 EDT 2014
> MVAPICH2 Device: ch3:mrail
> MVAPICH2 configure:
> --prefix=/project/aachien/local/mvapich2-2.1a-gcc-4.8 --enable-shared
> MVAPICH2 CC: gcc -DNDEBUG -DNVALGRIND -O2
> MVAPICH2 CXX: g++ -DNDEBUG -DNVALGRIND -O2
> MVAPICH2 F77: gfortran -L/lib -L/lib -O2
> MVAPICH2 FC: gfortran -O2
>
> [hfujita at midway-login1 ~]$ mpichversion
> MVAPICH2 Version: 2.0
> MVAPICH2 Release date: Fri Jun 20 20:00:00 EDT 2014
> MVAPICH2 Device: ch3:mrail
> MVAPICH2 configure: --prefix=/software/mvapich2-2.0-el6-x86_64
> --enable-shared
> MVAPICH2 CC: gcc -DNDEBUG -DNVALGRIND -O2
> MVAPICH2 CXX: g++ -DNDEBUG -DNVALGRIND
> MVAPICH2 F77: gfortran -L/lib -L/lib -O2
> MVAPICH2 FC: gfortran
>
>
> Thank you,
> Hajime
>
> --
> Hajime Fujita
> Postdoctoral Scholar, Large-Scale Systems Group
> Department of Computer Science, The University of Chicago
> http://www.cs.uchicago.edu/people/hfujita
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>