[mvapich-discuss] MPI_Win_flush_all() crashes on shared memory

Hajime Fujita hfujita at uchicago.edu
Mon Nov 17 12:54:30 EST 2014


Hi Mingzhe,

Thanks! The patch worked for 2.1a.


Best,
Hajime

Mingzhe Li wrote:
> Hi Hajime,
>
> Thanks for your note. We saw a similar issue in our internal testing
> framework and already have a fix for it. The fix has been applied to
> the 2.0.1 release and will also be available in the 2.1b release
> within the next few days. If you want to keep using 2.1a, could you
> please apply the attached patch to your local 2.1a codebase?
>
> $ cd /path/to/2.1a
> $ patch -p1 < /path/to/flush-all
>
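
A note for readers applying the patch: a dry run first shows whether it applies cleanly before any file is touched. The helper below is only a sketch. The function name apply_patch_checked is made up for illustration, the paths are placeholders, and it assumes a patch(1) that supports --dry-run (GNU patch does).

```shell
# Hypothetical helper (not part of MVAPICH2): verify that a patch applies
# cleanly before modifying the source tree.
# Usage: apply_patch_checked <source-dir> <absolute-path-to-patch-file>
# The patch path should be absolute because the helper changes directory.
apply_patch_checked() {
    src_dir=$1
    patch_file=$2

    # Dry run: reports failures without changing any file.
    if ! (cd "$src_dir" && patch -p1 --dry-run < "$patch_file") >/dev/null; then
        echo "patch does not apply cleanly to $src_dir" >&2
        return 1
    fi

    # Real run: same options, this time modifying the tree.
    (cd "$src_dir" && patch -p1 < "$patch_file")
}
```

With the paths from the message above, this would be called as: apply_patch_checked /path/to/2.1a /path/to/flush-all
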
> Thanks,
> Mingzhe
>
> On Fri, Nov 14, 2014 at 1:35 PM, Hajime Fujita <hfujita at uchicago.edu> wrote:
>
>     Hello,
>
>     I have found a crash issue in MVAPICH2-2.1a.
>
>     When I launch the attached program (even with one process), it
>     crashes with SIGSEGV. However, if I specify MV2_USE_SHARED_MEM=0, it
>     runs correctly.
>
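
A note for readers trying to confirm the same symptom: the comparison described above (default settings versus MV2_USE_SHARED_MEM=0) can be scripted. This is a minimal sketch; the function name compare_shared_mem is hypothetical, and the exit status that mpiexec itself returns after a crash is not something this thread establishes.

```shell
# Hypothetical helper: run a command twice, once with the inherited
# environment and once with MV2_USE_SHARED_MEM=0, then print both exit codes.
# Usage: compare_shared_mem <command> [args...]
compare_shared_mem() {
    default_status=0
    noshm_status=0

    # First run: environment as inherited from the caller.
    "$@" >/dev/null 2>&1 || default_status=$?

    # Second run: shared-memory channel disabled via the environment.
    env MV2_USE_SHARED_MEM=0 "$@" >/dev/null 2>&1 || noshm_status=$?

    echo "default: $default_status, MV2_USE_SHARED_MEM=0: $noshm_status"
}
```

For the reproducer in this thread the call would be: compare_shared_mem mpiexec ./rma_winflush_test
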
>     I think I encountered a similar problem at the beginning of April
>     this year. At that time, Mingzhe suggested MV2_USE_SHARED_MEM=0 as
>     a workaround. However, I'm curious whether there is a way to fix
>     this completely, as I now have a simpler reproducer.
>
>     The MVAPICH version is MVAPICH2-2.1a, which also includes a patch
>     provided by Mingzhe. I don't know how to identify the patch, but I
>     believe it corresponds to this item in 2.0.1:
>     > - Add check for pending operations in one-sided channel in flush_all
>
>     Hardware platform:
>        UChicago RCC Midway
>     http://rcc.uchicago.edu/resources/midway_specs.html
>
>
>     The following is the log taken on the Midway system. I had one node,
>     one process allocation, so just typing "mpiexec" implies "mpiexec -n 1".
>     ----
>
>     [hfujita at midway070 rma_winflush_test]$ mpiexec
>     ./rma_winflush_test
>     Process 0: going to issue Win_flush_all()
>     [midway070:mpi_rank_0][error_sighandler] Caught error:
>     Segmentation fault (signal 11)
>
>     ===================================================================================
>     =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>     =   PID 9363 RUNNING AT midway070
>     =   EXIT CODE: 11
>     =   CLEANING UP REMAINING PROCESSES
>     =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>     ===================================================================================
>     YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
>     (signal 11)
>     This typically refers to a problem with your application.
>     Please see the FAQ page for debugging suggestions
>     [hfujita at midway070 rma_winflush_test]$ MV2_USE_SHARED_MEM=0 mpiexec
>     ./rma_winflush_test
>     Process 0: going to issue Win_flush_all()
>     Process 0: after Win_flush_all()
>
>
>
>     [hfujita at midway070 rma_winflush_test]$ mpichversion
>     MVAPICH2 Version:       2.1a
>     MVAPICH2 Release date:  Sun Sep 21 12:00:00 EDT 2014
>     MVAPICH2 Device:        ch3:mrail
>     MVAPICH2 configure:
>     --prefix=/project/aachien/local/mvapich2-2.1a-gcc-4.8-rma-patch
>     --enable-shared --no-create --no-recursion
>     MVAPICH2 CC:    gcc    -DNDEBUG -DNVALGRIND -O2
>     MVAPICH2 CXX:   g++   -DNDEBUG -DNVALGRIND -O2
>     MVAPICH2 F77:   gfortran -L/lib -L/lib   -O2
>     MVAPICH2 FC:    gfortran   -O2
>     ----
>
>
>     Thank you,
>     Hajime
>
>     --
>     Hajime Fujita
>     Postdoctoral Scholar, Large-Scale Systems Group
>     Department of Computer Science, The University of Chicago
>     http://www.cs.uchicago.edu/people/hfujita
>


