[mvapich-discuss] MPI_Win_flush_all() crashes on shared memory
Hajime Fujita
hfujita at uchicago.edu
Fri Nov 14 13:35:03 EST 2014
Hello,
I have found a crash issue in MVAPICH2-2.1a.
When I launch the attached program (even with one process), it crashes
with SIGSEGV. However, if I set MV2_USE_SHARED_MEM=0, it runs correctly.
I think I encountered a similar problem at the beginning of April this
year. At that time, Mingzhe suggested MV2_USE_SHARED_MEM=0 as a
workaround. However, I'm curious whether there is a way to fix this
completely, now that I have a simpler reproducer.
The MVAPICH version is MVAPICH2-2.1a with a patch from Mingzhe applied.
I don't know how to identify the patch, but I believe it corresponds to
this item in 2.0.1:
> - Add check for pending operations in one-sided channel in flush_all
Hardware platform:
UChicago RCC Midway
http://rcc.uchicago.edu/resources/midway_specs.html
The following is the log taken on the Midway system. I had a one-node,
one-process allocation, so just typing "mpiexec" implies "mpiexec -n 1".
----
[hfujita at midway070 rma_winflush_test]$ mpiexec ./rma_winflush_test
Process 0: going to issue Win_flush_all()
[midway070:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 9363 RUNNING AT midway070
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
[hfujita at midway070 rma_winflush_test]$ MV2_USE_SHARED_MEM=0 mpiexec ./rma_winflush_test
Process 0: going to issue Win_flush_all()
Process 0: after Win_flush_all()
[hfujita at midway070 rma_winflush_test]$ mpichversion
MVAPICH2 Version: 2.1a
MVAPICH2 Release date: Sun Sep 21 12:00:00 EDT 2014
MVAPICH2 Device: ch3:mrail
MVAPICH2 configure: --prefix=/project/aachien/local/mvapich2-2.1a-gcc-4.8-rma-patch --enable-shared --no-create --no-recursion
MVAPICH2 CC: gcc -DNDEBUG -DNVALGRIND -O2
MVAPICH2 CXX: g++ -DNDEBUG -DNVALGRIND -O2
MVAPICH2 F77: gfortran -L/lib -L/lib -O2
MVAPICH2 FC: gfortran -O2
----
Thank you,
Hajime
--
Hajime Fujita
Postdoctoral Scholar, Large-Scale Systems Group
Department of Computer Science, The University of Chicago
http://www.cs.uchicago.edu/people/hfujita
-------------- next part --------------
A non-text attachment was scrubbed...
Name: rma_winflush_test.c
Type: text/x-csrc
Size: 1133 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20141114/c7514b55/attachment.bin>