[mvapich-discuss] MPI_Cancel bug?

Devendar Bureddy bureddy at cse.ohio-state.edu
Mon Oct 29 12:01:50 EDT 2012


Hi Andrew

Thanks for reporting the issue. The test program looks fine. We will
take a look at the issue and get back to you soon.

-Devendar

On Mon, Oct 29, 2012 at 11:36 AM, Andrew Friedley <friedley3 at llnl.gov> wrote:

> Hi,
>
> I've attached a small program that posts an Irecv, then cancels it.  It
> works correctly for MPICH2 v1.4.1p1 and v1.5 and Open MPI v1.6.2, but
> crashes on both MVAPICH2 1.8 and 1.8.1.  On Open MPI I ran over 150 million
> iterations before killing it; MVAPICH2 crashes consistently on iteration
> 261896.
>
> The output, when run under valgrind, is shown below.  I guess this is a
> bug?  Am I canceling (and testing for cancellation) properly?  Any ideas
> for a workaround?
>
> Thanks,
>
> Andrew
>
>
>
> i 261894
> i 261895
> i 261896
> ==33586== Invalid write of size 4
> ==33586==    at 0x513EF43: MPID_Irecv (in /g/g19/friedley/local/mvapich2-1.8-gcc-cab/lib/libmpich.so.3.3)
> ==33586==    by 0x513B0B6: PMPI_Irecv (in /g/g19/friedley/local/mvapich2-1.8-gcc-cab/lib/libmpich.so.3.3)
> ==33586==    by 0x400A3F: main (in /g/g19/friedley/svn/afriedle/hmpi2/foo)
> ==33586==  Address 0x4 is not stack'd, malloc'd or (recently) free'd
> ==33586==
> [cab26:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
> [cab26:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
> [cab26:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
>
> (repeat last line MANY times, then this..)
>
> ==33586== Stack overflow in thread 1: can't grow stack to 0x7fe001ec0
> ==33586==
> ==33586== Process terminating with default action of signal 11 (SIGSEGV): dumping core
> ==33586==  Access not within mapped region at address 0x7FE001EC0
> ==33586==    at 0x5E96E7B: buffered_vfprintf (vfprintf.c:2255)
> ==33586==  If you believe this happened as a result of a stack
> ==33586==  overflow in your program's main thread (unlikely but
> ==33586==  possible), you can try to increase the size of the
> ==33586==  main thread stack using the --main-stacksize= flag.
> ==33586==  The main thread stack size used in this run was 16777216.
> ==33586== Stack overflow in thread 1: can't grow stack to 0x7fe001eb8
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>


-- 
Devendar