[mvapich-discuss] SIGSEV in F90: An MPI bug?

Jeff Squyres jsquyres at cisco.com
Thu Jan 31 14:53:48 EST 2008


On Jan 31, 2008, at 1:32 PM, David Stuebe wrote:

> Maybe I don't fully understand all the issues involved, but I did
> read through several web sites that discuss the dangers of passing
> temporary arrays to non-blocking MPI calls. Is MPI_BCAST
> non-blocking? I assumed it was a blocking call anyway.

Yes, MPI_BCAST is a blocking call; my bad for not noticing that that
was what you were asking about.  :-)

> Again, my concern is that the MPI call returns the data on all
> processors as (perhaps naively) expected; it is later in the program
> that the allocation made on entry to a different subroutine for an
> explicit-shape array causes a sigsegv. There is further evidence that
> it is an MPI issue: the problem is memory-size dependent, only occurs
> when run on more than one node, and only under MVAPICH2 -- it did not
> show up with MPICH2 when I tested on our cluster, which does not have
> InfiniBand.

Looking at your example code, I don't know the F90 syntax well enough
to fully appreciate what's going on.  It looks like you're passing an
array subsection to MPI_BCAST, and when that subsection is a
non-contiguous buffer, problems *may* occur later.  Is that what
you're trying to say?
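
In case it helps to make that concrete, here's a minimal sketch of the
distinction as I understand it (the shapes mirror your test code below;
the program itself is purely illustrative):

program section_sketch
  implicit none
  integer, allocatable, target :: data(:,:)
  integer, pointer             :: p(:,:)

  allocate(data(210,200))    ! same shape as the failing case in your code
  data = 0

  p => data(:,1:100)         ! whole columns of data: contiguous in memory
  p => data(1:200,1:200)     ! only 200 of the 210 rows: NOT contiguous; when
                             ! p is passed through the F77-style MPI_BCAST
                             ! interface, the compiler is allowed to build a
                             ! contiguous copy-in/copy-out temporary
end program section_sketch

If that copy-in/copy-out is what ifort generates at your call site, then
MPI only ever sees the temporary, never your actual array.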

My *guesses/speculation* are:

- perhaps there's an issue with compiler-provided temporary buffers  
that are registered by MVAPICH and then later freed by the compiler,  
but somehow evade being unregistered by MVAPICH (this could lead to  
heap corruption that manifests later)

- I don't know whether Fortran compilers are allowed to move buffers
at will, such as in garbage collection and/or memory-compacting
schemes (do you?) -- this could lead to a problem similar to the one
I describe above

Again, these are pure speculation.  I really don't know how F90  
compilers work, and I don't know what MVAPICH does with registered  
memory caching and/or progress threads, so further speculation is  
fairly pointless.  :-)
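
One way to replace a bit of that speculation with data: check whether
ifort actually hands MPI_BCAST a temporary at your call site.  A rough
sketch -- note that LOC is a non-standard extension (ifort supports
it), and the subroutine here is mine, purely illustrative:

! External subroutine, deliberately given NO explicit interface so that
! sequence association applies, just like the F77-style MPI bindings.
subroutine where_is_it(buf, addr_of_actual)
  implicit none
  integer                     :: buf(*)
  integer(kind=8), intent(in) :: addr_of_actual

  if (loc(buf) /= addr_of_actual) then
     write(*,*) "compiler passed a copy-in/copy-out temporary"
  else
     write(*,*) "compiler passed the original storage"
  end if
end subroutine where_is_it

Calling it as

  call where_is_it(ptest, loc(ptest(1,1)))

right before the MPI_BCAST should tell you whether a temporary is being
created for the non-contiguous pointer.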

MVAPICH developers: can you comment on this?

And just to be sure -- you compiled MVAPICH with the same compilers  
that you're using, with the same levels of optimization, etc., right?

> On Jan 31, 2008 1:06 PM, Jeff Squyres <jsquyres at cisco.com> wrote:
> Brian is completely correct - if the F90 compiler chooses to make
> temporary buffers in order to pass array subsections to non-blocking
> MPI functions, there's little that an MPI implementation can do.
> Simply put: MPI requires that when you use non-blocking
> communications, the buffer must be available until you call some
> flavor of MPI_TEST or MPI_WAIT to complete the communication.
>
> I don't know of any way for an MPI implementation to know whether it
> has been handed a temporary buffer (e.g., one that a compiler silently
> created to pass an array subsection).  Do you know if there is a way?
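
Just to make the hazard in that last paragraph concrete, here is a
minimal sketch -- hypothetical code, not from your test program:

subroutine isend_hazard(peer)
  use mpi
  implicit none
  integer, intent(in) :: peer
  integer :: req, ierr, stat(MPI_STATUS_SIZE)
  integer, target  :: a(210,200)
  integer, pointer :: sec(:,:)

  a = 1
  sec => a(1:200,1:200)     ! non-contiguous section of a
  ! If the compiler builds a contiguous temporary to pass here, that
  ! temporary may be freed as soon as MPI_ISEND returns...
  call MPI_ISEND(sec, 200*200, MPI_INTEGER, peer, 0, &
                 MPI_COMM_WORLD, req, ierr)
  ! ...but the send is not complete until MPI_WAIT, so the library may
  ! still be reading from memory that no longer exists in between.
  call MPI_WAIT(req, stat, ierr)
end subroutine isend_hazard

With a blocking call like MPI_BCAST the temporary at least lives for the
duration of the call, which is why this particular failure mode should
not apply to your example.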
>
>
>
> On Jan 31, 2008, at 12:36 PM, Brian Curtis wrote:
>
> > David,
> >
> > The MPI-2 documentation goes into great detail on issues with the
> > Fortran-90 bindings (http://www.mpi-forum.org/docs/mpi-20-html/node236.htm#Node236).
> > The issue you are seeing should be directed to Intel.
> >
> >
> > Brian
> >
> >
> > On Jan 31, 2008, at 11:59 AM, David Stuebe wrote:
> >
> >>
> >> Hi again Brian
> >>
> >> I just ran my test code on our cluster using ifort 10.1.011 and
> >> MVAPICH 1.0.1, but the behavior is still the same.
> >>
> >> Have you had a chance to try it on any of your test machines?
> >>
> >> David
> >>
> >>
> >>
> >>
> >> On Jan 25, 2008 12:31 PM, Brian Curtis <curtisbr at cse.ohio-
> >> state.edu> wrote:
> >> David,
> >>
> >> I did some research on this issue and it looks like you have posted
> >> the
> >> bug with Intel.  Please let us know what you find out.
> >>
> >>
> >> Brian
> >>
> >> David Stuebe wrote:
> >> > Hi Brian
> >> >
> >> > I downloaded the public release. It seems silly, but I am not
> >> > sure how to get a rev number from the source... there does not
> >> > seem to be a '-version' option that gives more info, although I
> >> > did not look too hard.
> >> >
> >> > I have not tried MVAPICH 1.0.1, but once I have Intel ifort 10
> >> > on the cluster I will try 1.0.1 and see if the problem goes away.
> >> >
> >> > In the meantime, please let me know if you can recreate the
> >> > problem.
> >> >
> >> > David
> >> >
> >> > PS - Just want to make sure you understand my issue: I think it
> >> > is a bad idea to try to pass a non-contiguous F90 memory pointer,
> >> > and I should not do that... but the way that it breaks has caused
> >> > me headaches for weeks now. If it reliably caused a sigsegv on
> >> > entering MPI_BCAST, that would be great! As it is, it is really
> >> > hard to trace the problem.
> >> >
> >> >
> >> >
> >> >
> >> > On Jan 23, 2008 3:23 PM, Brian Curtis <curtisbr at cse.ohio-
> >> state.edu> wrote:
> >> >
> >> >
> >> >> David,
> >> >>
> >> >> Sorry to hear you are experiencing problems with the MVAPICH2
> >> >> Fortran 90 interface.  I will be investigating this issue, but
> >> >> need some additional information about your setup.  What is the
> >> >> exact version of MVAPICH2 1.0 you are utilizing (daily tarball
> >> >> or release)?  Have you tried MVAPICH2 1.0.1?
> >> >>
> >> >> Brian
> >> >>
> >> >> David Stuebe wrote:
> >> >>
> >> >>> Hello MVAPICH
> >> >>> I have found a strange bug in MVAPICH2 using IFORT. The
> >> >>> behavior is very strange indeed - it seems to be related to how
> >> >>> ifort deals with passing pointers to the MVAPICH FORTRAN 90
> >> >>> INTERFACE. The MPI call returns successfully, but later calls
> >> >>> to a dummy subroutine cause a sigsegv.
> >> >>>
> >> >>>  Please look at the following code:
> >> >>>
> >> >>>
> >> >>>
> >> >>> !=====================================================================
> >> >>> !=====================================================================
> >> >>> !=====================================================================
> >> >>> ! TEST CODE FOR A POSSIBLE BUG IN MVAPICH2 COMPILED WITH IFORT
> >> >>> ! WRITTEN BY: DAVID STUEBE
> >> >>> ! DATE: JAN 23, 2008
> >> >>> !
> >> >>> ! COMPILE WITH: mpif90 -xP mpi_prog.f90 -o xtest
> >> >>> !
> >> >>> ! KNOWN BEHAVIOR:
> >> >>> ! PASSING A NON-CONTIGUOUS POINTER TO MPI_BCAST CAUSES FAILURE OF
> >> >>> ! SUBROUTINES USING MULTI-DIMENSIONAL EXPLICIT-SHAPE ARRAYS WITHOUT AN
> >> >>> ! INTERFACE - EVEN THOUGH THE MPI_BCAST COMPLETES SUCCESSFULLY,
> >> >>> ! RETURNING VALID DATA.
> >> >>> !
> >> >>> ! COMMENTS:
> >> >>> ! I REALIZE PASSING NON-CONTIGUOUS POINTERS IS DANGEROUS - SHAME ON
> >> >>> ! ME FOR MAKING THAT MISTAKE. HOWEVER, IT SHOULD EITHER WORK OR NOT.
> >> >>> ! RETURNING SUCCESSFULLY BUT CAUSING INTERFACE ERRORS LATER IS
> >> >>> ! EXTREMELY DIFFICULT TO DEBUG!
> >> >>> !
> >> >>> ! CONDITIONS FOR OCCURRENCE:
> >> >>> !    COMPILER MUST OPTIMIZE USING 'VECTORIZATION'
> >> >>> !    ARRAY MUST BE 'LARGE' - SYSTEM DEPENDENT?
> >> >>> !    MUST BE RUN ON MORE THAN ONE NODE TO CAUSE CRASH...
> >> >>> !    ie  Running inside one SMP box does not crash.
> >> >>> !
> >> >>> !    RUNNING UNDER MPD, ALL PROCESSES SIGSEGV
> >> >>> !    RUNNING UNDER MPIEXEC 0.82 FOR PBS,
> >> >>> !       ONLY SOME PROCESSES SIGSEGV ?
> >> >>> !
> >> >>> ! ENVIRONMENTAL INFO:
> >> >>> ! NODES: DELL 1850 3.0GHZ, 2GB RAM, INFINIBAND PCI-EX 4X
> >> >>> ! SYSTEM: ROCKS 4.2
> >> >>> ! gcc version 3.4.6 20060404 (Red Hat 3.4.6-3)
> >> >>> !
> >> >>> ! IFORT/ICC:
> >> >>> !   Intel(R) Fortran Compiler for Intel(R) EM64T-based applications,
> >> >>> !   Version 9.1 Build 20061101 Package ID: l_fc_c_9.1.040
> >> >>> !
> >> >>> ! MVAPICH2: mpif90 for mvapich2-1.0
> >> >>> ! ./configure --prefix=/usr/local/share/mvapich2/1.0
> >> >>> !   --with-device=osu_ch3:mrail --with-rdma=vapi --with-pm=mpd --enable-f90
> >> >>> !   --enable-cxx --disable-romio --without-mpe
> >> >>> !
> >> >>> !=====================================================================
> >> >>> Module vars
> >> >>>   USE MPI
> >> >>>   implicit none
> >> >>>
> >> >>>
> >> >>>   integer :: n,m,MYID,NPROCS
> >> >>>   integer :: ipt
> >> >>>
> >> >>>   integer, allocatable, target :: data(:,:)
> >> >>>
> >> >>>   contains
> >> >>>
> >> >>>     subroutine alloc_vars
> >> >>>       implicit none
> >> >>>
> >> >>>       integer Status
> >> >>>
> >> >>>       allocate(data(n,m),stat=status)
> >> >>>       if (status /=0) then
> >> >>>          write(ipt,*) "allocation error"
> >> >>>          stop
> >> >>>       end if
> >> >>>
> >> >>>       data = 0
> >> >>>
> >> >>>     end subroutine alloc_vars
> >> >>>
> >> >>>    SUBROUTINE INIT_MPI_ENV(ID,NP)
> >> >>>
> >> >>>
> >> >>> !====================================================================|
> >> >>> !  INITIALIZE MPI ENVIRONMENT                                        |
> >> >>> !====================================================================|
> >> >>>      INTEGER, INTENT(OUT) :: ID,NP
> >> >>>      INTEGER IERR
> >> >>>
> >> >>>      IERR=0
> >> >>>
> >> >>>      CALL MPI_INIT(IERR)
> >> >>>      IF(IERR/=0) WRITE(*,*) "BAD MPI_INIT", ID
> >> >>>      CALL MPI_COMM_RANK(MPI_COMM_WORLD,ID,IERR)
> >> >>>      IF(IERR/=0) WRITE(*,*) "BAD MPI_COMM_RANK", ID
> >> >>>      CALL MPI_COMM_SIZE(MPI_COMM_WORLD,NP,IERR)
> >> >>>      IF(IERR/=0) WRITE(*,*) "BAD MPI_COMM_SIZE", ID
> >> >>>
> >> >>>    END SUBROUTINE INIT_MPI_ENV
> >> >>>
> >> >>>
> >> >>>
> >> >>>
> >> >>> !=============================|
> >> >>>   SUBROUTINE PSHUTDOWN
> >> >>>
> >> >>>
> >> >>>
> >> >>> !====================================================================|
> >> >>>     INTEGER IERR
> >> >>>
> >> >>>     IERR=0
> >> >>>     CALL MPI_FINALIZE(IERR)
> >> >>>     if(ierr /=0) write(ipt,*) "BAD MPI_FINALIZE", MYID
> >> >>>     close(IPT)
> >> >>>     STOP
> >> >>>
> >> >>>   END SUBROUTINE PSHUTDOWN
> >> >>>
> >> >>>
> >> >>>   SUBROUTINE CONTIGUOUS_WORKS
> >> >>>     IMPLICIT NONE
> >> >>>     INTEGER, pointer :: ptest(:,:)
> >> >>>     INTEGER :: IERR, I,J
> >> >>>
> >> >>>
> >> >>>     write(ipt,*) "START CONTIGUOUS:"
> >> >>>     n=2000 ! Set size here...
> >> >>>     m=n+10
> >> >>>
> >> >>>     call alloc_vars
> >> >>>     write(ipt,*) "ALLOCATED DATA"
> >> >>>     ptest => data(1:N,1:N)
> >> >>>
> >> >>>     IF (MYID == 0) ptest=6
> >> >>>     write(ipt,*) "Made POINTER"
> >> >>>
> >> >>>     call MPI_BCAST(ptest,N*N,MPI_INTEGER,0,MPI_COMM_WORLD,IERR)
> >> >>>     IF(IERR /= 0) WRITE(IPT,*) "BAD BCAST", MYID
> >> >>>
> >> >>>     write(ipt,*) "BROADCAST Data; a value:",data(1,6)
> >> >>>
> >> >>>     DO I = 1,N
> >> >>>        DO J = 1,N
> >> >>>           if(data(I,J) /= 6) &
> >> >>>                & write(ipt,*) "INCORRECT VALUE!", I,J,data(I,J)
> >> >>>        END DO
> >> >>>
> >> >>>        DO J = N+1,M
> >> >>>           if(data(I,J) /= 0) &
> >> >>>                & write(ipt,*) "INCORRECT VALUE!", I,J,data(I,J)
> >> >>>        END DO
> >> >>>
> >> >>>     END DO
> >> >>>
> >> >>>     ! CALL THREE DIFFERENT EXAMPLES OF SUBROUTINES W/OUT AN INTERFACE
> >> >>>     ! THAT USE AN EXPLICIT SHAPE ARRAY
> >> >>>     write(ipt,*) "CALLING DUMMY1"
> >> >>>     CALL DUMMY1
> >> >>>
> >> >>>     write(ipt,*) "CALLING DUMMY2"
> >> >>>     call Dummy2(m,n)
> >> >>>
> >> >>>     write(ipt,*) "CALLING DUMMY3"
> >> >>>     call Dummy3
> >> >>>     write(ipt,*) "FINISHED!"
> >> >>>
> >> >>>   END SUBROUTINE CONTIGUOUS_WORKS
> >> >>>
> >> >>>   SUBROUTINE NON_CONTIGUOUS_FAILS
> >> >>>     IMPLICIT NONE
> >> >>>     INTEGER, pointer :: ptest(:,:)
> >> >>>     INTEGER :: IERR, I,J
> >> >>>
> >> >>>
> >> >>>     write(ipt,*) "START NON_CONTIGUOUS:"
> >> >>>
> >> >>>     m=200 ! Set size here - crash is size dependent!
> >> >>>     n=m+10
> >> >>>
> >> >>>     call alloc_vars
> >> >>>     write(ipt,*) "ALLOCATED DATA"
> >> >>>     ptest => data(1:M,1:M)
> >> >>>
> >> >>> !===================================================
> >> >>> ! IF YOU CALL DUMMY2 HERE TOO, THEN EVERYTHING PASSES  ???
> >> >>> !===================================================
> >> >>> !    CALL DUMMY1 ! THIS ONE HAS NO EFFECT
> >> >>> !    CALL DUMMY2 ! THIS ONE 'FIXES' THE BUG
> >> >>>
> >> >>>     IF (MYID == 0) ptest=6
> >> >>>     write(ipt,*) "Made POINTER"
> >> >>>
> >> >>>     call MPI_BCAST(ptest,M*M,MPI_INTEGER,0,MPI_COMM_WORLD,IERR)
> >> >>>     IF(IERR /= 0) WRITE(IPT,*) "BAD BCAST"
> >> >>>
> >> >>>     write(ipt,*) "BROADCAST Data; a value:",data(1,6)
> >> >>>
> >> >>>     DO I = 1,M
> >> >>>        DO J = 1,M
> >> >>>           if(data(J,I) /= 6) &
> >> >>>                & write(ipt,*) "INCORRECT VALUE!",I,J,DATA(J,I)
> >> >>>        END DO
> >> >>>
> >> >>>        DO J = M+1,N
> >> >>>           if(data(J,I) /= 0) &
> >> >>>                & write(ipt,*) "INCORRECT VALUE!",I,J,DATA(J,I)
> >> >>>        END DO
> >> >>>     END DO
> >> >>>
> >> >>>     ! CALL THREE DIFFERENT EXAMPLES OF SUBROUTINES W/OUT AN INTERFACE
> >> >>>     ! THAT USE AN EXPLICIT SHAPE ARRAY
> >> >>>     write(ipt,*) "CALLING DUMMY1"
> >> >>>     CALL DUMMY1
> >> >>>
> >> >>>     write(ipt,*) "CALLING DUMMY2"
> >> >>>     call Dummy2(m,n) ! SHOULD CRASH HERE!
> >> >>>
> >> >>>     write(ipt,*) "CALLING DUMMY3"
> >> >>>     call Dummy3
> >> >>>     write(ipt,*) "FINISHED!"
> >> >>>
> >> >>>   END SUBROUTINE NON_CONTIGUOUS_FAILS
> >> >>>
> >> >>>
> >> >>>   End Module vars
> >> >>>
> >> >>>
> >> >>> Program main
> >> >>>   USE vars
> >> >>>   implicit none
> >> >>>
> >> >>>
> >> >>>   CALL INIT_MPI_ENV(MYID,NPROCS)
> >> >>>
> >> >>>   ipt=myid+10
> >> >>>   OPEN(ipt)
> >> >>>
> >> >>>
> >> >>>   write(ipt,*) "Start memory test!"
> >> >>>
> >> >>>   CALL NON_CONTIGUOUS_FAILS
> >> >>>
> >> >>> !  CALL CONTIGUOUS_WORKS
> >> >>>
> >> >>>   write(ipt,*) "End memory test!"
> >> >>>
> >> >>>   CALL PSHUTDOWN
> >> >>>
> >> >>> END Program main
> >> >>>
> >> >>>
> >> >>>
> >> >>> ! THREE DUMMY SUBROUTINES WITH EXPLICIT SHAPE ARRAYS
> >> >>> ! DUMMY1 DECLARES A VECTOR  - THIS ONE NEVER CAUSES FAILURE
> >> >>> ! DUMMY2 DECLARES AN ARRAY  - THIS ONE CAUSES FAILURE
> >> >>>
> >> >>> SUBROUTINE DUMMY1
> >> >>>   USE vars
> >> >>>   implicit none
> >> >>>   real, dimension(m) :: my_data
> >> >>>
> >> >>>   write(ipt,*) "m,n",m,n
> >> >>>
> >> >>>   write(ipt,*) "DUMMY 1", size(my_data)
> >> >>>
> >> >>> END SUBROUTINE DUMMY1
> >> >>>
> >> >>>
> >> >>> SUBROUTINE DUMMY2(i,j)
> >> >>>   USE vars
> >> >>>   implicit none
> >> >>>   INTEGER, INTENT(IN) ::i,j
> >> >>>
> >> >>>
> >> >>>   real, dimension(i,j) :: my_data
> >> >>>
> >> >>>   write(ipt,*) "start: DUMMY 2", size(my_data)
> >> >>>
> >> >>>
> >> >>> END SUBROUTINE DUMMY2
> >> >>>
> >> >>> SUBROUTINE DUMMY3
> >> >>>   USE vars
> >> >>>   implicit none
> >> >>>
> >> >>>
> >> >>>   real, dimension(m,n) :: my_data
> >> >>>
> >> >>>
> >> >>>   write(ipt,*) "start: DUMMY 3", size(my_data)
> >> >>>
> >> >>>
> >> >>> END SUBROUTINE DUMMY3


-- 
Jeff Squyres
Cisco Systems


