[mvapich-discuss] FW: Bug in MPI_WIN_SHARED_QUERY when called from Fortran95-code

Mingzhe Li li.2192 at osu.edu
Mon Oct 20 20:33:20 EDT 2014


Just for everybody's information: we had an offline discussion with the user
about this issue. Two patches were sent to the user and the issue has
been resolved. These patches will be available in a future release of
MVAPICH2. We are closing the thread.

Thanks,
Mingzhe

On Thu, Jun 5, 2014 at 10:30 AM, <Michael.Rachner at dlr.de> wrote:

>  Dear Mr. Panda,
>
>
>
> In the meantime I have also been able to try the MVAPICH2 2.0rc2 version on the Linux cluster Laki.
>
> However, I found that a bug still remains in the shared memory allocation:
>
>    If I successively allocate no more than n=8 arrays (see the code), everything now works fine,
>
>    but when allocating the 9th array, a segmentation fault always occurs in sbr MPI_WIN_ALLOCATE_SHARED on all processes!
>
>    This error occurs independent of the size of the allocated array and of the number of processes running.
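>
>    (Side note: the example never frees the windows it creates, so all n windows stay open until
>    MPI_FINALIZE. Whether or not this is related to the limit of 8, a variant that releases each
>    window after use would look like the following sketch; note that MPI_WIN_FREE also releases
>    the memory obtained from MPI_WIN_ALLOCATE_SHARED, so the pointer array must not be used afterwards:
>
>       ! hypothetical cleanup at the end of sbr sharedmem_alloc:
>       call MPI_WIN_FREE( MPIwin, ierr_mpi )    ! frees the window AND its shared memory
>       if(ierr_mpi /= MPI_SUCCESS) stop '=== STOP: Error when calling sbr MPI_WIN_FREE'
>       nullify( int4_pointer_arr_1 )            ! its target no longer exists
>
>    In the original code the windows must of course stay open while the shared arrays are in use.)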
>
>
>
> For your convenience, I have added a ‘downsized’ example Ftn-code to demonstrate the behavior.
>
>
>
> Greetings
>
> Michael Rachner
>
>
>
>
>
> ************************ This is the Ftn95-code (contents of file sharedmemtest.f90) to demonstrate the bug in MVAPICH2 2.0rc2: ***********
>
> !
>     module MYMPI
> !%%%      use MPI
>       include "mpif.h"
>       integer ::  myrankWORLD, numprocsWORLD
>     end module MYMPI
> !
> !
>     program sharedmemtest
> !     This code demonstrates a bug with MVAPICH2-2.0rc2 in the MPI-3 shared memory alloc. of a 1d-integer-array.
> !     This code was compiled with INTEL-14.0.2:   mpif90 -O0 -debug -traceback -check -fpe0 sharedmemtest.f90
> !     This code was launched with:                mpiexec -np 2 -bind-to core -prepend-rank ./a.out
> !
>       use MYMPI, only:  MPI_COMM_WORLD, MPI_MAX_LIBRARY_VERSION_STRING &
>                        ,myrankWORLD, numprocsWORLD
>       implicit none
>       character(len=MPI_MAX_LIBRARY_VERSION_STRING) ::  versionstring
>       integer                                       ::  iresultlen, idim_1, i, n, ierr_mpi
> !
> !   --initialize MPI:
>       call MPI_INIT( ierr_mpi )
>       call MPI_COMM_RANK( MPI_COMM_WORLD, myrankWORLD  , ierr_mpi )
>       call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocsWORLD, ierr_mpi )
>           print *,'=== ftn95-program sharedmemtest has been entered by process no.: ',myrankWORLD
>       call MPI_BARRIER( MPI_COMM_WORLD, ierr_mpi )
> !
>       call MPI_GET_LIBRARY_VERSION( versionstring, iresultlen, ierr_mpi )
>       if(myrankWORLD == 0) then
>           write(6,*)    'Version of MPI library used in this run:'
>           write(6,'(a)') versionstring(1:iresultlen)
>       endif
> !
> !   --n is the number of shared memory allocations to be done successively:
>       n = 9   !  ERROR: until n=8 it works fine, but for n>8 it does not work
> !             *****  then we get a segmentation fault in sbr MPI_WIN_ALLOCATE_SHARED on all processes
>       if(myrankWORLD == 0) print *,'=========== number of allocations to be done: n=',n
> !
>       do i=1,n
> !     --number of desired array elements in the integer-array  int4_pointer_arr_1(:)  to be allocated shared:
>         idim_1 = 100
> !     --shared memory allocation:
>         if(myrankWORLD == 0) print *,'=========== allocation no. i=',i,' starting now'
>         call sharedmem_alloc( idim_1 )
>       enddo
> !
>       call MPI_FINALIZE( ierr_mpi )
>       print *,'===============end of program sharedmemtest reached ============'
>    end program sharedmemtest
> !
> !
>       subroutine sharedmem_alloc( idim_1 )
> !
>       use MYMPI, only:  MPI_COMM_WORLD, MPI_ADDRESS_KIND, MPI_INFO_NULL, MPI_SUCCESS &
>                        ,myrankWORLD, numprocsWORLD
>       use, intrinsic ::  ISO_C_BINDING, only:  C_PTR, C_F_POINTER   ! <-- is std Ftn2003 intrinsic module
>       implicit none
>       integer, intent(IN)             ::  idim_1
>       type    (C_PTR)                 ::  memory_pointer
>       integer (kind=MPI_ADDRESS_KIND) ::  memory_size, ibytes_per_element
>       logical                         ::  lnodemaster
>       integer                         ::  MPIwin
>       integer, save                   ::  idisplace_unit  = 1 &
>                                          ,irank_nodemaster= 0   ! nodemaster has rank 0 in communicator comm_NODEPROCS(mynode)
>       integer, dimension(1)           ::  idim_arr_1
>       integer                         ::  myrank, mynode, ierr_mpi
>       integer, dimension(:), allocatable, save ::  comm_NODEPROCS   ! array consisting of the node communicators
>       integer, dimension(:), pointer  ::  int4_pointer_arr_1 => null()   ! <-- the array to be allocated shared
> !
> !     for simplicity of this example program we presume all processes running on only 1 node:
>       if( .not. allocated(comm_NODEPROCS) ) allocate( comm_NODEPROCS(1) )
>       mynode=1  ;  comm_NODEPROCS(mynode) = MPI_COMM_WORLD
>       myrank = myrankWORLD
>       lnodemaster=.false.  ;  if(myrank == 0) lnodemaster=.true.
> !
> !-----shared memory allocation starting here:
>         ibytes_per_element = 4_MPI_ADDRESS_KIND  ! [Bytes]
> !
>         if(lnodemaster) then
> !         we let the nodemaster, i.e. the process with rank 0 in communicator  comm_NODEPROCS(mynode)
> !         of node no.  mynode , allocate the shared memory
>           memory_size = int(idim_1, MPI_ADDRESS_KIND) * ibytes_per_element
>         else
>           memory_size = 0_MPI_ADDRESS_KIND
>         endif
> !
>       call MPI_WIN_ALLOCATE_SHARED( memory_size, idisplace_unit, MPI_INFO_NULL, comm_NODEPROCS(mynode) &  ! <--input
>                                    ,memory_pointer, MPIwin, ierr_mpi )                                    ! <--result
>           if(ierr_mpi /= MPI_SUCCESS) stop '=== STOP: Error when calling sbr MPI_WIN_ALLOCATE_SHARED'
>           print *,'  %%after MPI_WIN_ALLOCATE_SHARED: memory_size   =', memory_size
>           print *,'  %%after MPI_WIN_ALLOCATE_SHARED: idisplace_unit=', idisplace_unit
>           print *,'%%%%after MPI_WIN_ALLOCATE_SHARED: memory_pointer=', memory_pointer
> !
>           memory_size= -7777  ;  idisplace_unit= -6666   ! <-- we reinitialize to detect wrong values from MPI_WIN_SHARED_QUERY
>       call MPI_WIN_SHARED_QUERY( MPIwin, irank_nodemaster                              &  ! <--input
>                                 ,memory_size, idisplace_unit, memory_pointer, ierr_mpi )  ! <--result
>           if(ierr_mpi /= MPI_SUCCESS) stop '=== STOP: Error when calling sbr MPI_WIN_SHARED_QUERY'
>           print *,'  §§§after MPI_WIN_SHARED_QUERY:    memory_size   =', memory_size
>           print *,'  §§§after MPI_WIN_SHARED_QUERY:    idisplace_unit=', idisplace_unit
>           print *,'§§§§§after MPI_WIN_SHARED_QUERY:    memory_pointer=', memory_pointer
> !
>           idim_arr_1(1) = idim_1   ! <-- necessary
>       call C_F_POINTER( memory_pointer, int4_pointer_arr_1, idim_arr_1 )   ! is a std Ftn2003 routine
> !
>       call MPI_BARRIER( comm_NODEPROCS(mynode), ierr_mpi )
> !-----shared allocation finished here.
> !
> !-----checking for correct shared allocation:
>       if(lnodemaster) then
>         int4_pointer_arr_1(:)= 10   ! [1...idim_1]
>         print *,'========on nodemaster: sum(int4_pointer_arr_1)=',sum(int4_pointer_arr_1)
>       endif
>       call MPI_BARRIER( comm_NODEPROCS(mynode), ierr_mpi )
> !
>       print *,'%%%%%%%%%checking shared allocation: sum(int4_pointer_arr_1)=',sum(int4_pointer_arr_1)
> !
>       return
>       end subroutine sharedmem_alloc
>
>
>
>
>
> ---end of this email ---
>
>
>
>
>
> From: Panda, Dhabaleswar [mailto:panda at cse.ohio-state.edu]
> Sent: Wednesday, May 28, 2014 20:23
> To: Rachner, Michael
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: [mvapich-discuss] FW: Bug in MPI_WIN_SHARED_QUERY when called from Fortran95-code
>
>
>
> Hi,
>
>
>
> Thanks for providing the details. We have verified that this function
> works with the latest MVAPICH2 2.0-rc2 version. There were issues with this
> function in MVAPICH2 2.0-b which have been fixed.
>
>
>
> Thus, your program seems to be fine. Please use the latest version (RC2)
> and you should be all set.
>
>
>
> Thanks,
>
>
>
> DK
>
> Sent from my iPhone
>
>
> On May 28, 2014, at 10:12 AM, "Michael.Rachner at dlr.de" <Michael.Rachner at dlr.de> wrote:
>
>
>
>
>
> From: Rachner, Michael
> Sent: Wednesday, May 28, 2014 15:45
> To: 'Panda, Dhabaleswar'
> Subject: RE: [mvapich-discuss] Bug in MPI_WIN_SHARED_QUERY when called from Fortran95-code
>
>
>
> Dear Mr. Panda,
>
>
>
> On the clusters I can only use what is installed.
>
> Until some weeks ago, MVAPICH2-2.0b was installed on the NEC Nehalem cluster.
>
> Then it was replaced by the older (more stable?) version MVAPICH2-1.9.
>
> With both versions the same MPI-3 shared memory allocation problem occurred.
>
> (Without using the MPI-3 shared memory feature, both versions run well.)
>
>
>
> Do you think it is a bug in MVAPICH, or have I done something wrong when invoking the MPI routines?
>
>
>
> Greetings
>
> Michael Rachner
>
>
>
> From: Panda, Dhabaleswar [mailto:panda at cse.ohio-state.edu <panda at cse.ohio-state.edu>]
> Sent: Wednesday, May 28, 2014 13:57
> To: Rachner, Michael; mvapich-discuss at cse.ohio-state.edu
> Subject: RE: [mvapich-discuss] Bug in MPI_WIN_SHARED_QUERY when called from Fortran95-code
>
>
>
> Hi,
>
> Thanks for your note. Have you tried MVAPICH2 2.0-rc2 (the latest version
> released a few days
> back)? There have been many enhancements and performance optimizations
> related to MPI-3 RMA support in the 2.0 series. The MVAPICH2 2.0 GA will
> be available soon. Please try this version and let us know if you still
> encounter the
> issue.
>
> Thanks,
>
> DK
>  ------------------------------
>
> From: mvapich-discuss-bounces at cse.ohio-state.edu on behalf of Michael.Rachner at dlr.de [Michael.Rachner at dlr.de]
> Sent: Wednesday, May 28, 2014 5:12 AM
> To: mvapich-discuss at cse.ohio-state.edu
> Subject: [mvapich-discuss] Bug in MPI_WIN_SHARED_QUERY when called from Fortran95-code
>
> Dear MPI-Developers,
>
>
>
> I am MPI-parallelizing a Fortran95 code. In order to let the same Fortran arrays be shared by the MPI processes on the same node,
>
> I employ these 3 routines:     MPI_WIN_ALLOCATE_SHARED, MPI_WIN_SHARED_QUERY and C_F_POINTER.
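>
> (For context: the node-local communicator that such code needs can be obtained portably with the
> MPI-3 routine MPI_COMM_SPLIT_TYPE; my example below skips this step and simply presumes that all
> processes run on one node. A minimal sketch, with  comm_node  a name I use only here:
>
>       integer ::  comm_node, ierr_mpi
>       call MPI_COMM_SPLIT_TYPE( MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0 &
>                                ,MPI_INFO_NULL, comm_node, ierr_mpi )
>
> All processes that can share memory with each other then end up in the same  comm_node .)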
>
> That worked with MPICH-3.0.4 on a Linux Nehalem cluster (Cluster4), although even there the result quantities from MPI_WIN_SHARED_QUERY were not correct,
>
> but it did not work with MVAPICH2-1.9 on another Linux cluster (NEC Nehalem cluster Laki).
>
>
>
> I “downsized” the problem from the large original code to the small example Ftn95-program  sharedmemtest  given below.
>
> It reveals a problem with the 3 result quantities   memory_size, idisplace_unit, memory_pointer   returned from sbr MPI_WIN_SHARED_QUERY
>
> (see also the red comments pasted into the outputs from MPICH and MVAPICH). Depending on the run, these values after the call may or may not be correct.
>
> There might be something wrong inside sbr MPI_WIN_SHARED_QUERY (or is it a problem with the data types of the quantities in the parameter list?).
>
>
>     The main problem is:   Using MVAPICH (but not MPICH-3.0.4), sbr MPI_WIN_SHARED_QUERY always returned a
>
>                            memory_pointer of  0  for all slaves.
>
>                            With that seemingly wrong C-address, the succeeding call of sbr C_F_POINTER (although it issues no error message)
>
>                            cannot associate the pointer array (named  int4_pointer_arr_1 ) on the slave process with that target address.
>
>                            Consequently, the first use of that array on a slave process gives the error message documented below:
>
>                     “Attempt to use pointer INT4_POINTER_ARR_1 when it is not associated with a target”.
>
>                     I observed that same behavior also with my large original code.
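>
>     (A cheap guard against this situation would be a  C_ASSOCIATED  test on the returned C-address
>     before calling C_F_POINTER; C_ASSOCIATED is, like C_F_POINTER, part of the std Ftn2003 intrinsic
>     module ISO_C_BINDING. A minimal sketch, matching the variable names of my test code:
>
>       use, intrinsic ::  ISO_C_BINDING, only:  C_ASSOCIATED
>       if( .not. C_ASSOCIATED(memory_pointer) ) &
>         stop '=== STOP: MPI_WIN_SHARED_QUERY returned a null memory_pointer'
>
>     This fixes nothing, but it turns the late “not associated” failure into an immediate, clear stop.)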
>
>     Hint: I found no influence on the results of the MPICH run or the MVAPICH run when using the
>
>           MPI module rather than the mpif.h file. But I found that there exists
>
>           no interface in the mpi module of MPICH and MVAPICH for the routines  MPI_WIN_ALLOCATE_SHARED  and  MPI_WIN_SHARED_QUERY .
>
>           This is in contrast to the MPI-3.0 standard pdf-document (of Sept. 21, 2012, chapter 11.2, pp. 409 + 411),
>
>           which provides these 2 interfaces. Both interfaces dispatch to another routine  ~_CPTR  using a different type
>
>           for  BASEPTR  (= memory_pointer  in my code).
>
>           Could the absence of these interfaces be the cause of the trouble?
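>
>           (For reference: the generic interface that the MPI-3.0 document requires in the mpi module
>           looks roughly as follows for MPI_WIN_SHARED_QUERY. This is my paraphrase of the standard
>           text, with an IMPORT added that a real module would need:
>
>             INTERFACE MPI_WIN_SHARED_QUERY
>               SUBROUTINE MPI_WIN_SHARED_QUERY_CPTR( WIN, RANK, SIZE, DISP_UNIT, BASEPTR, IERROR )
>                 USE, INTRINSIC :: ISO_C_BINDING, ONLY: C_PTR
>                 IMPORT :: MPI_ADDRESS_KIND
>                 INTEGER                        :: WIN, RANK, DISP_UNIT, IERROR
>                 INTEGER(KIND=MPI_ADDRESS_KIND) :: SIZE
>                 TYPE(C_PTR)                    :: BASEPTR
>               END SUBROUTINE
>             END INTERFACE
>
>           With this generic in scope, passing a TYPE(C_PTR) variable as BASEPTR resolves to the
>           _CPTR specific; without it, the call goes through the implicit mpif.h-style binding,
>           where BASEPTR is an INTEGER(KIND=MPI_ADDRESS_KIND).)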
>
>
>
>
>
> The MPI-3 shared memory feature is a breakthrough for MPI users, and it would be very useful for me, if only I could make it run…
>
>   Can you help me?
>
>
>
>   Thank you for any hints
>
>     Michael Rachner
>
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>