[mvapich-discuss] Bug in MPI_WIN_SHARED_QUERY when called from Fortran95-code
Michael.Rachner at dlr.de
Wed May 28 05:12:08 EDT 2014
Dear MPI-Developers,
I am MPI-parallelizing a Fortran95 code. In order to let the same Fortran arrays be shared by the MPI processes on the same node,
I employ the 3 routines MPI_WIN_ALLOCATE_SHARED, MPI_WIN_SHARED_QUERY and C_F_POINTER.
That worked with MPICH-3.0.4 on a Linux Nehalem cluster (Cluster4), although even there not all result quantities from MPI_WIN_SHARED_QUERY were correct,
but it did not work with MVAPICH2-1.9 on another Linux cluster (the NEC Nehalem cluster Laki).
I "downsized" the problem from the large original code to a small example Ftn95-program sharedmemtest, given below.
It reveals a problem with the 3 result-quantities memory_size, idisplace_unit, memory_pointer returned from sbr MPI_WIN_SHARED_QUERY
(see also the red comments pasted into the outputs from MPICH and MVAPICH). These values after the call may be wrong or not.
There might be something wrong inside sbr MPI_WIN_SHARED_QUERY (or is it a problem with the data types of the quantities on the parameter list?).
The main problem is: with MVAPICH2 (but not with MPICH-3.0.4), MPI_WIN_SHARED_QUERY always returns a
memory_pointer of 0 for all slave processes.
With that seemingly wrong C address the succeeding call of C_F_POINTER (although it issues no error message)
cannot associate the pointer array (named int4_pointer_arr_1) on the slave process with that target address.
Consequently, the first use of that array on a slave process gives the error message documented below:
"Attempt to use pointer INT4_POINTER_ARR_1 when it is not associated with a target".
I observed the same behavior with my large original code.
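Because C_F_POINTER itself stays silent, it helps to test the returned C address before using it. A minimal sketch (the helper check_and_map and its STOP message are my own illustration, not part of the reproducer below):

! Hypothetical helper, not part of the original code: verify the C address
! returned by MPI_WIN_SHARED_QUERY before mapping it onto a Fortran pointer.
subroutine check_and_map( memory_pointer, idim_1, int4_pointer_arr_1 )
   use, intrinsic :: ISO_C_BINDING, only: C_PTR, C_ASSOCIATED, C_F_POINTER
   implicit none
   type(C_PTR),           intent(in) :: memory_pointer     ! baseptr from MPI_WIN_SHARED_QUERY
   integer,               intent(in) :: idim_1             ! number of array elements in the window
   integer, dimension(:), pointer    :: int4_pointer_arr_1 ! the array to be associated
   if( .not. C_ASSOCIATED(memory_pointer) ) then
      ! a null baseptr would otherwise pass through C_F_POINTER without any message
      stop '=== STOP: MPI_WIN_SHARED_QUERY returned a null C address'
   endif
   call C_F_POINTER( memory_pointer, int4_pointer_arr_1, (/ idim_1 /) )
end subroutine check_and_map

Under MVAPICH2-1.9 this check would fire on every slave process, because memory_pointer comes back as 0 there.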
Hint: I found no influence on the results of either the MPICH run or the MVAPICH run when using the
MPI module rather than the mpif.h file. But I found that there exists
no interface in the MPI module of MPICH or MVAPICH for the routines MPI_WIN_ALLOCATE_SHARED and MPI_WIN_SHARED_QUERY.
This is in contrast to the MPI-3.0 standard PDF document (of Sept. 21, 2012, chapter 11.2, pp. 409 + 411),
which provides these 2 interfaces; each of them overloads a second specific routine ~_CPTR using a different type
for BASEPTR (= memory_pointer in my code).
Could the absence of these interfaces be the cause of the trouble?
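For reference, here is the interface in question as I read it in the MPI-3.0 document (paraphrased from p. 411); the second specific takes BASEPTR as TYPE(C_PTR) instead of an address-sized integer:

INTERFACE MPI_WIN_SHARED_QUERY
   SUBROUTINE MPI_WIN_SHARED_QUERY(WIN, RANK, SIZE, DISP_UNIT, BASEPTR, IERROR)
      IMPORT :: MPI_ADDRESS_KIND
      INTEGER :: WIN, RANK, DISP_UNIT, IERROR
      INTEGER(KIND=MPI_ADDRESS_KIND) :: SIZE, BASEPTR
   END SUBROUTINE
   SUBROUTINE MPI_WIN_SHARED_QUERY_CPTR(WIN, RANK, SIZE, DISP_UNIT, BASEPTR, IERROR)
      USE, INTRINSIC :: ISO_C_BINDING, ONLY : C_PTR
      IMPORT :: MPI_ADDRESS_KIND
      INTEGER :: WIN, RANK, DISP_UNIT, IERROR
      INTEGER(KIND=MPI_ADDRESS_KIND) :: SIZE
      TYPE(C_PTR) :: BASEPTR
   END SUBROUTINE
END INTERFACE

Without this interface block the compiler cannot select the _CPTR specific, and a TYPE(C_PTR) actual argument is passed through an implicit interface without any type checking. As a non-portable stop-gap (my own sketch; it relies on TYPE(C_PTR) being a plain address, which holds on the usual 64-bit platforms), one could receive BASEPTR as an address-sized integer and reinterpret it with TRANSFER:

integer (kind=MPI_ADDRESS_KIND) :: ibaseptr                  ! hypothetical variable, not in the code below
call MPI_WIN_SHARED_QUERY( MPIwin, irank_nodemaster &        ! <--input
                          ,memory_size, idisplace_unit, ibaseptr, ierr_mpi ) ! <--result
memory_pointer = TRANSFER( ibaseptr, memory_pointer )        ! reinterpret the integer address as TYPE(C_PTR)
call C_F_POINTER( memory_pointer, int4_pointer_arr_1, idim_arr_1 )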
The MPI-3 shared memory feature is a breakthrough for MPI users, and very useful for me, if only I could make it run...
Can you help me?
Thank you for any hints,
Michael Rachner
************************ This is the Ftn95-code (contents of file sharedmemtest.f90) to demonstrate the bug: ***********
!
module MYMPI
!%%% use MPI
include "mpif.h"
end module MYMPI
!
!
program sharedmemtest
! This code demonstrates a bug with MVAPICH2-1.9 (and MPICH-3.0.4) in the MPI-3 shared memory alloc. of a 1d-integer-array.
! This code was compiled with INTEL-14.0.1: mpif90 -O0 -debug -traceback -check -fpe0 sharedmemtest.f90
! This code was launched with: mpiexec -np 2 -bind-to core -prepend-rank ./a.out
!
use MYMPI, only: MPI_COMM_WORLD, MPI_MAX_LIBRARY_VERSION_STRING &
,MPI_ADDRESS_KIND, MPI_INFO_NULL, MPI_SUCCESS
use, intrinsic :: ISO_C_BINDING, only: C_PTR, C_F_POINTER ! <-- is std Ftn2003 intrinsic module
implicit none
type (C_PTR) :: memory_pointer
integer (kind=MPI_ADDRESS_KIND) :: memory_size, ibytes_per_element
logical :: lnodemaster
integer :: MPIwin, idim_1
integer, save :: idisplace_unit = 1 &
,irank_nodemaster= 0 ! nodemaster has rank 0 in communicator comm_NODEPROCS(mynode)
integer, dimension(1) :: idim_arr_1
integer :: myrank, myrankWORLD, numprocsWORLD, mynode, ierr_mpi
integer, dimension(:), allocatable :: comm_NODEPROCS ! array consisting of the node communicators
integer, dimension(:), pointer :: int4_pointer_arr_1 =>null() ! <-- the array to be allocated shared
character(len=MPI_MAX_LIBRARY_VERSION_STRING) :: versionstring
integer :: iresultlen
!
!
! --initialize MPI:
call MPI_INIT( ierr_mpi )
call MPI_COMM_RANK( MPI_COMM_WORLD, myrankWORLD , ierr_mpi )
call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocsWORLD, ierr_mpi )
print *,'=== ftn95-program sharedmemtest has been entered by process no.: ',myrankWORLD
call MPI_BARRIER( MPI_COMM_WORLD, ierr_mpi )
!
call MPI_GET_LIBRARY_VERSION( versionstring, iresultlen, ierr_mpi )
if(myrankWORLD == 0) then
write(6,*) 'Version of MPI library used in this run:'
write(6,'(a)') versionstring(1:iresultlen)
endif
!
! number of desired array elements in the integer-array int4_pointer_arr_1(:) to be allocated shared:
idim_1 = 100
!
! for simplicity of this example program we presume all processes running on only 1 node:
allocate( comm_NODEPROCS(1) )
mynode=1 ; comm_NODEPROCS(mynode) = MPI_COMM_WORLD
myrank = myrankWORLD
lnodemaster=.false. ; if(myrank == 0) lnodemaster=.true.
!
!-----shared memory allocation starting here:
ibytes_per_element = 4_MPI_ADDRESS_KIND ! [Bytes]
!
if(lnodemaster) then
! we let the nodemaster, i.e. the process with rank 0 in communicator comm_NODEPROCS(mynode)
! of node no. mynode allocate the shared memory
memory_size = int(idim_1, MPI_ADDRESS_KIND) * ibytes_per_element
else
memory_size = 0_MPI_ADDRESS_KIND
endif
!
call MPI_WIN_ALLOCATE_SHARED( memory_size, idisplace_unit, MPI_INFO_NULL, comm_NODEPROCS(mynode) & ! <--input
,memory_pointer, MPIwin, ierr_mpi ) ! <--result
if(ierr_mpi /= MPI_SUCCESS) stop '=== STOP: Error when calling sbr MPI_WIN_ALLOCATE_SHARED'
print *,' %%after MPI_WIN_ALLOCATE_SHARED: memory_size, idisplace_unit=',memory_size,idisplace_unit
print *,'%%%%after MPI_WIN_ALLOCATE_SHARED: memory_pointer=',memory_pointer
!
memory_size= -7777 ; idisplace_unit= -6666 ! <-- we reinitialize to detect wrong values from MPI_WIN_SHARED_QUERY
call MPI_WIN_SHARED_QUERY( MPIwin, irank_nodemaster & ! <--input
,memory_size, idisplace_unit, memory_pointer, ierr_mpi ) ! <--result
if(ierr_mpi /= MPI_SUCCESS) stop '=== STOP: Error when calling sbr MPI_WIN_SHARED_QUERY'
print *,' §§§after MPI_WIN_SHARED_QUERY: memory_size, idisplace_unit=',memory_size,idisplace_unit
print *,'§§§§§after MPI_WIN_SHARED_QUERY: memory_pointer=',memory_pointer
!
idim_arr_1(1) = idim_1 ! <-- necessary
call C_F_POINTER( memory_pointer, int4_pointer_arr_1, idim_arr_1 ) ! is a std Ftn2003 routine
!
call MPI_BARRIER( comm_NODEPROCS(mynode), ierr_mpi )
!-----shared allocation finished here.
!
!-----checking for correct shared allocation:
if(lnodemaster) then
int4_pointer_arr_1(:)= 10 ! [1...idim_1]
print *,'========on nodemaster: sum(int4_pointer_arr_1)=',sum(int4_pointer_arr_1)
endif
call MPI_BARRIER( comm_NODEPROCS(mynode), ierr_mpi )
!
! with MVAPICH2-1.9 the next stmt causes an ERROR ABORT of program on the slaves,
! with message "Attempt to use pointer INT4_POINTER_ARR_1 when it is not associated with a target":
print *,'%%%%%%%%%checking shared allocation: sum(int4_pointer_arr_1)=',sum(int4_pointer_arr_1)
!
call MPI_FINALIZE( ierr_mpi )
print *,'===============end of program sharedmemtest reached ============'
end program sharedmemtest
********************************* This is the output of that program using MPICH-3.0.4 on Cluster4: *********************
rachner at master:~/dat>
rachner at master:~/dat> mpif90 -O0 -debug -traceback -check -fpe0 sharedmemtest.f90
rachner at master:~/dat> mpiexec -np 2 -bind-to core -prepend-rank ./a.out
[1] === ftn95-program sharedmemtest has been entered by process no.: 1
[0] === ftn95-program sharedmemtest has been entered by process no.: 0
[0] Version of MPI library used in this run:
[0] MPICH Version: 3.0.4
[0] MPICH Release date: Wed Apr 24 10:08:10 CDT 2013
[0] MPICH Device: ch3:nemesis
[0] MPICH configure: CC=icc CXX=icpc F77=ifort FC=ifort --with-tm=/opt/torque-2.5.2 --prefix=/export/opt/ompi-3.0.4-intel2013
[0] MPICH CC: icc -O2
[0] MPICH CXX: icpc -O2
[0] MPICH F77: ifort -O2
[0] MPICH FC: ifort -O2
[0]
[0] %%after MPI_WIN_ALLOCATE_SHARED: memory_size = 400
[0] %%after MPI_WIN_ALLOCATE_SHARED: idisplace_unit= 1
[0] %%%%after MPI_WIN_ALLOCATE_SHARED: memory_pointer= 139789383376896
[0] §§§after MPI_WIN_SHARED_QUERY: memory_size = 400
[1] %%after MPI_WIN_ALLOCATE_SHARED: memory_size = 0
[1] %%after MPI_WIN_ALLOCATE_SHARED: idisplace_unit= 1
[1] %%%%after MPI_WIN_ALLOCATE_SHARED: memory_pointer= 0
[1] §§§after MPI_WIN_SHARED_QUERY: memory_size = 400
[1] §§§after MPI_WIN_SHARED_QUERY: idisplace_unit= -6666 <--idisplace_unit=-6666 is wrong here!! (must be 1)
[1] §§§§§after MPI_WIN_SHARED_QUERY: memory_pointer= 140132715147264
[0] §§§after MPI_WIN_SHARED_QUERY: idisplace_unit= -6666 <--idisplace_unit=-6666 is wrong here!! (must be 1)
[0] §§§§§after MPI_WIN_SHARED_QUERY: memory_pointer= 139789383376896
[0] ========on nodemaster: sum(int4_pointer_arr_1)= 1000
[0] %%%%%%%%%checking shared allocation: sum(int4_pointer_arr_1)= 1000
[1] %%%%%%%%%checking shared allocation: sum(int4_pointer_arr_1)= 1000
[1] ===============end of program sharedmemtest reached ============
[0] ===============end of program sharedmemtest reached ============
rachner at master:~/dat>
********************************* This is the output of that program using MVAPICH2-1.9 on Laki: *********************
d0000000 cl3fr1 214$mpif90 -O0 -debug -traceback -check -fpe0 sharedmemtest.f90
d0000000 cl3fr1 215$mpiexec -np 2 -bind-to core -prepend-rank ./a.out
[0] === ftn95-program sharedmemtest has been entered by process no.: 0
[1] === ftn95-program sharedmemtest has been entered by process no.: 1
[0] Version of MPI library used in this run:
[0] MPICH Version: 1.9
[0] MPICH Release date: Mon May 6 12:25:08 EDT 2013
[0] MPICH Device: ch3:mrail
[0] MPICH configure: --prefix=/opt/mpi/mvapich2/1.9-intel-14.0.1 --enable-shared --enable-sharedlibs=gcc --with-file-system=lustre
[0] MPICH CC: icc -DNDEBUG -DNVALGRIND -O2
[0] MPICH CXX: icpc -DNDEBUG -DNVALGRIND -O2
[0] MPICH F77: ifort -L/lib -L/lib -O2
[0] MPICH FC: ifort -O2
[0]
[0] %%after MPI_WIN_ALLOCATE_SHARED: memory_size = 400
[0] %%after MPI_WIN_ALLOCATE_SHARED: idisplace_unit= 1
[0] %%%%after MPI_WIN_ALLOCATE_SHARED: memory_pointer= 39920336
[0] §§§after MPI_WIN_SHARED_QUERY: memory_size = 400
[0] §§§after MPI_WIN_SHARED_QUERY: idisplace_unit= 1
[0] §§§§§after MPI_WIN_SHARED_QUERY: memory_pointer= 39920336
[0] ========on nodemaster: sum(int4_pointer_arr_1)= 1000
[1] %%after MPI_WIN_ALLOCATE_SHARED: memory_size = 0
[1] %%after MPI_WIN_ALLOCATE_SHARED: idisplace_unit= 1
[1] %%%%after MPI_WIN_ALLOCATE_SHARED: memory_pointer= 0
[1] §§§after MPI_WIN_SHARED_QUERY: memory_size = 0 <-- memory_size=0 is wrong here!! (must be 100*4=400)
[1] §§§after MPI_WIN_SHARED_QUERY: idisplace_unit= 1
[1] §§§§§after MPI_WIN_SHARED_QUERY: memory_pointer= 0 <-- memory_pointer=0 might be wrong here!! (this happens for all slaves)
[0] %%%%%%%%%checking shared allocation: sum(int4_pointer_arr_1)= 1000
[1] forrtl: severe (408): fort: (7): Attempt to use pointer INT4_POINTER_ARR_1 when it is not associated with a target <-- you get this for all slaves
[1]
[1] Image PC Routine Line Source
[1] a.out 00000000004096E8 MAIN__ 94 sharedmemtest.f90 <-- line 94 is the print-stmt before call of MPI_FINALIZE
[1] a.out 0000000000407996 Unknown Unknown Unknown
[1] libc.so.6 000000346761ED1D Unknown Unknown Unknown
[1] a.out 0000000000407889 Unknown Unknown Unknown
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 152
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
d0000000 cl3fr1 216$