[mvapich-discuss] parallel file read -> "cannot allocate memory for the file buffer"
Nathan Dauchy
Nathan.Dauchy at noaa.gov
Thu Oct 18 13:37:56 EDT 2007
Greetings all, and apologies in advance for the long posting.
Some of the FORTRAN MPI applications at our site rely on having all
processes open the same file for input data. (I know this is not
necessarily optimal, but we can't change all the codes at this time.)
Several of the applications fail with *some* (5-20) of the MPI tasks
crashing with an error like the following:
forrtl: severe (98): cannot allocate memory for the file buffer - out of
memory, unit 101, file /misc/whome/ndauchy/src/ParaRead/ParaData
Image              PC                Routine            Line        Source
ParaRead           00000000004832AB  Unknown               Unknown  Unknown
ParaRead           0000000000481E5E  Unknown               Unknown  Unknown
ParaRead           0000000000466C3E  Unknown               Unknown  Unknown
ParaRead           0000000000445C2E  Unknown               Unknown  Unknown
ParaRead           000000000044588F  Unknown               Unknown  Unknown
ParaRead           0000000000452D30  Unknown               Unknown  Unknown
ParaRead           0000000000404C45  MAIN__                     18  ParaRead.F90
ParaRead           00000000004049AA  Unknown               Unknown  Unknown
libc.so.6          0000002A95C6C4BB  Unknown               Unknown  Unknown
ParaRead           00000000004048EA  Unknown               Unknown  Unknown
It takes only a moderately sized file (1.6 MB) and 36 to 65 MPI tasks to
trigger the error. At smaller sizes everything works correctly. We
have seen this problem on both our Rapidscale/Terragrid filesystem and
on NFS.
We have constructed a simple (Fortran) test case to duplicate the
problem. The first program creates the data file, the second reads it
from many nodes simultaneously.
=====================================================
program writeParaRead
  implicit none
  integer,parameter :: size=100000
  integer,parameter :: u=101
  real :: a(size),b(size),c(size),d(size)
  ! Fill each array with a distinct constant so a bad read is easy to spot
  a=1
  b=2
  c=3
  d=4
  ! Write all four arrays as a single unformatted record
  open(u,file="ParaData",form='unformatted')
  write(u) a,b,c,d
  close(u)
  print*,a(size),b(size),c(size),d(size)
end program writeParaRead
=====================================================
program testParaRead
  implicit none
  include 'mpif.h'
  integer :: Rank,numprocs,MyError,i
  integer,parameter :: size=100000
  integer,parameter :: u=101
  real :: a(size),b(size),c(size),d(size)
  call MPI_INIT( MyError )
  call MPI_COMM_RANK( MPI_COMM_WORLD, Rank, MyError )
  call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, MyError )
  print *, 'Process ', Rank, ' of ', numprocs, ' is alive'
  ! Every rank opens the same data file
  open(u,file="ParaData",form='unformatted')
  ! Synchronize so that all ranks issue the read at about the same time
  call MPI_BARRIER( MPI_COMM_WORLD, MyError )
  read(u) a,b,c,d
  close(u)
  print"('output',1i5,4f10.5)",rank,a(size),b(size),c(size),d(size)
  call MPI_FINALIZE( MyError)
end program testParaRead
=====================================================
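For anyone trying to reproduce this, a variant of the reader that uses iostat= may help show which statement fails on which rank instead of letting the runtime abort. This is only a sketch and has not been run on the affected systems: the program name testParaReadStat and the variable ios are made up for illustration, and it is an assumption that adding iostat= does not itself change the behavior.

```fortran
program testParaReadStat
  implicit none
  include 'mpif.h'
  integer :: Rank, numprocs, MyError, ios
  integer, parameter :: size = 100000
  integer, parameter :: u = 101
  real :: a(size), b(size), c(size), d(size)

  call MPI_INIT( MyError )
  call MPI_COMM_RANK( MPI_COMM_WORLD, Rank, MyError )
  call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, MyError )

  ! With iostat= present, OPEN returns a status code instead of aborting
  open(u, file="ParaData", form='unformatted', iostat=ios)
  if (ios /= 0) print *, 'rank ', Rank, ': open failed, iostat = ', ios

  ! All ranks reach the barrier, including any whose OPEN failed
  call MPI_BARRIER( MPI_COMM_WORLD, MyError )

  ! Only attempt the read on ranks where the OPEN succeeded
  if (ios == 0) then
    read(u, iostat=ios) a, b, c, d
    if (ios /= 0) print *, 'rank ', Rank, ': read failed, iostat = ', ios
    close(u)
  end if

  call MPI_FINALIZE( MyError )
end program testParaReadStat
```

The printed iostat value should correspond to the number in the forrtl message (98 in the trace above), which would at least confirm whether the OPEN or the READ is the statement that runs out of buffer memory.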
The error has shown up on several combinations of:
* kernel 2.6.9-55.ELsmp, 2.6.9-55.0.6ELsmp, 2.6.20.20
* OFED-1.2, OFED-1.2.5.1
* MVAPICH-0.9.9, MVAPICH2-0.9.8, MVAPICH2-1.0
All tests use the Intel ifort compiler, and the code was simply built
with "mpif90".
Why do I think this is an MVAPICH problem? The error DID NOT occur when
using MVAPICH-0.9.8 with Shared Receive Queue disabled!
We disabled SRQ with the following simple change:
# diff mvapich-0.9.8_clean/mpid/ch_gen2/viaparam.h \
       mvapich-0.9.8_single_rail_intel_9.1/mpid/ch_gen2/viaparam.h
50a51
> #if 0
53a55
> #endif
I have not yet figured out how to disable SRQ in MVAPICH2.
Initial testing with linux-2.6.20.20, OFED-1.2.5.1, and MVAPICH2-1.0
seemed to raise the number of MPI tasks necessary to trigger the problem
from roughly 36 up to 65.
One last note: I ported the second Fortran program to C to try to duplicate
the error there. However, it ran to completion cleanly on 256 cores.
So perhaps the problem is specific to the Fortran libraries.
So, now the questions:
1) Can anyone duplicate our problem with the above code?
2) Does the code violate MPI standards or exceed MVAPICH limitations?
3) Is there a change to the MPI stack or runtime environment that will
avoid the problem?
4) Is there a *simple* change that can be made to the user code to avoid
the problem?
5) How do I disable SRQ in MVAPICH2 to see if that helps at all?
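To make question 4 concrete, one candidate is the usual pattern of reading on a single rank and broadcasting the result. The following is a minimal, untested sketch of that idea applied to the test case above. The program name bcastParaRead is hypothetical, and whether this actually avoids the error on the affected systems is an assumption, not something that has been verified.

```fortran
program bcastParaRead
  implicit none
  include 'mpif.h'
  integer :: Rank, numprocs, MyError
  integer, parameter :: size = 100000
  integer, parameter :: u = 101
  real :: a(size), b(size), c(size), d(size)

  call MPI_INIT( MyError )
  call MPI_COMM_RANK( MPI_COMM_WORLD, Rank, MyError )
  call MPI_COMM_SIZE( MPI_COMM_WORLD, numprocs, MyError )

  ! Only rank 0 opens and reads the file
  if (Rank == 0) then
    open(u, file="ParaData", form='unformatted')
    read(u) a, b, c, d
    close(u)
  end if

  ! All other ranks receive the data over MPI instead of the filesystem
  call MPI_BCAST( a, size, MPI_REAL, 0, MPI_COMM_WORLD, MyError )
  call MPI_BCAST( b, size, MPI_REAL, 0, MPI_COMM_WORLD, MyError )
  call MPI_BCAST( c, size, MPI_REAL, 0, MPI_COMM_WORLD, MyError )
  call MPI_BCAST( d, size, MPI_REAL, 0, MPI_COMM_WORLD, MyError )

  print "('output',1i5,4f10.5)", Rank, a(size), b(size), c(size), d(size)
  call MPI_FINALIZE( MyError )
end program bcastParaRead
```

This changes every read site in the application, so it may not count as *simple* for codes with many input files, which is why a fix in the MPI stack or runtime environment would still be preferable.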
Thanks for your help,
Nathan