[mvapich-discuss] parallel file read -> "cannot allocate memory for the file buffer"

Nathan Dauchy Nathan.Dauchy at noaa.gov
Thu Oct 18 17:47:49 EDT 2007


Abhinav Vishnu wrote:
> From MVAPICH 0.9.9 onwards, we have converted almost all features
> as run-time variable. With this support, you will not need to change
> the MPI source for enabling/disabling features. For MVAPICH, the SRQ
> usage variable can be controlled by VIADEV_USE_SRQ:
> 
> http://mvapich.cse.ohio-state.edu/support/mvapich_user_guide.html#x1-1100009.4.1
> 
>> I have not yet figured out how to disable SRQ in MVAPICH2.
>>   
> In MVAPICH2, the usage of SRQ can be controlled by using MV2_USE_SRQ.
> Details with respect to this variable are present here:
> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2.html#x1-12000010.46
> 

Thanks, those pointers are very helpful, and certainly preferable to
modifying the source code.

>>
>> 1) Can anyone duplicate our problem with the above code?
>>   
> We are taking a look at it.
>> 2) Does the code violate MPI standards or exceed MVAPICH limitations?
>>   
> From MVAPICH/MVAPICH2 perspective, there should be no limitations IMO.
>> 3) Is there a change to the MPI stack or runtime environment that will
>> avoid the problem?
>>   
> As you have mentioned earlier, disabling SRQ seems to solve the problem
> for you. Unfortunately, at this point, we do not have much insight with
> respect to the root cause of the problem. Please let us know the outcome
> of your experimentation by disabling SRQ for MVAPICH/MVAPICH2.
> 

It may be that the problem was introduced with MVAPICH-0.9.9, not
necessarily with SRQ.  We have not run this test on MVAPICH-0.9.8 WITH
SRQ.  I can build and test that, but it may take a while.

In the meantime, I tested the following:

	* kernel-2.6.9-55.ELsmp, OFED-1.2, MVAPICH-0.9.9
		mpirun_rsh -paramfile mpirun.params
	  where mpirun.params includes "VIADEV_USE_SRQ=0"

	* kernel-2.6.20.20, OFED-1.2.5.1, MVAPICH2-1.0
		mpiexec -env MV2_USE_SRQ 0

I got the "forrtl: severe (98): cannot allocate memory for the file
buffer" error in both cases, so disabling SRQ at run time did not avoid
the problem in either configuration.
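
One way to confirm that the SRQ settings actually reached the ranks is
to launch a trivial script in place of the application.  The sketch
below is only an illustration: report_srq_env and the file name
check_srq_env.sh are made-up names, and it assumes the launcher hands
these settings to each process as environment variables.

```shell
#!/bin/sh
# check_srq_env.sh (hypothetical name): show the SRQ-related settings as
# seen from inside a launched process.

# Print NAME=value for each variable, or NAME=unset if it is not set.
report_srq_env() {
    for var in VIADEV_USE_SRQ MV2_USE_SRQ; do
        # Indirect lookup of the variable named in $var.
        eval "val=\${$var-unset}"
        echo "$var=$val"
    done
}

# Prefix the report with the node name so per-node output can be told apart.
echo "node: $(uname -n)"
report_srq_env
```

For the MVAPICH2 case this could be started as
"mpiexec -env MV2_USE_SRQ 0 ./check_srq_env.sh", with whatever
process-count and host options the real job uses.  A rank that prints
"MV2_USE_SRQ=unset" would mean the setting never arrived.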

Thanks,
Nathan


