[mvapich-discuss] Output large files using MPI-IO

Sayantan Sur surs at cse.ohio-state.edu
Thu Mar 4 09:13:39 EST 2010


Hi Madhu,

Thanks for the report. Would it be possible for you to post the
snippet of code, so that we may reproduce the error at our end?

The problem seems to be related to ROMIO. We were
wondering if you could try MVAPICH2-1.4 on TACC to see if the problem
has been fixed in more recent versions. There is a 'hecura' module on
Ranger that has MVAPICH2-1.4. Does the bug show up even with the
hecura module?

http://www.tacc.utexas.edu/software_modules.php?app=softwareModule&isTest=&machine=Ranger#version

Thanks.

On Wed, Mar 3, 2010 at 7:16 PM, Madhusudan Pai <mpai at stanford.edu> wrote:
> Hello,
>
> This is probably a novice MPI question but I can't seem to figure out the
> reason behind an error I get when I try to output a large array on Ranger. I
> have created a snippet of my code that can reproduce this error, although I
> have pasted only portions here. I can post the entire code (about 117 lines)
> if needed.
>
> Essentially, I use MPI_TYPE_INDEXED to create a view, then I use
> MPI_FILE_SET_VIEW and MPI_FILE_WRITE_ALL to output my file.
>
>  call MPI_TYPE_INDEXED(ncells,blocklength,map,MPI_INTEGER,fileview,ierr)
>  call MPI_TYPE_COMMIT(fileview,ierr)
>
> blocklength and map are 1-d arrays of size ncells, and ncells, blocklength
> and map are declared as integer (kind=4).
>
> Then I set the view and output an array called hexa as
>
>  disp = 0
>  call MPI_FILE_SET_VIEW(iunit,disp,MPI_INTEGER,fileview,"native",mpi_info,ierr)
>  call MPI_FILE_WRITE_ALL(iunit,hexa,ncells,MPI_INTEGER,status,ierr)
>
> where hexa is a 1-d array of size ncells. The array hexa contains the global
> node numbering of my mesh (so the last cell contains a value of order
> (ncells*nproc)).
>
> For small problem sizes the code works just fine. But the problem arises
> when ncells is close to the integer*4 limit. And since the array map is a
> function of hexa (specifically, map = hexa * 8), the entries of map also
> cross the integer*4 limit. The routine stalls at MPI_FILE_WRITE_ALL with the
> error "*io Invalid argument**io Invalid argument**io Invalid argument**io...
> " on several processes.
>
> 1) I can't seem to figure out which "argument" is causing this error.
>
> 2) I also changed the type declaration of map and ncells to integer
> (kind=8), but this did not seem to correct the problem. I have also tried
> with MPI_INTEGER8 in the WRITE_ALL routine.
>
> I am using mvapich 1.0.1 and intel 10.1 fortran for compilation.
>
> Any help greatly appreciated!
>
> Thanks,
> Madhu Pai
> Stanford University
>
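
For what it's worth, the overflow described in the quoted message can be checked with plain integer arithmetic. The following is only a sketch in Python (not the original Fortran). The value of hexa_last is a made-up example, not taken from the actual run, and the helper names are hypothetical. It shows that map = hexa * 8 leaves the integer*4 range well before hexa itself does.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1  # largest value an integer(kind=4) can hold


def fits_int32(value):
    """Return True if value is representable as a signed 32-bit integer."""
    return INT32_MIN <= value <= INT32_MAX


def wrap_int32(value):
    """Return what a signed 32-bit variable would hold after wraparound."""
    return (value - INT32_MIN) % 2**32 + INT32_MIN


# Hypothetical global node number of the last cell (an assumption for
# illustration only); it is still well inside the integer*4 range.
hexa_last = 300_000_000

# Displacement as described in the original post: map = hexa * 8.
map_last = hexa_last * 8

print("hexa fits in 32 bits:", fits_int32(hexa_last))
print("map  fits in 32 bits:", fits_int32(map_last))
print("map after 32-bit wraparound:", wrap_int32(map_last))

# Largest hexa value whose displacement hexa * 8 still fits in 32 bits.
max_safe_hexa = INT32_MAX // 8
print("largest safe hexa:", max_safe_hexa)
```

Note also that MPI_TYPE_INDEXED takes its displacements as default-kind integers, so declaring map as integer (kind=8) does not by itself widen what the routine receives. MPI_TYPE_CREATE_HINDEXED accepts byte displacements of kind MPI_ADDRESS_KIND instead. Whether switching to it avoids the error with this particular MPI version has not been verified here.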



-- 
Sayantan Sur

Research Scientist
Department of Computer Science
The Ohio State University.


