[mvapich-discuss] MPI-IO Inconsistency over Lustre using MVAPICH

Weikuan Yu weikuan.yu at gmail.com
Wed Mar 4 14:21:40 EST 2009


I can't reproduce the error using 2-48 processes with mvapich2-1.2p1.
Adding the prefix "lustre:" or "ufs:" to the file name produced the same
correct results.
The system I used has the following Lustre configuration.
lustre: 1.6.4
kernel: 47
build:
1.6.4-19691231180000-PRISTINE-.usr.src.linux-2.6.9-55.0.9.EL_lustre.1.6.4-2.6.9-55.0.9.EL_lustre.1.6.4custom
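
For anyone repeating the prefix test, here is a minimal sketch of how the
prefixed file name could be built in C. The helper name adio_prefixed_name
and the path used below are hypothetical, chosen only for illustration; the
only thing taken from this thread is that a "ufs:" or "lustre:" prefix on
the file name selects the corresponding ADIO driver.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of MVAPICH or ROMIO): writes
 * "<prefix>:<path>" into out. Returns 0 on success, or -1 if an
 * argument is invalid or the result does not fit in out_size bytes. */
int adio_prefixed_name(const char *prefix, const char *path,
                       char *out, size_t out_size)
{
    if (prefix == NULL || path == NULL || out == NULL || out_size == 0)
        return -1;

    /* snprintf returns the length it wanted to write, so a value
     * >= out_size means the name was truncated. */
    int n = snprintf(out, out_size, "%s:%s", prefix, path);
    if (n < 0 || (size_t)n >= out_size)
        return -1;

    return 0;
}
```

The resulting string would then be passed as the file name argument of
every MPI_File_open call in the test, e.g. "ufs:/mnt/lustre/out.dat"
(an assumed path).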

On the same system, I also tried mvapich-1.1. There was no error with
the Lustre ADIO driver.

Nathan, let me know if you need more help. You might want to check
or update your Lustre configuration.

Thanks,
--Weikuan

Robert Latham wrote:
> On Wed, Mar 04, 2009 at 09:50:16AM -0600, Rajeev Thakur wrote:
>> Nathan,
>>        Can you check if it works if you add the prefix "ufs:" to the file
>> name in all opens?
> 
> Two things come to my eye:
> - any chance your site will upgrade to lustre 1.6?  1.4 should perform
>   *correctly* but very very slowly for you
> - You have no error checking for any of your MPI_FILE routines.
>   perhaps you omitted them for the sake of this example, but there's a
>   chance that the MPI_FILE_* routine is trying to tell you what went
>   wrong. 
> 
>   Compare the 'ierr' result with MPI_SUCCESS.  if not equal, convert
>   the error code to something human readable with MPI_ERROR_STRING 
>   http://www.mpi-forum.org/docs/mpi21-report-bw/node186.htm#Node186
> 
> Thanks
> ==rob
> 



