[mvapich-discuss] open non-existing files on Lustre expecting MPI_ERR_NO_SUCH_FILE but got MPI_SUCCESS

Hari Subramoni subramoni.1 at osu.edu
Tue Feb 14 10:19:33 EST 2017


Hi Mark,

Glad to hear that things are working as expected now. The fix will be
available in the next release of MVAPICH2.

Best Regards,
Hari.

On Feb 14, 2017 10:16 AM, "Mark Dixon" <m.c.dixon at leeds.ac.uk> wrote:

> Hi all,
>
> Have rebuilt mvapich2 with the patch to ad_lustre_open.c and retested with
> Wei-keng's test program. Things look much better now, thanks very much! :)
>
> $ mpicc -o test_open_no_such_file test_open_no_such_file.c
>
> $ mpiexec -n 1 ./test_open_no_such_file /tmp/non-exist-file-on-ext4
> non-exiting file "/tmp/non-exist-file-on-ext4"
> MPI error string: File does not exist, error stack:
> ADIOI_UFS_OPEN(69): File /tmp/non-exist-file-on-ext4 does not exist
> Error class = MPI_ERR_NO_SUCH_FILE
>
> $ mpiexec -n 1 ./test_open_no_such_file /nobackup/non-exist-file-on-lustre
> non-exiting file "/nobackup/non-exist-file-on-lustre"
> MPI error string: File does not exist, error stack:
> ADIOI_LUSTRE_OPEN(69): File /nobackup/non-exist-file-on-lustre does not
> exist
> Error class = MPI_ERR_NO_SUCH_FILE
>
> All the best,
>
> Mark
>
>
> On Fri, 10 Feb 2017, Mark Dixon wrote:
>
>> Hi all,
>>
>> Yes, will test - really sorry for the delay, am out of the office most of
>> this week.
>>
>> Cheers,
>>
>> Mark
>>
>> On Thu, 9 Feb 2017, Wei-keng Liao wrote:
>>
>>>  Hi, Hari
>>>
>>>  I do not have access to a machine with InfiniBand and Lustre.
>>>  I have forwarded your patch to Mark Dixon and he will give it a try
>>>  and get back to you soon. Please keep him in the loop. Thanks.
>>>
>>>
>>>  Wei-keng
>>>
>>>  On Feb 7, 2017, at 4:09 PM, Hari Subramoni wrote:
>>>
>>> > Hello Wei-keng,
>>> >
>>> > Can you please try the attached patch and see if things work for you?
>>> >
>>> > This patch will be available by default with future releases of the
>>> > MVAPICH2 library.
>>> >
>>> > Best Regards,
>>> > Hari.
>>> >
>>> > On Tue, Feb 7, 2017 at 4:50 AM, Hari Subramoni <subramoni.1 at osu.edu>
>>> > wrote:
>>> >
>>> > Hello Wei-keng,
>>> >
>>> > Many thanks for the report. We will take a look at it and get back to
>>> > you soon.
>>> >
>>> > Best Regards,
>>> > Hari.
>>> >
>>> > On Feb 7, 2017 1:12 AM, "Wei-keng Liao" <wkliao at eecs.northwestern.edu>
>>> > wrote:
>>> >
>>> > Hi,
>>> >
>>> > I am a developer of the PnetCDF library and would like to file a bug
>>> > report. Mark Dixon, a PnetCDF user cc-ed in this email, reported an
>>> > error when building PnetCDF with mvapich2-2.2. The root cause is that
>>> > mvapich2-2.2 fails to return the expected MPI error class
>>> > MPI_ERR_NO_SUCH_FILE when a test program tries to open a non-existing
>>> > file on Lustre.
>>> >
>>> > After some digging, I found that the Lustre driver in mvapich2 appends
>>> > the O_CREAT flag to the mode argument of the open call (line 50 in file
>>> > src/mpi/romio/adio/ad_lustre/ad_lustre_open.c). Because of that, the
>>> > file is mistakenly created and MPI_SUCCESS is returned, which is an
>>> > unexpected outcome under the MPI standard.
>>> >
>>> > Attached is a small MPI program to reproduce the error. Mark has used
>>> > it to verify the issue described above. He also ran it against an ext4
>>> > (UFS) file system and the correct error class was returned. So the
>>> > problem is specific to the Lustre driver.
>>> >
>>> > Please see the discussion email thread:
>>> > http://lists.mcs.anl.gov/pipermail/parallel-netcdf/2017-February/001897.html
>>> >
>>> > Wei-keng
>>> >
>>> > _______________________________________________
>>> > mvapich-discuss mailing list
>>> > mvapich-discuss at cse.ohio-state.edu
>>> > http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>> >
>>> > <lustre_open.patch>
>>>
>>>
>>>
>> --
>> -------------------------------------------------------------------
>> Mark Dixon                         Email    : m.c.dixon at leeds.ac.uk
>> Advanced Research Computing (ARC)  Tel (int): 35429
>> IT Services building               Tel (ext): +44(0)113 343 5429
>> University of Leeds, LS2 9JT, UK
>> -------------------------------------------------------------------
>>
>>
> --
> -------------------------------------------------------------------
> Mark Dixon                         Email    : m.c.dixon at leeds.ac.uk
> Advanced Research Computing (ARC)  Tel (int): 35429
> IT Services building               Tel (ext): +44(0)113 343 5429
> University of Leeds, LS2 9JT, UK
> -------------------------------------------------------------------
>