[mvapich-discuss] MPI_File_read_at_all external32 bug in MV2-2.1
Adam T. Moody
moody20 at llnl.gov
Fri Oct 30 14:48:35 EDT 2015
Hello MVAPICH team,
I've hit a bug in MPI_File_read_at_all in MVAPICH2-2.1. I have an
application that reads and writes files in external32 format. It writes
the file just fine, but it throws the following error when reading the
file back:
internal ABORT - process 0
srun: error: rzmerl2: task 0: Exited with exit code 1
Assertion failed in file src/mpid/common/datatype/mpid_ext32_segment.c
at line 277: FALSE
memcpy argument memory ranges overlap, dst_=0x2aaab5c160a8
src_=0x2aaab5c160a8 len_=176
Above, MPI is trying to do a memcpy where the source and destination
buffers have the same address. Looking through the code for MVAPICH2-2.1,
the problem seems to be at line 132 in src/mpi/romio/mpi-io/read_all.c:
    if (e32_buf != NULL) {
        error_code = MPIU_read_external32_conversion_fn(xbuf, datatype,
                                                        count, e32_buf);
        ADIOI_Free(e32_buf);
    }
I think the fix is to change "xbuf" above to "buf" as it is in read.c below:
    if (e32_buf != NULL) {
        error_code = MPIU_read_external32_conversion_fn(buf, datatype,
                                                        count, e32_buf);
        ADIOI_Free(e32_buf);
    }
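In external32 mode, xbuf == e32_buf, which acts as a temporary buffer
into which the data is read. The code is then meant to unpack and
convert the data from that temporary buffer into the user buffer at buf.
Passing xbuf as the destination makes the source and destination
identical, which triggers the memcpy overlap assertion above.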
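
To illustrate the aliasing, here is a minimal standalone sketch. It is not
the ROMIO code: unpack_be32, read_external32_sketch and the use_xbuf flag
are names I made up for illustration, a memcpy stands in for the actual
file read, and it only handles 32-bit integers (which external32 stores
big-endian).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the conversion routine: decode `count`
 * big-endian 32-bit values from `src` into native order at `dst`.
 * Returns 0 on success, -1 if the two ranges overlap (the real code
 * trips an assertion instead of returning an error). */
static int unpack_be32(void *dst, const unsigned char *src, size_t count)
{
    size_t len = count * 4;
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;
    unsigned char *out = dst;

    if (d < s + len && s < d + len)
        return -1;                      /* source and destination overlap */

    for (size_t i = 0; i < count; i++) {
        uint32_t v = ((uint32_t)src[4 * i]     << 24) |
                     ((uint32_t)src[4 * i + 1] << 16) |
                     ((uint32_t)src[4 * i + 2] <<  8) |
                      (uint32_t)src[4 * i + 3];
        memcpy(out + 4 * i, &v, sizeof v);
    }
    return 0;
}

/* Sketch of the read path: e32_buf is the temporary buffer, xbuf is an
 * alias for it, and buf is the user buffer. With use_xbuf != 0 the
 * conversion is given xbuf as its destination, as in read_all.c;
 * with use_xbuf == 0 it is given buf, as in read.c. */
static int read_external32_sketch(int32_t *buf,
                                  const unsigned char *file_bytes,
                                  size_t count, int use_xbuf)
{
    assert(buf != NULL && count > 0);

    unsigned char *e32_buf = malloc(count * 4);
    if (e32_buf == NULL)
        return -2;
    void *xbuf = e32_buf;               /* xbuf == e32_buf in external32 mode */

    memcpy(xbuf, file_bytes, count * 4);    /* stands in for the file read */

    int rc = unpack_be32(use_xbuf ? xbuf : (void *)buf, e32_buf, count);
    free(e32_buf);
    return rc;
}
```

Calling read_external32_sketch with use_xbuf set to 0 fills the user
buffer, while setting it to 1 hands the conversion the same pointer for
source and destination and it reports the overlap.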
It's probably worth checking the other external32 code paths to look for
similar bugs.
Thanks,
-Adam