[mvapich-discuss] Need advice on Error code =12 problem only when
running with MPIIO on lustre
Terrence.LIAO at total.com
Thu Dec 18 10:56:52 EST 2008
Dear Mvapich-discuss,
I have encountered a very strange IBV_WC_RETRY_EXC_ERR (code=12) problem
and need your advice.
This problem only happens when using MPI-IO calls such as
mpi_file_write_all() on Lustre.
We are using OFED 1.4rc3 on CentOS 5.2. The IB hardware is InfiniPath SDR HTX.
Lustre is running version 1.6.5.1 and is mounted with the rw,_netdev flags.
The same code runs fine on standard Ethernet-attached storage, such as
NetApp (i.e. no IB to storage). Also, the code without MPI-IO writes to
Lustre without any problem.
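
For reference, the numeric code in the error message can be mapped back to its ibv_wc_status name. Below is a minimal sketch, assuming the enum ordering declared in libibverbs' <infiniband/verbs.h> (values 0 through 21 only). The helper name wc_status_name is hypothetical and not part of any library. Code 12 means the transport retry counter was exceeded, i.e. the remote side did not acknowledge the request within the configured number of retries.

```python
# Names of the enum ibv_wc_status values in numeric order, following the
# declaration in libibverbs' <infiniband/verbs.h> (assumed ordering, 0..21).
IBV_WC_STATUS_NAMES = [
    "IBV_WC_SUCCESS",             # 0
    "IBV_WC_LOC_LEN_ERR",         # 1
    "IBV_WC_LOC_QP_OP_ERR",       # 2
    "IBV_WC_LOC_EEC_OP_ERR",      # 3
    "IBV_WC_LOC_PROT_ERR",        # 4
    "IBV_WC_WR_FLUSH_ERR",        # 5
    "IBV_WC_MW_BIND_ERR",         # 6
    "IBV_WC_BAD_RESP_ERR",        # 7
    "IBV_WC_LOC_ACCESS_ERR",      # 8
    "IBV_WC_REM_INV_REQ_ERR",     # 9
    "IBV_WC_REM_ACCESS_ERR",      # 10
    "IBV_WC_REM_OP_ERR",          # 11
    "IBV_WC_RETRY_EXC_ERR",       # 12
    "IBV_WC_RNR_RETRY_EXC_ERR",   # 13
    "IBV_WC_LOC_RDD_VIOL_ERR",    # 14
    "IBV_WC_REM_INV_RD_REQ_ERR",  # 15
    "IBV_WC_REM_ABORT_ERR",       # 16
    "IBV_WC_INV_EECN_ERR",        # 17
    "IBV_WC_INV_EEC_STATE_ERR",   # 18
    "IBV_WC_FATAL_ERR",           # 19
    "IBV_WC_RESP_TIMEOUT_ERR",    # 20
    "IBV_WC_GENERAL_ERR",         # 21
]


def wc_status_name(code: int) -> str:
    """Return the ibv_wc_status name for a numeric work-completion code.

    Hypothetical helper for reading log output; codes outside the table
    are reported as unknown rather than guessed.
    """
    if 0 <= code < len(IBV_WC_STATUS_NAMES):
        return IBV_WC_STATUS_NAMES[code]
    return f"unknown status {code}"


if __name__ == "__main__":
    # The code reported in this message.
    print(wc_status_name(12))
```

Distinguishing code 12 from code 13 (receiver-not-ready retries exceeded) can help narrow down whether the peer stopped responding entirely or was merely slow to post receive buffers.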
Thank you very much.
-- Terrence
--------------------------------------------------------
Terrence Liao, Ph.D.
Research Computer Scientist
TOTAL E&P RESEARCH & TECHNOLOGY USA, LLC
1201 Louisiana, Suite 1800, Houston, TX 77002
Tel: 713.647.3498 Fax: 713.647.3638
Email: terrence.liao at total.com