[mvapich-discuss] Need advice on Error code =12 problem only when running with MPIIO on lustre

Terrence.LIAO at total.com
Thu Dec 18 10:56:52 EST 2008


Dear Mvapich-discuss,

I have run into a very strange IBV_WC_RETRY_EXC_ERR (code=12) problem 
and need your advice.
The problem occurs only when using MPI-IO calls such as 
mpi_file_write_all() on Lustre.
We are using OFED 1.4rc3 on CentOS 5.2. The IB hardware is InfiniPath SDR HTX. 
Lustre is version 1.6.5.1, mounted with the rw,_netdev flags.
The same code runs fine on standard Ethernet-attached storage, such as 
NetApp (i.e. no IB to the storage). Also, the same code without MPI-IO has 
no problem writing to Lustre.
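
For reference, the "code=12" in the error is the numeric value of a 
libibverbs work-completion status. Below is a minimal sketch of that mapping, 
assuming the enum order of ibv_wc_status in <infiniband/verbs.h>. The helper 
name wc_status_name is hypothetical, and only the first 14 values are listed.

```python
# Work-completion status names in the order of the ibv_wc_status enum
# from libibverbs (<infiniband/verbs.h>). The list index is the numeric code.
# Only codes 0..13 are listed here; the real enum continues past 13.
IBV_WC_STATUS = [
    "IBV_WC_SUCCESS",            # 0
    "IBV_WC_LOC_LEN_ERR",        # 1
    "IBV_WC_LOC_QP_OP_ERR",      # 2
    "IBV_WC_LOC_EEC_OP_ERR",     # 3
    "IBV_WC_LOC_PROT_ERR",       # 4
    "IBV_WC_WR_FLUSH_ERR",       # 5
    "IBV_WC_MW_BIND_ERR",        # 6
    "IBV_WC_BAD_RESP_ERR",       # 7
    "IBV_WC_LOC_ACCESS_ERR",     # 8
    "IBV_WC_REM_INV_REQ_ERR",    # 9
    "IBV_WC_REM_ACCESS_ERR",     # 10
    "IBV_WC_REM_OP_ERR",         # 11
    "IBV_WC_RETRY_EXC_ERR",      # 12
    "IBV_WC_RNR_RETRY_EXC_ERR",  # 13
]


def wc_status_name(code):
    """Return the symbolic name for a numeric status (hypothetical helper)."""
    if 0 <= code < len(IBV_WC_STATUS):
        return IBV_WC_STATUS[code]
    return "unknown status %d" % code


if __name__ == "__main__":
    print(wc_status_name(12))
```

Code 12 is the transport retry counter being exceeded, i.e. the sender gave 
up after the remote side failed to acknowledge a request.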

Thank you very much.

-- Terrence
--------------------------------------------------------
Terrence Liao, Ph.D.
Research Computer Scientist
TOTAL E&P RESEARCH & TECHNOLOGY USA, LLC
1201 Louisiana, Suite 1800, Houston, TX 77002 
Tel: 713.647.3498  Fax: 713.647.3638
Email: terrence.liao at total.com
