[mvapich-discuss] completion with error 12, vendor code=0x81

aleksander at clustervision.com
Mon Dec 20 13:31:44 EST 2010


Hi all,

I am getting the following error from mvapich2:

[0<-131] recv desc error, wc_opcode=128
[0->131] wc.status=12, wc.wr_id=0x7cbcb80, wc.opcode=128, vbuf->phead->type=0 = MPIDI_CH3_PKT_EAGER_SEND
[221] Abort: [] Got completion with error 12, vendor code=0x81, dest rank=131 at line 607 in file ibv_channel_manager.c

As far as I know, error 12 is a timeout.
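
For reference, the numeric wc.status can be mapped back to its name in the ibv_wc_status enum from <infiniband/verbs.h>, where 12 is IBV_WC_RETRY_EXC_ERR (transport retry counter exceeded). The snippet below is only a minimal sketch: the helper name decode_wc_status is made up for illustration, and the table lists just a few of the enum values.

```python
import re

# Partial copy of the ibv_wc_status enum from <infiniband/verbs.h>.
# Only a few entries are listed; this is not the full enum.
WC_STATUS = {
    0: "IBV_WC_SUCCESS",
    5: "IBV_WC_WR_FLUSH_ERR",
    12: "IBV_WC_RETRY_EXC_ERR",      # transport retry counter exceeded
    13: "IBV_WC_RNR_RETRY_EXC_ERR",  # receiver-not-ready retry counter exceeded
}


def decode_wc_status(log_line):
    """Hypothetical helper: extract wc.status=N from a log line and name it."""
    match = re.search(r"wc\.status=(\d+)", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    return WC_STATUS.get(code, f"unknown status {code}")


line = "[0->131] wc.status=12, wc.wr_id=0x7cbcb80, wc.opcode=128,"
print(decode_wc_status(line))
```
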

mvapich2 version: 1.5.1p1
OFED: 1.5.1-mlnx9
HCA: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE]
Nodes: Westmere (12 cores per node); 512 cores used for the run
Transport: OFA-IB-CH3

I had a similar error from Intel MPI used with DAPL and defining the 
following environment variables has solved the problem:
	setenv DAPL_ACK_RETRY 7         # IB RC Ack retry count
	setenv DAPL_ACK_TIMER 20        # IB RC Ack retry timer
The above are taken from the DAPL release notes under "settings for 
large clusters".
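
To give a feel for what these values mean: the InfiniBand RC ACK timeout is encoded as an exponent, with a nominal timeout of 4.096 us * 2^timer. The sketch below assumes DAPL passes DAPL_ACK_TIMER straight through as that exponent and DAPL_ACK_RETRY as the retry count; the function names are illustrative, and the total is only a rough estimate since the actual timeout on the HCA may be longer than the nominal value.

```python
def ack_timeout_seconds(timer):
    """Nominal IB RC ACK timeout for a given exponent: 4.096 us * 2^timer."""
    return 4.096e-6 * (2 ** timer)


def approx_total_seconds(timer, retries):
    """Rough time spent across `retries` timeouts before the QP gives up."""
    return ack_timeout_seconds(timer) * retries


# Values from the settings above (assumed to map directly to IB QP attributes).
per_try = ack_timeout_seconds(20)
total = approx_total_seconds(20, 7)
print(f"per-attempt timeout: {per_try:.2f} s, approx. total: {total:.1f} s")
```

With timer 20 each attempt waits roughly 4.3 seconds, so seven retries add up to about half a minute before a completion with error 12 is reported.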

What would be the equivalent settings for OFA under mvapich2 and where 
to set them?

Best regards,
Aleksander



