[mvapich-discuss] rdma_iba_priv.c + Error posting recv
Vishwas Vasisht
vvasisht at locuz.com
Mon Nov 6 07:44:33 EST 2006
Hi,
I have a 65-node Opteron cluster with a total of 260 cores (64 compute nodes + 1 master, each with two dual-core processors).
I was trying to submit jobs (cpi, jobfarming, ...) with -np greater than 260. This worked up to -np 300, but with -np above 300 I get the following error several times:
--------------------------------------------------------------------------
[rdma_iba_priv.c:406] error(-236): Error posting recv!
rank 12 in job 7 masternode_33851 caused collective abort of all ranks
exit status of rank 12: killed by signal 9
--------------------------------------------------------------------------
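To make the numbers concrete, here is a small sketch of the arithmetic behind the core count and the oversubscription at -np 300. The constant and function names are hypothetical and only for illustration; they are not part of MVAPICH or of the jobs I ran.

```python
# Cluster layout as described in this post: 64 compute nodes + 1 master,
# each node with two dual-core processors.
# All names below are illustrative, not part of any MPI library.
NODES = 64 + 1
PROCESSORS_PER_NODE = 2
CORES_PER_PROCESSOR = 2


def total_cores(nodes=NODES,
                processors=PROCESSORS_PER_NODE,
                cores=CORES_PER_PROCESSOR):
    """Return the number of physical cores in the cluster."""
    return nodes * processors * cores


def ranks_per_core(np_ranks, cores):
    """Return MPI ranks per core; a value above 1.0 means oversubscription."""
    return np_ranks / cores


if __name__ == "__main__":
    cores = total_cores()
    # 260 fills the cluster exactly; 300 is the largest -np that still worked.
    for np_ranks in (260, 300):
        ratio = ranks_per_core(np_ranks, cores)
        print(f"-np {np_ranks}: {ratio:.2f} ranks per core on {cores} cores")
```

So any -np above 260 already places more than one rank on some cores, and the failures start only once -np goes past 300.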
Can you please help me sort this out?
Regards
Vishwas