[mvapich-discuss] vbuf problem
David Race
drace at appro.com
Mon Sep 15 19:45:43 EDT 2008
Hello,
We are using mvapich2-1.2rc2 on a system that has four Mellanox DDR interfaces and 16 CPUs in each computer. When we define
MV2_NUM_HCAS=4
we get a failure at line 230 of vbuf.c, which points to the following code:
    for (i = 0; i < rdma_num_hcas; ++i)
    {
        reg->mem_handle[i] = ibv_reg_mr(
            ptag_save[i],
            vbuf_dma_buffer,
            nvbufs * rdma_vbuf_total_size,
            IBV_ACCESS_LOCAL_WRITE | IBV_ACCESS_REMOTE_WRITE);
        if (!reg->mem_handle[i])
        {
            fprintf(stderr, "[%s %d] Cannot register vbuf region\n", __FILE__, __LINE__);
            return -1;
        }
    }
We see this failure with as few as 289 processors. Has anyone run across this problem before? Is there a suggested set of environment variables that might help prevent the failure?
Thanks
David Race, Ph.D.
Principal Engineer
Appro International, Inc.
25003 Pitkin Road, Suite F600
Spring, TX 77386
Phone: 469-212-4860
Email: drace at appro.com