[mvapich-discuss] On "Got Completion" and IBV_EVENT Errors
Joshua Bernstein
jbernstein at penguincomputing.com
Thu Jan 31 18:55:12 EST 2008
Thank you for your response, Matthew.
Matthew Koop wrote:
> Joshua,
>
> So are you able to run `ibv_rc_pingpong' with a variety of message sizes?
> You may want to double-check that the cables between machines are well
> connected as well.
ibv_rc_pingpong seems to work correctly:
[root at flatline ~]# ibv_rc_pingpong -i 2
local address: LID 0x0006, QPN 0x050016, PSN 0x55eeb7
remote address: LID 0x0004, QPN 0x100406, PSN 0x07ccc8
8192000 bytes in 0.04 seconds = 1669.28 Mbit/sec
1000 iters in 0.04 seconds = 39.26 usec/iter
As a side note, it would be nice if there were some description of
what all the ibv_* commands do. For example, there are also
ibv_srq_pingpong and ibv_uc_pingpong. If there is documentation
about this somewhere that I missed, I apologize.
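For what it's worth, here is my rough map of the pingpong examples, inferred from the libibverbs sources rather than any official documentation (so treat the one-liners as my reading, not a reference):

```shell
# Rough map of the libibverbs example programs; each one exercises a
# different InfiniBand transport type (descriptions are my inference
# from the sources, not official documentation):
ibv_examples="
ibv_rc_pingpong  - ping-pong over a Reliable Connection (RC) QP
ibv_uc_pingpong  - ping-pong over an Unreliable Connection (UC) QP
ibv_ud_pingpong  - ping-pong over an Unreliable Datagram (UD) QP
ibv_srq_pingpong - RC ping-pong using a Shared Receive Queue (SRQ)
"
printf '%s\n' "$ibv_examples"
```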
> With the earlier request you cited, the issue didn't occur for simple
> microbenchmarks, only with an application. We have previously seen issues
> when fork or system calls are used in applications (due to
> incompatibilities with the underlying OpenFabrics drivers).
I'm not quite sure I understand the implications of this. Can you
elaborate? I see the same behavior with the supplied osu_* codes as well.
I should have mentioned this earlier, but we are attempting to port a
pmgr_client plugin from the vapi transport to the ch_gen2 transport;
it uses bproc (Scyld) for job startup instead of RSH, and the code
does a fork. So I'd be interested to read your elaboration on this.
Eventually, we (Penguin Computing) hope to be able to contribute this
enhancement upstream.
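If the fork() in our plugin is indeed what trips the OpenFabrics drivers, one thing I plan to try is libibverbs' fork support. As I understand it, setting IBV_FORK_SAFE=1 is equivalent to the application calling ibv_fork_init() before creating any verbs resources (assuming a libibverbs recent enough to honor the variable); a sketch:

```shell
# Ask libibverbs to make fork() safe with respect to registered
# (pinned) memory; equivalent to the application calling
# ibv_fork_init() before it creates any verbs resources.
# (Assumes a libibverbs version that honors IBV_FORK_SAFE.)
export IBV_FORK_SAFE=1

# Then launch the MPI job as usual -- the launch line below is
# illustrative only:
# mpirun -np 2 ./cpi
```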
> It seems that your issue is more likely to be a setup issue. What does
> ulimit -l report on your compute nodes?
It is set to half the available memory on the system, as stated in the
MVAPICH docs.
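For completeness, this is how I'm checking it. MVAPICH registers (pins) buffers with the HCA, so a low memlock limit shows up as registration failures; the limits.conf values below are illustrative, following the docs' suggestion of roughly half of physical memory:

```shell
# Check the locked-memory limit the job actually sees on a compute
# node; a low memlock limit causes memory-registration failures:
ulimit -l

# To raise it persistently, /etc/security/limits.conf entries like
# these are typical (4 GB shown here is illustrative only):
#   * soft memlock 4194304
#   * hard memlock 4194304
```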
> Also, it is unlikely that VIADEV_USE_SHMEM_COLL is causing any issue -- turning off this option
> means there is less communication in the init phase (which allows you to
> get to the stdout statements).
No, no, I agree. In fact, my point was that by using that environment
variable I was able to get the application to run a bit further.
After a bit of playing around, I've gotten the code to run a bit
farther: now when the cpi program does an MPI_Bcast, I get a hang and
my old friend: Got completion with error IBV_WC_RETRY_EXC_ERR.
*Both* processes call MPI_Bcast, but only *one* of them sees a
return from MPI_Bcast (n==100) and subsequently calls MPI_Reduce.
-Joshua Bernstein
Penguin Computing
Software Engineer