[mvapich-discuss] RDMA problem

Mingzhe Li limin at cse.ohio-state.edu
Thu Oct 3 14:42:00 EDT 2013


Hi Luis,

Could you please try the runtime parameter MV2_DEFAULT_PUT_GET_LIST_SIZE?
Could you set it to 300 and rerun your program? If you run your program with
more processes, please increase this parameter accordingly.
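
MVAPICH2 runtime parameters are read from the environment of the MPI
processes, so the suggestion above could be applied roughly as in the sketch
below. It assumes the job is started with the Hydra mpiexec launcher (the
proxy messages in the report suggest Hydra) and passes the variable with
-genv. The executable name ./testcase, the process count of 256 (16 nodes
with 16 cores) and the helper launch_command are placeholders for
illustration, not taken from the attached test case.

```python
import os
import shlex


def launch_command(executable, nprocs, list_size=300):
    """Build the environment and argv for a Hydra mpiexec launch.

    list_size is the value for MV2_DEFAULT_PUT_GET_LIST_SIZE; 300 is the
    starting value suggested in this thread. Raise it for larger runs.
    """
    name = "MV2_DEFAULT_PUT_GET_LIST_SIZE"
    env = dict(os.environ)
    env[name] = str(list_size)  # also exported in the launcher's environment
    argv = [
        "mpiexec", "-n", str(nprocs),
        "-genv", name, str(list_size),  # forward the variable to all ranks
        executable,
    ]
    return env, argv


if __name__ == "__main__":
    # Hypothetical executable name and process count.
    env, argv = launch_command("./testcase", nprocs=256)
    print(shlex.join(argv))
    # To actually start the job:
    #   import subprocess; subprocess.run(argv, env=env, check=True)
```

In a Slurm job script the same effect can usually be had by exporting the
variable before the launch line.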

Mingzhe

> Dear all,
>
> We have a problem using RDMA and have no idea what is going wrong.
>
> We have been able to reproduce the problem in a small test case, which
> is attached to this email. The example runs fine if we run on 13 nodes
> with 16 SandyBridge cores, connected via IB.
>
> If we select 16 nodes, it starts failing with messages like this one:
>
> [proxy:0:2 at ctc059] HYD_pmcd_pmip_control_cmd_cb
> (./pm/pmiserv/pmip_cb.c:913): assert (!closed) failed
> [proxy:0:2 at ctc059] HYDT_dmxu_poll_wait_for_event
> (./tools/demux/demux_poll.c:77): callback returned error status
> [proxy:0:2 at ctc059] main (./pm/pmiserv/pmip.c:206): demux engine error
> waiting for event
>
> on all nodes ...
>
> We would be glad if someone could help. The attached example contains a
> build script, a Makefile, jobs for Slurm, and the results on our machines.
>
> Used mvapich version: mvapich2-1.9b.
>
> Thanks,
> Luis
>
> --
>                              \\\\\\
>                              (-0^0-)
> --------------------------oOO--(_)--OOo-----------------------------
>
>  Luis Kornblueh                           Tel. : +49-40-41173289
>  Max-Planck-Institute for Meteorology     Fax. : +49-40-41173298
>  Bundesstr. 53
>  D-20146 Hamburg                   Email: luis.kornblueh at zmaw.de
>  Federal Republic of Germany
>