[mvapich-discuss] mvapich2-1.2p1 program hang

Matthew Koop koop at cse.ohio-state.edu
Tue May 26 04:53:30 EDT 2009


Ting-jen,

Can you try adding MV2_USE_SRQ=0 when running on a single node? If this
works, then we already have a fix for this issue queued for the next
release of MVAPICH2.
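
In case it helps, here is a rough sketch of how the test run could be
launched. It assumes the mpirun_rsh launcher, which takes NAME=VALUE
pairs between the host list and the executable; the helper name
single_node_launch, the host name "node01" and the program "./hello"
are made up for illustration, so adjust them to your setup. The script
only prints the command line, it does not run anything.

```python
import shlex


def single_node_launch(program, nprocs=8, host="localhost", use_srq=False):
    """Build an mpirun_rsh command line that places every rank on one host.

    Hypothetical helper: the launcher name and argument order are
    assumptions about the local MVAPICH2 installation.
    """
    # Repeating the same host name puts all ranks on a single node.
    hosts = [host] * nprocs
    # MV2_USE_SRQ=0 turns off the shared receive queue for this run.
    env = [f"MV2_USE_SRQ={int(use_srq)}"]
    return ["mpirun_rsh", "-np", str(nprocs), *hosts, *env, program]


if __name__ == "__main__":
    # Print the command so it can be checked before running it by hand.
    print(shlex.join(single_node_launch("./hello", nprocs=4, host="node01")))
```

Since the hang is intermittent, it is worth running the job several
times with and without the setting before concluding anything.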

Matt


On Tue, 26 May 2009, Ting-jen Yen wrote:

>
> I have a problem when running MPI programs using mvapich2-1.2p1.
>
> A program runs fine when it uses more than one node.
> However, if it uses only one node (that is, 4 or 8 CPU cores,
> for example), it sometimes stops at MPI_Finalize() and hangs
> there forever.  I have tried various programs, even the "hello world"
> example, with similar results. (This does not always happen. Sometimes
> the programs run and finish just fine.)
>
> The system is a cluster of more than 80 IBM blades, each with 2 quad-core
> Xeon E5355 processors (that is, 8 CPU cores per node).  The OS is RHEL 4
> update 4, and the InfiniBand driver used is OFED 1.3.
>
> mvapich2 was compiled using the Intel compiler, with no explicit RDMA
> option, which should mean "gen2" by default.
>
> Any idea what could cause this problem?
>
> Thanks,
> Ting-jen
>


