[mvapich-discuss] mvapich2-1.2p1 program hang

Ting-jen Yen yentj at infowrap.com.tw
Tue May 26 04:03:49 EDT 2009


I have a problem when running MPI programs with mvapich2-1.2p1.

A program runs fine when it uses more than one node.  However, when it
uses only one node (for example, 4 or 8 CPU cores), it sometimes stops
at MPI_Finalize() and hangs there forever.  I have tried various
programs, even the "hello world" example, with similar results.  (This
does not always happen; sometimes the programs run and finish just
fine.)
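
Because the hang is intermittent, it may help to measure how often it
occurs.  The following is only a rough sketch in Python, not part of
mvapich2: the function name count_hangs and the timeout value are my
own assumptions, and the launch command is passed in unchanged, so it
can be whatever launcher line is normally used on one node.

```python
import subprocess
import sys


def count_hangs(cmd, runs=20, timeout_s=60):
    """Run cmd `runs` times and return how many runs exceeded timeout_s.

    cmd is a list of arguments, i.e. the usual single-node launch line
    split into words.  A run that does not exit within timeout_s
    seconds is counted as a hang (an assumption: the limit must be
    well above the normal run time of the program).
    """
    hangs = 0
    for _ in range(runs):
        try:
            # Output is discarded; only whether the run returns matters.
            subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the launcher process it started
            # before raising; ranks started by the launcher may be left
            # behind and need to be cleaned up by hand.
            hangs += 1
    return hangs


# Example use (the launch line is a placeholder, not a tested command):
#   n = count_hangs(["<launcher>", "<launcher options>", "./hello"])
#   print(n, "of 20 runs did not finish")
```

Counting over a fixed number of runs gives a figure that can be
compared between 4-core and 8-core runs on the same node.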

The system is a cluster of more than 80 IBM blades, each with two
quad-core Xeon E5355 processors (that is, 8 CPU cores per node).  The
OS is RHEL 4 update 4, and the InfiniBand driver is OFED 1.3.

mvapich2 was compiled with the Intel compiler, with no explicit RDMA
option, which should mean "gen2" by default.

Any idea what could cause this problem?

Thanks,
Ting-jen



