[mvapich-discuss] program hanged using mvapich with large number
of processes
Dhabaleswar Panda
panda at cse.ohio-state.edu
Fri Jan 22 16:52:04 EST 2010
Can you tell us which MVAPICH2 version you are using? Also, can you tell us
which IB (InfiniBand) adapter type is used in your system?
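
One way to collect both pieces of information is sketched below. It assumes that the `mpiname` utility shipped with MVAPICH2 and the `ibv_devinfo` tool from the InfiniBand userspace packages are installed and on the PATH; either may be missing or behave differently on a given installation. The helper name `report` is hypothetical and exists only for this illustration.

```shell
#!/bin/sh
# Hypothetical helper: run a diagnostic command if it is installed,
# otherwise say that it was not found. Usage: report LABEL COMMAND [ARGS...]
report() {
    label=$1
    shift
    if command -v "$1" >/dev/null 2>&1; then
        echo "== $label =="
        # Show the tool's own output; note a failure instead of aborting.
        "$@" 2>&1 || echo "(command exited with an error)"
    else
        echo "== $label: '$1' not found in PATH =="
    fi
}

# Assumption: 'mpiname -a' prints the MVAPICH2 version and build options.
report "MVAPICH2 version" mpiname -a
# Assumption: 'ibv_devinfo' lists the installed IB adapters (HCAs).
report "IB adapter" ibv_devinfo
```

Running this on each node named in the hostfile and posting the output to the list should cover both questions.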
Thanks,
DK
On Fri, 22 Jan 2010, Weimin Wang wrote:
> Hello, list,
>
> I have a strange problem with MVAPICH2. When I run the cpi example with a
> small number of processes, it works:
>
> wmwang@node32:~/test> mpirun_rsh -ssh -np 2 -hostfile ./ma ./cpi
> Process 0 on node32
> Process 1 on node32
> pi is approximately 3.1416009869231241, Error is 0.0000083333333309
> wall clock time = 0.000174
>
> wmwang@node32:~/test> mpirun_rsh -ssh -np 10 -hostfile ./ma ./cpi
> Process 8 on node33
> pi is approximately 3.1416009869231249, Error is 0.0000083333333318
> wall clock time = 0.000127
> Process 1 on node32
> Process 3 on node32
> Process 0 on node32
> Process 4 on node32
> Process 2 on node32
> Process 6 on node32
> Process 5 on node32
> Process 7 on node32
> Process 9 on node33
>
> However, when I run cpi with a larger number of processes, the program
> hangs with no output:
>
> wmwang@node32:~/test> mpirun_rsh -ssh -np 18 -hostfile ./ma ./cpi
>
> The top command on node32 shows:
>
>   PID USER    PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+ COMMAND
> 14507 wmwang  15   0 60336  50m  676 S   56  0.2  0:03.86 mpispawn
>
> The system I am using is:
>
> wmwang@node33:~> uname -a
> Linux node33 2.6.16.60-0.42.4_lustre.1.8.1.1-smp #1 SMP Fri Aug 14 08:33:26 MDT 2009 x86_64 x86_64 x86_64 GNU/Linux
>
> The compiler is PGI v10.0.
>
> Could you please give me any hints about this problem?
>