[mvapich-discuss] problems executing jobs with higher process counts

Sangamesh B forum.san at gmail.com
Mon Aug 18 02:36:24 EDT 2008


 Dear all,

Problem No 1:

Application: GROMACS 3.3.3

Parallel Library: MVAPICH2-1.0.3

Compilers: Intel C++ and Fortran 10

  A parallel GROMACS 3.3.3 (C application) job on 32 cores runs successfully on a
Rocks 4.3, 33-node cluster (dual-processor, quad-core Intel Xeon nodes: 264 cores
in total).

But if I submit the same job with 64 or more processes, it exits without doing
anything.

These are my command lines:

grompp_mpi -np 64 -f run.mdp -p topol.top -c pr.gro -o run.tpr
mpirun -machinefile ./machfile1 -np 64 mdrun_mpi -v -deffnm run
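
To check whether the MPI launch itself is the problem rather than GROMACS, I
suppose a non-MPI command could be started with the same machinefile and
process count, and the locked-memory limit checked on every node (hostname is
just a stand-in command, and the loop assumes passwordless ssh to the nodes; a
low "max locked memory" limit on some node is a common cause of InfiniBand jobs
failing only at larger process counts):

# Does a plain 64-process launch work at all?
mpirun -machinefile ./machfile1 -np 64 hostname

# Check the locked-memory limit on every node listed in the machinefile
for h in $(cut -d: -f1 ./machfile1 | sort -u); do
    echo -n "$h: "; ssh $h 'ulimit -l'
done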



Problem No 2:

Application: NAMD 2.6

Parallel Library: MVAPICH2-1.0.3

Compilers: Intel C++ and Fortran 10

I successfully built charm++ with MVAPICH2 and the Intel compilers, and then
compiled NAMD2.

The test examples given in the NAMD distribution work fine.

With the following input file (the benchmark input from the NAMD website, which
is reported there to run and scale up to 252 processes), the job runs fine with
8, 16, 32, and 64 processes.

But when a 128-core job is submitted, it doesn't run at all. The command and
the error are as follows:

#mpirun -machinefile ./machfile -np 128
/data/apps/namd26_mvapich2/Linux-mvapich2/namd2 ./apoa1.namd | tee
namd_128cores
Charm++> Running on MPI version: 2.0 multi-thread support: 0/0
rank 65 in job 4  master_host_name_50238   caused collective abort of all
ranks
  exit status of rank 65: killed by signal 9
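
From what I understand, "killed by signal 9" usually means the process was
killed from outside, most often by the kernel out-of-memory killer when a rank
runs out of node memory (per-rank communication buffers grow with the job
size). I can check this on the node that hosted rank 65 with something like the
following (which node that is depends on the machinefile ordering):

# Look for out-of-memory kills in the kernel log, and check memory/limits
dmesg | grep -i -e "out of memory" -e "killed process"
free -m
ulimit -l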


So I then built the network version of the charm++ library, without using
MVAPICH2. With that build, NAMD runs for any number of processes.
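
To further confirm whether MVAPICH2 itself is at fault at this scale, I suppose
a trivial MPI program could be run with 128 processes and the same machinefile
(the file names below are placeholders, and mpicc is assumed to be the MVAPICH2
compiler wrapper):

# A minimal MPI hello-world, independent of GROMACS/NAMD
cat > mpi_hello.c <<'EOF'
#include <mpi.h>
#include <stdio.h>
int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
EOF
mpicc -o mpi_hello mpi_hello.c
mpirun -machinefile ./machfile -np 128 ./mpi_hello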

So, given the above two problems, I guess there is some problem with MVAPICH2
itself. Is there a solution for this?


Regards,
Sangamesh