[mvapich-discuss] Can't run jobs on multiple nodes

Xing Wang xwang348 at wisc.edu
Thu Aug 16 11:12:53 EDT 2012


Hi All,

Thanks for reading this email. I'm currently working on a new 44-node cluster. My question may be a silly one, but since I'm new to Linux/MVAPICH2, your help and comments would be very helpful and are sincerely appreciated.


Problem situation:


We want to run LAMMPS (a parallel molecular dynamics package) on the new cluster. The MPI implementation is MVAPICH2-1.8 and the batch-queuing system is Oracle Grid Engine (GE) 6.2u5. I've set up a queue and assigned two compute nodes (compute-0-3 and compute-0-4, each with 24 processors) to it. Before running LAMMPS, I tested MVAPICH2 and Grid Engine by submitting a simple parallel script (free -m, which reports the memory on each node), and it worked very well.
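

For reference, the test job was essentially of the following form (reconstructed from memory, so the job name and header lines are approximate; the mpiexec path is the same one used in the LAMMPS script below):

#!/bin/bash
#$ -N free_test
#$ -q Ltest.q
#$ -pe mvapich2_test 36
#$ -cwd
#$ -V
# each MPI process prints the free memory on the node it runs on
/share/apps/mvapich2/1.8/intel_Composer_XE_12.2.137/bin/mpiexec -n $NSLOTS free -m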


Then I installed and ran LAMMPS as a cluster user. If I run a job on multiple processors within a single node, it works very well. However, if I expand the job to two nodes (i.e. I request more than 24 slots in the parallel submission script), it gets stuck and an error message appears as follows:


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
[cli_35]: aborting job:
Fatal error in MPI_Init:
Other MPI error
[proxy:0:0 at compute-0-4.local] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:955): assert (!closed) failed
[proxy:0:0 at compute-0-4.local] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
[proxy:0:0 at compute-0-4.local] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
[mpiexec at compute-0-4.local] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:69): one of the processes terminated badly; aborting
[mpiexec at compute-0-4.local] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
[mpiexec at compute-0-4.local] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:191): launcher returned error waiting for completion
[mpiexec at compute-0-4.local] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion 
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 
Does anyone have similar experience with this? Your comments/help/suggestions would be really helpful.


Here is more information in case it is needed:




1. The parallel environment (PE):
pe_name mvapich2_test
slots 9999
user_lists NONE
xuser_lists NONE
start_proc_args /opt/gridengine/mpi/startmpi.sh $pe_hostfile
stop_proc_args NONE
allocation_rule $fill_up
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary FALSE
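
(The PE above can be displayed with "qconf -sp mvapich2_test" and modified with "qconf -mp mvapich2_test", in case anyone would like me to change a setting and retest.)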


2. The queue setup:
qname Ltest.q
hostlist @LAMMPShosts
seq_no 0
load_thresholds np_load_avg=3.75
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors UNDEFINED
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list make mpich mpi orte mvapich2_test
rerun FALSE
slots 6,[compute-0-3.local=24],[compute-0-4.local=24]
tmpdir /tmp
shell /bin/bash
prolog NONE
epilog NONE
shell_start_mode posix_compliant
starter_method NONE
suspend_method NONE
resume_method NONE
terminate_method NONE
notify 00:00:60
owner_list NONE
user_lists NONE
xuser_lists NONE
subordinate_list NONE
complex_values NONE
projects NONE
xprojects NONE
calendar NONE
initial_state default
s_rt INFINITY
h_rt INFINITY
s_cpu INFINITY
h_cpu INFINITY
s_fsize INFINITY
h_fsize INFINITY
s_data INFINITY
h_data INFINITY
s_stack INFINITY
h_stack INFINITY
s_core INFINITY
h_core INFINITY
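
(Similarly, the queue configuration above can be displayed with "qconf -sq Ltest.q" and modified with "qconf -mq Ltest.q".)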





3. The host group @LAMMPShosts:


# qconf -shgrp @LAMMPShosts
group_name @LAMMPShosts
hostlist compute-0-3.local compute-0-4.local







4. The submission script:
#!/bin/bash
#$ -N Lammps_test


# request the queue for this job
# for VASP test, replace <queue_name> with Vtest.q
# for LAMMPS test, replace <queue_name> with Ltest.q
#$ -q Ltest.q


# request computational resources for this job as follows
# replace <num> below with the number of CPUs for the job
# For Vtest.q, <num>=0~48; for Ltest.q, <num>=0~48
#$ -pe mvapich2_test 36


# request wall time (max is 96:00:00)
#$ -l h_rt=48:00:00


# run the job from the directory of submission. Uncomment only if you don't want the defaults.
#$ -cwd
# combine SGE standard output and error files
#$ -o $JOB_NAME.o$JOB_ID
#$ -e $JOB_NAME.e$JOB_ID
# transfer all your environment variables. Uncomment only if you don't want the defaults.
#$ -V


# Use the full pathname to make sure we are using the right MPI
MPI_HOME=/share/apps/mvapich2/1.8/intel_Composer_XE_12.2.137/bin
## $MPI_HOME/mpiexec -n $NSLOTS lammps-20Aug12/src/lmp_linux < in.poly > out.poly
$MPI_HOME/mpiexec -n $NSLOTS lammps-20Aug12/src/lmp_linux < lammps-20Aug12/examples/crack/in.crack > out.crack
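
The job is submitted with qsub from the LAMMPS working directory, e.g. "qsub lammps_test.sh" (the script file name is just what I saved it as).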



--
Sincerely, 
Xing Wang

Graduate Student 
Department of Engineering Physics 
UW-Madison
Madison, WI, 53706

