[mvapich-discuss] Can't run jobs on multiple nodes

Xing Wang xwang348 at wisc.edu
Thu Aug 16 12:54:04 EDT 2012


Hi Jonathan,

Thanks so much for the quick reply and kind help! Here is what I found:


1. I tested osu_latency between compute-0-3 and compute-0-4, and it seems OK. Detailed results:
[testuser@*** osu-micro-benchmarks]$ mpirun_rsh -np 2 compute-0-3 compute-0-4 ./osu_latency
# OSU MPI Latency Test v3.6
# Size Latency (us)
0 1.27
1 1.36
2 1.41
4 1.33
8 1.30
16 1.29
32 1.31
64 1.34
128 1.49
256 2.16
512 2.37
1024 2.81
2048 3.70
4096 4.44
8192 6.29
16384 10.44
32768 14.72
65536 23.38
131072 40.67
262144 75.96
524288 145.59
1048576 288.11
2097152 565.41
4194304 1124.86


2. I typed in ulimit -l and ulimit -a on both compute-0-3 and compute-0-4, and here are the results:


[testuser@compute-0-3 ~]$ ulimit -l
unlimited
[testuser@compute-0-3 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 257560
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 257560
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited



The information from the other node (compute-0-4) is exactly the same:


[testuser@compute-0-4 ~]$ ulimit -l
unlimited
[testuser@compute-0-4 ~]$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 257560
max locked memory (kbytes, -l) unlimited
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 257560
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited





3. Thanks for the hint to use a debug build. One more possibly silly question (please excuse a newcomer): if I want a debug build so I can use GDB to find out what's going on, do I need to uninstall the current MVAPICH2 and then re-run configure, make, and make install? I'm happy to do that, but I just want to confirm; my rough plan is sketched below.
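Just to check my understanding, I'm guessing the rebuild would look something like the following. This is only a sketch: the 1.8-dbg install prefix is a placeholder I made up (so the existing install stays untouched), and I would add back whatever compiler settings we used for the original build.

cd mvapich2-1.8                  # existing source tree
make distclean                   # drop the old build configuration
./configure --prefix=/share/apps/mvapich2/1.8-dbg \
    --enable-g=dbg --disable-fast    # debug build, as you suggested
make -j 8
make install                     # installs into the new prefix only

Then I'd point MPI_HOME in the submission script at the debug prefix. For poking at it interactively, I understand one common pattern (assuming X forwarding from the compute nodes works) is to start each rank in its own xterm running gdb, e.g.:

mpirun_rsh -np 2 compute-0-3 compute-0-4 xterm -e gdb ./osu_latency

Is that roughly right?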


Thanks so much for the help! Any comments/suggestions/questions are truly appreciated.


--
Sincerely, 
Xing Wang

Graduate Student 
Department of Engineering Physics 
UW-Madison
1509 University Ave.
Madison, WI, 53706 


 



On 12/08/16, Jonathan Perkins wrote:
> Hello, let's try seeing if a simple case works.
> 
> Does something basic like osu_latency work between two nodes? What does
> ulimit -l show when run on the two nodes?
> 
> Also, a debug build of mvapich2 should provide more information in this
> error case. In addition to --enable-g=dbg, I suggest adding
> --disable-fast.
> 
> http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.8.html#x1-1120009.1.10
> 
> On Thu, Aug 16, 2012 at 10:12:53AM -0500, Xing Wang wrote:
> > Hi All,
> > 
> > Thanks for reading this email. Currently I'm working on a new 44-node cluster. I guess my question is a silly one, but since I'm new to Linux/MVAPICH2, your help/comments would be very helpful to me and sincerely appreciated.
> > 
> > 
> > Problem situation:
> > 
> > 
> > We want to run LAMMPS (a parallel computing package) on the new cluster. The MPI implementation is MVAPICH2-1.8 and the batch-queuing system is Oracle Grid Engine (GE) 6.2u5. I've set up a queue and assigned two compute nodes to it (compute-0-3 and compute-0-4; each node has 24 processors). Before running LAMMPS, I tested MVAPICH2 and Grid Engine by submitting a simple parallel script (free -m, to query the memory on multiple nodes), and it worked very well.
> > 
> > 
> > Then I installed and ran LAMMPS as a cluster user. If I run a job on multiple processors within a single node, it works very well. However, if I expand the job to two nodes (i.e. I request more than 24 slots in the parallel submission script), it gets stuck and an error message appears as follows:
> > 
> > 
> > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > [cli_35]: aborting job:
> > Fatal error in MPI_Init:
> > Other MPI error
> > [proxy:0:0@compute-0-4.local] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:955): assert (!closed) failed
> > [proxy:0:0@compute-0-4.local] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> > [proxy:0:0@compute-0-4.local] main (./pm/pmiserv/pmip.c:226): demux engine error waiting for event
> > [mpiexec@compute-0-4.local] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:69): one of the processes terminated badly; aborting
> > [mpiexec@compute-0-4.local] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
> > [mpiexec@compute-0-4.local] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:191): launcher returned error waiting for completion
> > [mpiexec@compute-0-4.local] main (./ui/mpich/mpiexec.c:405): process manager error waiting for completion 
> > -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
> > 
> > Does anyone have similar experience with this? Your comments/help/suggestions would be really helpful.
> > 
> > 
> > Here is more information in case it's needed:
> > 
> > 
> > 
> > 
> > 1. The parallel environment (PE):
> > pe_name mvapich2_test
> > slots 9999
> > user_lists NONE
> > xuser_lists NONE
> > start_proc_args /opt/gridengine/mpi/startmpi.sh $pe_hostfile
> > stop_proc_args NONE
> > allocation_rule $fill_up
> > control_slaves TRUE
> > job_is_first_task FALSE
> > urgency_slots min
> > accounting_summary FALSE
> > 
> > 
> > 2. The queue setup:
> > qname Ltest.q
> > hostlist @LAMMPShosts
> > seq_no 0
> > load_thresholds np_load_avg=3.75
> > suspend_thresholds NONE
> > nsuspend 1
> > suspend_interval 00:05:00
> > priority 0
> > min_cpu_interval 00:05:00
> > processors UNDEFINED
> > qtype BATCH INTERACTIVE
> > ckpt_list NONE
> > pe_list make mpich mpi orte mvapich2_test
> > rerun FALSE
> > slots 6,[compute-0-3.local=24],[compute-0-4.local=24]
> > tmpdir /tmp
> > shell /bin/bash
> > prolog NONE
> > epilog NONE
> > shell_start_mode posix_compliant
> > starter_method NONE
> > suspend_method NONE
> > resume_method NONE
> > terminate_method NONE
> > notify 00:00:60
> > owner_list NONE
> > user_lists NONE
> > xuser_lists NONE
> > subordinate_list NONE
> > complex_values NONE
> > projects NONE
> > xprojects NONE
> > calendar NONE
> > initial_state default
> > s_rt INFINITY
> > h_rt INFINITY
> > s_cpu INFINITY
> > h_cpu INFINITY
> > s_fsize INFINITY
> > h_fsize INFINITY
> > s_data INFINITY
> > h_data INFINITY
> > s_stack INFINITY
> > h_stack INFINITY
> > s_core INFINITY
> > h_core INFINITY
> > 
> > 
> > 
> > 
> > 
> > 3. The host group @LAMMPShosts:
> > 
> > 
> > # qconf -shgrp @LAMMPShosts
> > group_name @LAMMPShosts
> > hostlist compute-0-3.local compute-0-4.local
> > 
> > 
> > 
> > 
> > 
> > 
> > 
> > 4. The submission script:
> > #!/bin/bash
> > #$ -N Lammps_test
> > 
> > 
> > # request the queue for this job
> > # for VASP test, replace <queue_name> with Vtest.q
> > # for LAMMPS test, replace <queue_name> with Ltest.q
> > #$ -q Ltest.q
> > 
> > 
> > # request computational resources for this job as follows
> > # replace <num> below with the number of CPUs for the job
> > # For Vtest.q, <num>=0~48; for Ltest.q, <num>=0~48 
> > #$ -pe mvapich2_test 36
> > 
> > 
> > # request wall time (max is 96:00:00)
> > #$ -l h_rt=48:00:00
> > 
> > 
> > # run the job from the directory of submission. Uncomment only if you don't want the defaults.
> > #$ -cwd
> > # combine SGE standard output and error files
> > #$ -o $JOB_NAME.o$JOB_ID
> > #$ -e $JOB_NAME.e$JOB_ID
> > # transfer all your environment variables. Uncomment only if you don't want the defaults
> > #$ -V
> > 
> > 
> > # Use full pathname to make sure we are using the right mpi
> > MPI_HOME=/share/apps/mvapich2/1.8/intel_Composer_XE_12.2.137/bin
> > ## $MPI_HOME/mpiexec -n $NSLOTS lammps-20Aug12/src/lmp_linux < in.poly > out.poly
> > $MPI_HOME/mpiexec -n $NSLOTS lammps-20Aug12/src/lmp_linux < lammps-20Aug12/examples/crack/in.crack > out.crack
> > 
> > 
> > 
> > --
> > Sincerely, 
> > Xing Wang
> > 
> > Graduate Student 
> > Department of Engineering Physics 
> > UW-Madison
> > Madison, WI, 53706
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > 
> 
> -- 
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo



More information about the mvapich-discuss mailing list