[mvapich-discuss] qdel doesn't work with grid engine & mvapich2

Xing Wang xwang348 at wisc.edu
Fri Sep 28 17:07:33 EDT 2012


Hi all,

Thanks for reading this email.
I'm running Grid Engine 6.2u5 with mvapich2_1.6.1-p1. We have run into a problem with "qdel" and would sincerely appreciate your help!


The "qdel" command only removes the job ID from the queue; it does not clean up the job's processes on the compute nodes. The "deleted" jobs therefore keep running on the compute nodes and eventually slow down other calculations. However, if a job finishes by itself without "qdel", there is no such problem.
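
To confirm which processes survive a "qdel", something like the following sketch could be run from the head node. It assumes passwordless ssh to the compute nodes, that pgrep is installed there, and that the hostfile lists one host per line (optionally as "host:count"). The helper names unique_hosts and list_leftovers are made up for illustration, and the pattern should be a single word because ssh re-splits its arguments on the remote side.

```shell
# unique_hosts FILE: print each host named in a hostfile exactly once.
# Strips comments, anything after the first colon or whitespace, and blank lines.
unique_hosts() {
    sed -e 's/#.*//' -e 's/[[:space:]:].*//' -e '/^$/d' "$1" | sort -u
}

# list_leftovers FILE PATTERN: for every host in FILE, list the current
# user's processes whose command line matches PATTERN.
list_leftovers() {
    for host in $(unique_hosts "$1"); do
        echo "== $host"
        # pgrep exits non-zero when nothing matches; that is not an error here.
        ssh "$host" pgrep -u "$(id -un)" -fl "$2" </dev/null || true
    done
}
```

For example, "list_leftovers machines.saved my_solver" would check a saved copy of the job's hostfile for a program called my_solver (both names are placeholders).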


I noticed that mvapich2 can be integrated with Grid Engine either "tightly" or "loosely". Could this be the cause here? Any comments, advice, or help would be highly appreciated.
(We tried MVAPICH2 1.8. However, that version has problems assigning jobs to multiple nodes, so we have to use mvapich2_1.6 here.)
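
As far as I understand, tight integration requires "control_slaves TRUE" in the parallel environment, so that remote processes are started under Grid Engine's control (through "qrsh -inherit") and not over plain ssh/rsh. A rough way to check the setting is sketched below. The helper name pe_control_slaves is hypothetical; it only parses the text that "qconf -sp <pe_name>" prints.

```shell
# pe_control_slaves: read a parallel environment definition on stdin
# (the output of `qconf -sp <pe_name>`) and print its control_slaves
# value in upper case, or UNKNOWN if the field is missing.
pe_control_slaves() {
    awk '$1 == "control_slaves" { print toupper($2); found = 1 }
         END { if (!found) print "UNKNOWN" }'
}

# Usage on a submit host, with the PE name taken from the job script:
#   qconf -sp mvapich2 | pe_control_slaves
```

If this prints FALSE, the PE is set up for loose integration, which would be consistent with Grid Engine not knowing about the remote processes at "qdel" time.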


Here are some technical details:
I. Hardware: 
1. Xeon(R) CPU E5-2620 0 @ 2.00GHz (Module#: 45)  
2. IB adapter: Mellanox Technologies MT 26428.


II. Software
1. OS: Rocks 6.0.2 (CentOS6.2)
2. Compiler: Intel Fortran & C++ Composer XE 2011
3. MPI: mvapich2_1.6.1-p1
4. Queue: Grid Engine 6.2u5


III. Scripts:


#!/bin/bash
#$ -N your_jobname
#$ -q <queue_name>
#$ -pe mvapich2 <process_num>
#$ -l h_rt=48:00:00
#$ -cwd
# name the SGE standard output and error files (adding "#$ -j y" would merge them into one file)
#$ -o $JOB_NAME.o$JOB_ID
#$ -e $JOB_NAME.e$JOB_ID
#$ -V
echo "Got $NSLOTS processors."
MPI_HOME=/share/apps/mvapich2/1.6.1-p1/bin
$MPI_HOME/mpirun_rsh -hostfile $TMPDIR/machines -n $NSLOTS <command name> <command args>
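
A possible workaround inside the job script, as a sketch only: when a job is submitted with "#$ -notify", Grid Engine sends SIGUSR2 to the job shortly before the final SIGKILL, so the script can catch that signal and pass a SIGTERM on to the launcher. The helper name run_with_cleanup is hypothetical, and whether mpirun_rsh then shuts down the remote ranks on receiving SIGTERM is an assumption that has not been verified here.

```shell
# run_with_cleanup CMD [ARGS...]: run CMD in the background and forward
# SIGUSR2/SIGTERM to it as SIGTERM. Returns CMD's exit status.
# Assumption: the job was submitted with "#$ -notify" so that qdel
# delivers SIGUSR2 before SIGKILL.
run_with_cleanup() {
    CLEANUP_SIGNALLED=0
    "$@" &
    CLEANUP_CHILD=$!
    trap 'CLEANUP_SIGNALLED=1; kill -TERM "$CLEANUP_CHILD" 2>/dev/null' USR2 TERM
    CLEANUP_STATUS=0
    wait "$CLEANUP_CHILD" || CLEANUP_STATUS=$?
    if [ "$CLEANUP_SIGNALLED" -eq 1 ]; then
        # The first wait was interrupted by the trap; wait again to
        # collect the child's real exit status.
        CLEANUP_STATUS=0
        wait "$CLEANUP_CHILD" 2>/dev/null || CLEANUP_STATUS=$?
    fi
    trap - USR2 TERM
    return "$CLEANUP_STATUS"
}

# In the job script above, the last line would then become:
#   run_with_cleanup $MPI_HOME/mpirun_rsh -hostfile $TMPDIR/machines -n $NSLOTS <command name> <command args>
```

This only helps if the launcher itself cleans up its remote processes when terminated; it does not replace a tightly integrated parallel environment.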



Thanks for the help!
--
Sincerely, 
WANG, Xing

Graduate Student 
Department of Engineering Physics & 
Nuclear Engineering, UW-Madison
1509 University Ave.
Madison, WI, 53706