[mvapich-discuss] Mpiexec fails to terminate when program ends

Alex M Warren amwarren at email.arizona.edu
Mon Nov 3 16:10:22 EST 2014


I am running an MPI program on a cluster. When the program ends, the
job does not, so I have to wait for it to time out.

I am not sure how to debug this. I checked that the program reaches the
MPI finalize statement, and it does. I am using the Elemental library.

Final lines of the program


if (grid.Rank() == 0) std::cout << "Finalize" << std::endl;

Finalize();
mpi::Finalize();
return 0;

(I also tried letting Elemental handle the finalize, and that didn't work either.)
The output is:

Finalize
mpiexec: killall: caught signal 15 (Terminated).
mpiexec: kill_tasks: killing all tasks.
mpiexec: wait_tasks: waiting for taub263.
mpiexec: killall: caught signal 15 (Terminated).
----------------------------------------
Begin Torque Epilogue (Sun Aug 17 01:53:55 2014)
Job ID:           ***
Username:         ***
Group:            ***
Job Name:         num_core_compare_nside-32_mpi_nodes-1_cores-2_1e0e4c0516
Session:          16786
Limits:           ncpus=1,neednodes=2:ppn=6:m24G:taub,nodes=2:ppn=6:m24G:taub,walltime=00:13:00
Resources:        cput=00:08:17,mem=297884kb,vmem=672648kb,walltime=00:13:13
Job Queue:        secondary
Account:          ***
Nodes:            taub263 taub290
End Torque Epilogue
----------------------------------------
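As a rough check that the job really ran into its walltime limit instead of exiting on its own, the two walltime values in the epilogue can be compared. This is only a minimal sketch, assuming the HH:MM:SS format shown above; the helper names `walltime_seconds` and `hit_walltime_limit` are hypothetical and not part of Torque or MVAPICH2.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: convert a Torque-style "HH:MM:SS" string to seconds.
// Returns -1 if the string does not match that format.
long walltime_seconds(const std::string& hms) {
    int h = 0, m = 0, s = 0;
    char trailing = 0;
    // Exactly three fields must be read; a fourth match means trailing junk.
    if (std::sscanf(hms.c_str(), "%d:%d:%d%c", &h, &m, &s, &trailing) != 3) {
        return -1;
    }
    if (h < 0 || m < 0 || m > 59 || s < 0 || s > 59) {
        return -1;
    }
    return h * 3600L + m * 60L + s;
}

// Hypothetical helper: true when the used walltime reached or passed the
// requested limit, i.e. the job most likely ended by timing out.
bool hit_walltime_limit(const std::string& limit, const std::string& used) {
    const long l = walltime_seconds(limit);
    const long u = walltime_seconds(used);
    return l >= 0 && u >= 0 && u >= l;
}
```

With the values from the epilogue above, `hit_walltime_limit("00:13:00", "00:13:13")` is true: the job used 793 seconds against a 780-second limit, which is consistent with the job being terminated at the walltime limit and not by mpiexec returning normally.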

I am running with these modules loaded on https://campuscluster.illinois.edu/hardware/#taub

> module list
Currently Loaded Modulefiles:
  1) torque/4.2.9       4) blas               7) lapack            10) gcc/4.7.1
  2) moab/7.2.9         5) mvapich2/1.6-gcc   8) git/1.7           11) cmake/2.8
  3) env/taub           6) mvapich2/mpiexec   9) vim/7.3           12) valgrind/3.9.0

