[mvapich-discuss] Mpiexec fails to terminate when program ends
Alex M Warren
amwarren at email.arizona.edu
Mon Nov 3 16:10:22 EST 2014
I am running an MPI program on a cluster. When the program ends, the
job does not, so I have to wait for it to time out.
I am not sure how to debug this. I checked that the program reaches the
MPI finalize statement, and it does. I am using the Elemental library.
The final lines of the program are:
if (grid.Rank() == 0) std::cout << "Finalize" << std::endl;
Finalize();
mpi::Finalize();
return 0;
(I also tried letting Elemental do the finalize, and that didn't work either.)
The output is:
Finalize
mpiexec: killall: caught signal 15 (Terminated).
mpiexec: kill_tasks: killing all tasks.
mpiexec: wait_tasks: waiting for taub263.
mpiexec: killall: caught signal 15 (Terminated).
----------------------------------------
Begin Torque Epilogue (Sun Aug 17 01:53:55 2014)
Job ID: ***
Username: ***
Group: ***
Job Name: num_core_compare_nside-32_mpi_nodes-1_cores-2_1e0e4c0516
Session: 16786
Limits: ncpus=1,neednodes=2:ppn=6:m24G:taub,nodes=2:ppn=6:m24G:taub,walltime=00:13:00
Resources: cput=00:08:17,mem=297884kb,vmem=672648kb,walltime=00:13:13
Job Queue: secondary
Account: ***
Nodes: taub263 taub290
End Torque Epilogue
----------------------------------------
I am running with these modules on https://campuscluster.illinois.edu/hardware/#taub
> module list
Currently Loaded Modulefiles:
  1) torque/4.2.9       4) blas               7) lapack            10) gcc/4.7.1
  2) moab/7.2.9         5) mvapich2/1.6-gcc   8) git/1.7           11) cmake/2.8
  3) env/taub           6) mvapich2/mpiexec   9) vim/7.3           12) valgrind/3.9.0