[mvapich-discuss] mvapich2 and MPI subjobs
James R. Leek
leek2 at llnl.gov
Fri Apr 15 20:28:18 EDT 2011
On 04/15/2011 03:58 PM, Jonathan Perkins wrote:
> How does mpilaunch work? Is PBS_NODEFILE available on the node that
> mpilaunch is being run on?
The PBS_NODEFILE is certainly available on the node. This program
worked with mvapich-1.2rc1, after all, and I'm actually running from a
shell opened on the node where mpilaunch is being run, so I can read
the file from that shell.
The code for mpilaunch.c is below. NUM_LAUNCHED can be set higher to
test multiple launches.
t_consumeDeaths runs in a separate thread to clean up the completed
child processes.
After some basic diagnostics, an environment variable is set that the
child will read (to test that inheritance works).
Then we fork.
Then we exec mpiexec on the child MPI job.
This is a minimized example of what my larger simulation code does to
launch and collect child MPI jobs. It was originally written this way
because it worked well with a SLURM/mvapich-0.9.9 system. I'm now
trying to port it to PBSPro/mvapich?-?.?
///////////////////////////////////////////////////////////////////////////
#include "mpi.h"
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <pthread.h>
#include <semaphore.h>
#include <errno.h>
#include <sys/wait.h>
#define NUM_LAUNCHED 1
static sem_t s_child_semaphore;
static void* t_consumeDeaths(void* ignored) {
    int status;
    int collected = 0;
    while (1) {
        if (collected >= NUM_LAUNCHED) {
            break;
        }
        sem_wait(&s_child_semaphore);
        pid_t pid = wait(&status);
        if (pid > 0) {
            printf("Got PID %d\n", pid);
            ++collected;
        } else {
            /* If there was an error, a child was not collected after
             * all; repost. */
            sem_post(&s_child_semaphore);
            printf("Got error PID %d with errno %d\n", pid, errno);
        }
    }
    return NULL;
}
int main(int argc, char** argv) {
    FILE *fOut;
    int myRank;
    int status;
    pthread_t thread;
    void* value;
    char hostname[256];
    int ii, mySocket;
    char* pbsFile = NULL;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
    MPI_Barrier(MPI_COMM_WORLD);
    sem_init(&s_child_semaphore, 0, 0);
    pthread_create(&thread, 0, t_consumeDeaths, 0);

    gethostname(hostname, 255);
    printf("HOSTNAME: %s\n", hostname);
    fflush(stdout);
    pbsFile = getenv("PBS_NODEFILE");
    printf("PBS_NODEFILE: %s\n", pbsFile);
    fflush(stdout);

    if (pbsFile) {
        for (ii = 0; ii < NUM_LAUNCHED; ++ii) {
            printf("setting env vars\n");
            fflush(stdout);
            setenv("COOP_name", "Hi", 1);
            printf("env vars all set\n");
            fflush(stdout);
            pid_t pid = fork(); // FORK FORK FORK
            if (pid == 0) {
                /*
                int LOW_SOCKET = 3;
                int HIGH_SOCKET = 30;
                int mySocket = LOW_SOCKET;
                printf("Close Sockets\n");
                fflush(stdout);
                for (mySocket = LOW_SOCKET; mySocket <= HIGH_SOCKET; ++mySocket) {
                    close(mySocket);
                }
                printf("Closed all Sockets\n");
                fflush(stdout);
                */
                char* args[7] = {"/mnt/home/leek2/mvatest/bin/mpiexec", "-np",
                                 "8", "-machinefile", pbsFile,
                                 "/mnt/home/leek2/mpitest2/mpi-helloworld",
                                 NULL};
                execvp("/mnt/home/leek2/mvatest/bin/mpiexec", args);
                printf("EXEC FAILED!\n\n");
                exit(4);
            } else {
                printf("Waiting on pid %d\n", pid);
                sem_post(&s_child_semaphore);
            }
        }
    } else {
        printf("ERROR NOT RUNNING PBS?\n\n");
    }

    pthread_join(thread, &value);
    MPI_Barrier(MPI_COMM_WORLD);
    MPI_Finalize();
    return 0;
}
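The mpi-helloworld program referenced above was not included in the
post; a minimal stand-in would presumably look something like this
(hypothetical, and it needs an MPI installation to build, e.g. with
mpicc):

```c
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for /mnt/home/leek2/mpitest2/mpi-helloworld:
 * each rank reports its rank, the world size, and the COOP_name
 * environment variable set by the parent launcher, which lets you
 * verify that the child job actually started and inherited the
 * environment. */
int main(int argc, char** argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    const char* coop = getenv("COOP_name");
    printf("Hello from rank %d of %d (COOP_name=%s)\n",
           rank, size, coop ? coop : "(unset)");
    MPI_Finalize();
    return 0;
}
```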
--
Jim Leek
leek2 at llnl.gov