[mvapich-discuss] RE: MVAPICH 1.0.0 and stdin

Mark Debbage mark.debbage at qlogic.com
Fri Aug 29 13:45:30 EDT 2008


This is a resend with the attachment in-line. Note also that the
problem does not occur with MVAPICH 0.9.9, and that MVAPICH 1.0.0
works reliably when I arrange to use the "legacy" start-up
mechanism. For example:

/usr/mpi/gcc/mvapich-1.0.0/bin/mpirun_rsh -legacy -np 2 -hostfile hosts /home/markdebbage/support/OU/./mpicat < input

This makes me suspect that the new start-up code, which allows
multiple MPI processes per ssh, is the problem, even though in this
case there is just one MPI process per node.

Mark.


-----Original Message-----
From: Mark Debbage
Sent: Fri 8/29/2008 10:25 AM
To: mvapich-discuss at cse.ohio-state.edu
Subject: MVAPICH 1.0.0 and stdin
 
We are having problems with stdin and MVAPICH 1.0.0 (from OFED 1.3).
I am running with the mpirun process and rank 0 on the same host
and expecting the stdin of the mpirun process to be available to
rank 0. This works reliably if there is just one process in the job,
or if all MPI processes are mapped to that same host. However, if
there are MPI processes on other hosts, then stdin works only
intermittently: about 4 runs in 5 work fine, but in about 1 run in 5
every read on stdin returns EOF.

I've attached the example source code. It is a simple MPI version
of cat. I am building and running like this:

markdebbage@perf-15:~/support/OU> /usr/mpi/gcc/mvapich-1.0.0/bin/mpicc mpicat.c -o mpicat

markdebbage@perf-15:~/support/OU> cat hosts
perf-15
perf-16

Here's a working run:

markdebbage@perf-15:~/support/OU> /usr/mpi/gcc/mvapich-1.0.0/bin/mpirun -machinefile hosts -np 2 ./mpicat < input
This is rank 0 - start loop
1
2
3
4
5
6
999
This is rank 0 - end loop

Here's a non-working run:

markdebbage@perf-15:~/support/OU> /usr/mpi/gcc/mvapich-1.0.0/bin/mpirun -machinefile hosts -np 2 ./mpicat < input
This is rank 0 - start loop
This is rank 0 - end loop
markdebbage@perf-15:~/support/OU>

I've tried this with OFED 1.3 running on Mellanox and QLogic adapters,
and also with the PSM version of MVAPICH running on QLogic adapters,
so the problem appears to be independent of the transport. I also
tried the -stdin option that appears on the mpirun help page, but it
seems to be silently ignored. I can see the code in mpirun.args that
processes that option, but it doesn't appear to be connected up to
anything.

Cheers,

Mark.

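#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

/* mpicat: a simple MPI version of cat. Rank 0 copies stdin to stdout. */
int main (int argc, char **argv)
{
        int rank;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        if (rank == 0) {
                printf("This is rank 0 - start loop\n");
                int c;
                /* Echo every character until getchar() reports EOF. */
                while ((c = getchar()) != EOF) {
                        putchar(c);
                }
                printf("This is rank 0 - end loop\n");
        }
        /* All other ranks do nothing except initialize and finalize. */
        MPI_Finalize();
        return EXIT_SUCCESS;
}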
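
One caveat about the loop above: getchar() returns EOF both at a
genuine end of input and on a read error, so the failing runs do not
show which of the two rank 0 is seeing. Below is a minimal sketch of
a helper that makes the distinction. It is not part of the original
mpicat.c; the name copy_and_report and its log parameter are
hypothetical, introduced only for illustration, and it uses only
standard C library calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, not part of the original mpicat.c.
 * Copies `in` to `out` one character at a time, then writes a line to
 * `log` saying whether the loop stopped at end of file or on a read
 * error. Returns the number of bytes copied, or -1 on a read error. */
long copy_and_report(FILE *in, FILE *out, FILE *log)
{
        assert(in != NULL && out != NULL && log != NULL);

        long nbytes = 0;
        int c;
        while ((c = getc(in)) != EOF) {
                putc(c, out);
                nbytes++;
        }
        /* Save errno immediately; it is only meaningful if ferror() is set. */
        int saved_errno = errno;

        if (ferror(in)) {
                fprintf(log, "read error after %ld bytes: %s\n",
                        nbytes, strerror(saved_errno));
                return -1;
        }
        fprintf(log, "end of file after %ld bytes\n", nbytes);
        return nbytes;
}
```

In mpicat, rank 0 could call copy_and_report(stdin, stdout, stderr)
in place of the getchar() loop. A failing run that logs "end of file
after 0 bytes" would suggest that stdin was closed or connected to
an empty source, whereas a "read error" line would point at the
descriptor itself.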