[mvapich-discuss] Weird results if I cut off the network during execution
Rui Wang
wangraying at gmail.com
Tue Feb 14 03:32:39 EST 2012
Hi all,
I came across an interesting problem. I cut off the IB connection between two
processes by modifying the IB IP address, but it seems the receiver still
successfully received the data from the sender.
The test sample I used is as follows.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include "mpi.h"

#define COUNT 1000

double buf[COUNT];

int main(int argc, char **argv)
{
    int rank, tag = 99;
    int i, rc;
    MPI_Status status;
    char hostname[64];
    char errstr[MPI_MAX_ERROR_STRING];
    int errlen;
    int off1, off2, size1, size2;

    MPI_Init(&argc, &argv);
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    gethostname(hostname, sizeof(hostname));
    printf("I am P%d, pid = %d, at %s\n", rank, getpid(), hostname);

    sleep(10); /* modify the IP address of node0 during this interval;
                  node0 is the node where P0 is running */

    off1  = 0;
    off2  = COUNT / 2;
    size1 = COUNT / 2;
    size2 = COUNT - COUNT / 2;

    if (rank == 0)
    {
        for (i = off1; i < off1 + size1; i++)
            /* guard avoids reading buf[-1] when i == 0 */
            buf[i] = 0.8 * (i * i - (i > 0 ? buf[i - 1] : 0.0));
        rc = MPI_Send(buf + off1, size1, MPI_DOUBLE, 2, tag,
                      MPI_COMM_WORLD); /* P0 sends data to P2 */
        MPI_Error_string(rc, errstr, &errlen);
        printf("P%d: MPI_Send rc = %d %s\n", rank, rc, errstr);
    }
    else if (rank == 1)
    {
        for (i = off2; i < off2 + size2; i++)
            buf[i] = 8.9 * i - i / 2;
        rc = MPI_Send(buf + off2, size2, MPI_DOUBLE, 2, tag,
                      MPI_COMM_WORLD); /* P1 sends data to P2 */
        MPI_Error_string(rc, errstr, &errlen);
        printf("P%d: MPI_Send rc = %d %s\n", rank, rc, errstr);
    }
    else if (rank == 2)
    {
        rc = MPI_Recv(buf + off1, size1, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD,
                      &status); /* P2 receives data from P0 */
        MPI_Error_string(rc, errstr, &errlen);
        printf("P%d: MPI_Recv rc = %d %s\n", rank, rc, errstr);
        rc = MPI_Recv(buf + off2, size2, MPI_DOUBLE, 1, tag, MPI_COMM_WORLD,
                      &status); /* P2 receives data from P1 */
        MPI_Error_string(rc, errstr, &errlen);
        printf("P%d: MPI_Recv rc = %d %s\n", rank, rc, errstr);
    }

    MPI_Finalize();
    return 0;
}
And the result is:
[*@*]$ mpiexec -f ifile -np 4 -disable-auto-cleanup ./ip_sample
I am P1, pid = 942, at gnode103
I am P0, pid = 1831, at gnode102
I am P2, pid = 943, at gnode103
I am P3, pid = 944, at gnode103
P1: MPI_Send rc = 0 No MPI error
P2: MPI_Recv rc = 0 No MPI error
P2: MPI_Recv rc = 0 No MPI error
It's a little weird that P2 still successfully received the data from P0 after
the connection between the two processes was cut off. Does MVAPICH2 apply some
optimization here?
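
In case it helps narrow things down: one way I could check whether the data is
genuinely still crossing the network (rather than MPI_Send merely completing
into a local buffer) might be a synchronous send, which cannot complete until
the matching receive has started. Below is a minimal sketch of that variant;
MPI_Ssend is a drop-in replacement for MPI_Send here, and the rest mirrors the
program above. This is only a diagnostic idea on my side, not something I have
verified against MVAPICH2 internals.

#include <stdio.h>
#include <unistd.h>
#include "mpi.h"

#define COUNT 1000

double buf[COUNT];

int main(int argc, char **argv)
{
    int rank, tag = 99, rc;
    char errstr[MPI_MAX_ERROR_STRING];
    int errlen;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    sleep(10); /* change the IB IP address during this window, as before */

    if (rank == 0)
    {
        /* MPI_Ssend completes only after P2 has begun receiving, so a
           successful return means the message really crossed the network */
        rc = MPI_Ssend(buf, COUNT / 2, MPI_DOUBLE, 2, tag, MPI_COMM_WORLD);
        MPI_Error_string(rc, errstr, &errlen);
        printf("P%d: MPI_Ssend rc = %d %s\n", rank, rc, errstr);
    }
    else if (rank == 2)
    {
        rc = MPI_Recv(buf, COUNT / 2, MPI_DOUBLE, 0, tag, MPI_COMM_WORLD,
                      &status);
        MPI_Error_string(rc, errstr, &errlen);
        printf("P%d: MPI_Recv rc = %d %s\n", rank, rc, errstr);
    }

    MPI_Finalize();
    return 0;
}

If the MPI_Ssend also returns MPI_SUCCESS after the address change, the
message really is completing end to end, which would suggest (this is only my
assumption) that the established IB channel does not depend on the IP address
I modified.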
Thanks,
Rui