[mvapich-discuss] segment fault from MPI_Send

吴雪 sy1406125 at buaa.edu.cn
Tue Oct 11 08:36:29 EDT 2016


Hi all,
I'm using MVAPICH2-2.2rc2. I have a program called father, and the father process uses MPI_Comm_spawn to start 8 child processes called child. The source code is as follows.
father:
#include <mpi.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Spawn 8 copies of ./child on the hosts listed in the file "hf". */
    MPI_Info info = MPI_INFO_NULL;
    char deviceHosts[10] = "hf";
    MPI_Info_create(&info);
    MPI_Info_set(info, "hostfile", deviceHosts);
    MPI_Comm childComm;
    MPI_Comm_spawn("./child", MPI_ARGV_NULL, 8, info, 0, MPI_COMM_WORLD,
                   &childComm, MPI_ERRCODES_IGNORE);

    int size = 64 * 1024;
    int i, j;
    int *a = (int *)malloc(size * sizeof(int));
    int *b = (int *)malloc(size * sizeof(int));

    /* 500 ping-pong rounds of 64 KB messages with each child. */
    for (j = 0; j < 500; j++)
    {
        for (i = 0; i < 8; i++)
        {
            MPI_Send(a, size, MPI_BYTE, i, 0, childComm);
            MPI_Recv(b, size, MPI_BYTE, i, 0, childComm, MPI_STATUS_IGNORE);
        }
    }
    MPI_Finalize();
    return 0;
}
child:
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int provided = 0;
    //MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    MPI_Init(&argc, &argv);

    int rank;
    MPI_Comm fatherComm;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("child %d start\n", rank);

    /* Intercommunicator back to the spawning (father) process. */
    MPI_Comm_get_parent(&fatherComm);

    int size = 64 * 1024;
    int i;
    int *b = (int *)malloc(size * sizeof(int));

    /* Echo each message from the father straight back, 500 rounds. */
    for (i = 0; i < 500; i++)
    {
        printf("child %d receive round %d\n", rank, i);
        MPI_Recv(b, size, MPI_BYTE, 0, 0, fatherComm, MPI_STATUS_IGNORE);
        MPI_Send(b, size, MPI_BYTE, 0, 0, fatherComm);
    }
    printf("child %d exit\n", rank);
    MPI_Finalize();
    return 0;
}


The backtrace from the resulting core file is:
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x00007fb379bf2a50 in vma_compare_search () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
(gdb) bt
#0  0x00007fb379bf2a50 in vma_compare_search () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#1  0x00007fb379c11342 in avl_find () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#2  0x00007fb379bf311e in dreg_find () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#3  0x00007fb379bf539a in dreg_register () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#4  0x00007fb379c0e669 in MPIDI_CH3I_MRAIL_Prepare_rndv () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#5  0x00007fb379bd63db in MPIDI_CH3_iStartRndvMsg () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#6  0x00007fb379bd0916 in MPID_MRAIL_RndvSend () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#7  0x00007fb379bca91d in MPID_Send () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#8  0x00007fb379b574e5 in PMPI_Send () from /home/run/wx-workplace/mvapich2-2.2rc2/lib/libmpi.so.12
#9  0x0000000000400a5e in main ()


The file 'hf' contains '192.168.2.2:8'. I launch the job with 'mpiexec -genv MV2_SUPPORT_DPM 1 -n 1 ./father'; the full steps to reproduce are sketched below.
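For completeness, this is roughly how I build and run the test. The source file names (father.c, child.c) and the mpicc wrapper invocation are my sketch; the hostfile contents and the mpiexec command are exactly as above.

echo "192.168.2.2:8" > hf                          # hostfile: one node, up to 8 spawned children
mpicc -o father father.c                           # build with the MVAPICH2 compiler wrapper
mpicc -o child child.c
mpiexec -genv MV2_SUPPORT_DPM 1 -n 1 ./father      # enable dynamic process management for MPI_Comm_spawn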
I haven't been able to figure out what causes the segmentation fault or how to fix it. I would appreciate any advice.
Looking forward to your reply.


xue



