[mvapich-discuss] MPI_Intercomm_merge - segmentation fault

Artur Malinowski artur.malinowski at pg.gda.pl
Thu Jan 21 06:46:15 EST 2016


Hi,

I have a problem when calling MPI_Intercomm_merge: it always ends with a 
segmentation fault, both in MVAPICH2 2.1 and 2.2 (2.2a and 2.2b). Other MPI 
functions seem to execute without any issues, and the same code as below 
works fine with Open MPI and MPICH.

Thanks for your help.

Regards,
Artur Malinowski
PhD student at Gdansk University of Technology

----------------------------

sources and output

// manager
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    MPI_Comm inter, intra;
    MPI_Init(&argc, &argv);
    /* spawn a single worker and get an intercommunicator to it */
    MPI_Comm_spawn("/path/to/worker", MPI_ARGV_NULL, 1,
              MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter,
              MPI_ERRCODES_IGNORE);
    printf("manager: before\n");
    MPI_Intercomm_merge(inter, 0, &intra);  /* segfaults here */
    printf("manager: after\n");
    sleep(10);
    MPI_Finalize();
    return 0;
}

// worker
#include <mpi.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
    MPI_Comm parent, intra;
    MPI_Init(&argc, &argv);
    /* get the intercommunicator connecting back to the manager */
    MPI_Comm_get_parent(&parent);
    if (parent == MPI_COMM_NULL) {
        fprintf(stderr, "No parent!\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    printf("worker: before\n");
    MPI_Intercomm_merge(parent, 1, &intra);  /* segfaults here */
    printf("worker: after\n");
    sleep(10);
    MPI_Finalize();
    return 0;
}
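
For reference, a basic point-to-point exchange over the intercommunicator 
just before the merge can be used as a sanity check that the spawn 
connection itself works (a sketch only, not part of the failing programs 
above; point-to-point ranks on an intercommunicator address the remote group):

// worker side, inserted right after MPI_Comm_get_parent(&parent):
// send a token to the manager (rank 0 of the remote group), then get it back
int token = 42;
MPI_Send(&token, 1, MPI_INT, 0, 0, parent);
MPI_Recv(&token, 1, MPI_INT, 0, 0, parent, MPI_STATUS_IGNORE);

// manager side, inserted right after MPI_Comm_spawn(...):
// mirror of the exchange above, addressing rank 0 of the remote (worker) group
int token;
MPI_Recv(&token, 1, MPI_INT, 0, 0, inter, MPI_STATUS_IGNORE);
MPI_Send(&token, 1, MPI_INT, 0, 0, inter);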

output before segfault:

manager: before
worker: before

----------------------------

mpiname -a

MVAPICH2 2.2b Mon Nov 12 20:00:00 EST 2015 ch3:mrail
MVAPICH2 2.2a Mon Aug 17 20:00:00 EDT 2015 ch3:mrail
MVAPICH2 2.1 Fri Apr 03 20:00:00 EDT 2015 ch3:mrail

Configuration
--enable-romio --with-file-system=pvfs2 --with-pvfs2=/path/to/pvfs

----------------------------

command & params

MV2_SUPPORT_DPM=1
mpirun -np 1 /path/to/manager

----------------------------

sys

os: CentOS release 6.5 (Final)
kernel: 2.6.32 and 3.18.0

----------------------------

trace

[lap06:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[lap06:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[lap06:mpi_rank_0][print_backtrace]   0: /malinowski/mvapich2/lib/libmpi.so.12(print_backtrace+0x1e) [0x7f78a0af7f4e]
[lap06:mpi_rank_0][print_backtrace]   1: /malinowski/mvapich2/lib/libmpi.so.12(error_sighandler+0x59) [0x7f78a0af8059]
[lap06:mpi_rank_0][print_backtrace]   2: /lib64/libc.so.6() [0x3a076326a0]
[lap06:mpi_rank_0][print_backtrace]   3: /malinowski/mvapich2/lib/libmpi.so.12(+0x33d770) [0x7f78a0aea770]
[lap06:mpi_rank_0][print_backtrace]   4: /malinowski/mvapich2/lib/libmpi.so.12(+0x33ddd5) [0x7f78a0aeadd5]
[lap06:mpi_rank_0][print_backtrace]   5: /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3I_CM_Connect+0x17c) [0x7f78a0aeb53c]
[lap06:mpi_rank_0][print_backtrace]   6: /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_iSendv+0x2ec) [0x7f78a0ab254c]
[lap06:mpi_rank_0][print_backtrace]   7: /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_EagerContigIsend+0xb8) [0x7f78a0aa2348]
[lap06:mpi_rank_0][print_backtrace]   8: /malinowski/mvapich2/lib/libmpi.so.12(MPID_Isend+0x2de) [0x7f78a0aa7d5e]
[lap06:mpi_rank_0][print_backtrace]   9: /malinowski/mvapich2/lib/libmpi.so.12(MPIC_Sendrecv+0x14e) [0x7f78a0a607de]
[lap06:mpi_rank_0][print_backtrace]  10: /malinowski/mvapich2/lib/libmpi.so.12(MPIR_Intercomm_merge_impl+0x209) [0x7f78a0a055c9]
[lap06:mpi_rank_0][print_backtrace]  11: /malinowski/mvapich2/lib/libmpi.so.12(PMPI_Intercomm_merge+0x370) [0x7f78a0a05c50]
[lap06:mpi_rank_0][print_backtrace]  12: /malinowski/pmem2/worker() [0x400818]
[lap06:mpi_rank_0][print_backtrace]  13: /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a0761ed5d]
[lap06:mpi_rank_0][print_backtrace]  14: /malinowski/pmem2/worker() [0x4006f9]
[lap06:mpi_rank_0][print_backtrace]   0: /malinowski/mvapich2/lib/libmpi.so.12(print_backtrace+0x1e) [0x7feba435af4e]
[lap06:mpi_rank_0][print_backtrace]   1: /malinowski/mvapich2/lib/libmpi.so.12(error_sighandler+0x59) [0x7feba435b059]
[lap06:mpi_rank_0][print_backtrace]   2: /lib64/libc.so.6() [0x3a076326a0]
[lap06:mpi_rank_0][print_backtrace]   3: /malinowski/mvapich2/lib/libmpi.so.12(+0x33d770) [0x7feba434d770]
[lap06:mpi_rank_0][print_backtrace]   4: /malinowski/mvapich2/lib/libmpi.so.12(+0x33ddd5) [0x7feba434ddd5]
[lap06:mpi_rank_0][print_backtrace]   5: /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3I_CM_Connect+0x17c) [0x7feba434e53c]
[lap06:mpi_rank_0][print_backtrace]   6: /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_iSendv+0x2ec) [0x7feba431554c]
[lap06:mpi_rank_0][print_backtrace]   7: /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_EagerContigIsend+0xb8) [0x7feba4305348]
[lap06:mpi_rank_0][print_backtrace]   8: /malinowski/mvapich2/lib/libmpi.so.12(MPID_Isend+0x2de) [0x7feba430ad5e]
[lap06:mpi_rank_0][print_backtrace]   9: /malinowski/mvapich2/lib/libmpi.so.12(MPIC_Sendrecv+0x14e) [0x7feba42c37de]
[lap06:mpi_rank_0][print_backtrace]  10: /malinowski/mvapich2/lib/libmpi.so.12(MPIR_Intercomm_merge_impl+0x209) [0x7feba42685c9]
[lap06:mpi_rank_0][print_backtrace]  11: /malinowski/mvapich2/lib/libmpi.so.12(PMPI_Intercomm_merge+0x370) [0x7feba4268c50]
[lap06:mpi_rank_0][print_backtrace]  12: ./manager() [0x4007d9]
[lap06:mpi_rank_0][print_backtrace]  13: /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a0761ed5d]
[lap06:mpi_rank_0][print_backtrace]  14: ./manager() [0x4006a9]

===================================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 120027 RUNNING AT lap06
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:1:0 at lap06] HYDU_sock_write (utils/sock/sock.c:286): write error (Broken pipe)
[proxy:1:0 at lap06] main (pm/pmiserv/pmip.c:272): unable to return exit status upstream
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions


