[mvapich-discuss] (no subject)

Hari Subramoni subramoni.1 at osu.edu
Thu Jan 21 15:11:07 EST 2016


Hello Artur,

Thanks for the report. We're taking a look at the issue and will get back
to you once we have a resolution.

Thx,
Hari.

On Thu, Jan 21, 2016 at 6:46 AM, Artur Malinowski <artur.malinowski at pg.gda.pl> wrote:

> Hi,
>
> I have a problem when calling MPI_Intercomm_merge: it always ends with a
> segmentation fault, in both MVAPICH2 2.1 and 2.2. Other MPI functions
> seem to execute without any issues. The same code as below works fine in
> Open MPI and MPICH.
>
> Thanks for your help.
>
> Regards,
> Artur Malinowski
> PhD student at Gdansk University of Technology
>
> ----------------------------
>
> sources and output
>
> // manager
> #include <mpi.h>
> #include <stdio.h>
> #include <unistd.h>
>
> int main(int argc, char *argv[]) {
>     MPI_Comm inter, intra;
>     MPI_Init(&argc, &argv);
>     /* spawn one worker; "inter" is the resulting intercommunicator */
>     MPI_Comm_spawn("/path/to/worker", MPI_ARGV_NULL, 1,
>               MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter,
>               MPI_ERRCODES_IGNORE);
>     printf("manager: before\n");
>     MPI_Intercomm_merge(inter, 0, &intra);    /* segfaults here */
>     printf("manager: after\n");
>     sleep(10);
>     MPI_Finalize();
>     return 0;
> }
>
> // worker
> #include <mpi.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
>
> /* minimal helper so the snippet compiles standalone */
> static void error(const char *msg) { fprintf(stderr, "%s\n", msg); exit(1); }
>
> int main(int argc, char *argv[])
> {
>     MPI_Comm parent, intra;
>     MPI_Init(&argc, &argv);
>     MPI_Comm_get_parent(&parent);
>     if (parent == MPI_COMM_NULL) error("No parent!");
>     printf("worker: before\n");
>     MPI_Intercomm_merge(parent, 1, &intra);   /* segfaults here */
>     printf("worker: after\n");
>     sleep(10);
>     MPI_Finalize();
>     return 0;
> }
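>
> For completeness, a minimal build sketch (mpicc is the compiler wrapper
> from the same MVAPICH2 install; the source file names are placeholders,
> and the spawn path in the manager must point at the built worker binary):
>
>     mpicc -o manager manager.c
>     mpicc -o worker worker.c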
>
> output before segfault:
>
> manager: before
> worker: before
>
> ----------------------------
>
> mpiname -a
>
> MVAPICH2 2.2b Mon Nov 12 20:00:00 EST 2015 ch3:mrail
> MVAPICH2 2.2a Mon Aug 17 20:00:00 EDT 2015 ch3:mrail
> MVAPICH2 2.1 Fri Apr 03 20:00:00 EDT 2015 ch3:mrail
>
> Configuration
> --enable-romio --with-file-system=pvfs2 --with-pvfs2=/path/to/pvfs
>
> ----------------------------
>
> command & params
>
> MV2_SUPPORT_DPM=1
> mpirun -np 1 /path/to/manager
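>
> (for clarity, in a bash-like shell this is roughly
>
>     export MV2_SUPPORT_DPM=1
>     mpirun -np 1 /path/to/manager
>
> with /path/to/manager standing in for the actual manager binary)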
>
> ----------------------------
>
> sys
>
> os: CentOS release 6.5 (Final)
> kernel: 2.6.32 and 3.18.0
>
> ----------------------------
>
> trace
>
> [lap06:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
> (signal 11)
> [lap06:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
> (signal 11)
> [lap06:mpi_rank_0][print_backtrace]   0:
> /malinowski/mvapich2/lib/libmpi.so.12(print_backtrace+0x1e)
> [0x7f78a0af7f4e]
> [lap06:mpi_rank_0][print_backtrace]   1:
> /malinowski/mvapich2/lib/libmpi.so.12(error_sighandler+0x59)
> [0x7f78a0af8059]
> [lap06:mpi_rank_0][print_backtrace]   2: /lib64/libc.so.6() [0x3a076326a0]
> [lap06:mpi_rank_0][print_backtrace]   3:
> /malinowski/mvapich2/lib/libmpi.so.12(+0x33d770) [0x7f78a0aea770]
> [lap06:mpi_rank_0][print_backtrace]   4:
> /malinowski/mvapich2/lib/libmpi.so.12(+0x33ddd5) [0x7f78a0aeadd5]
> [lap06:mpi_rank_0][print_backtrace]   5:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3I_CM_Connect+0x17c)
> [0x7f78a0aeb53c]
> [lap06:mpi_rank_0][print_backtrace]   6:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_iSendv+0x2ec)
> [0x7f78a0ab254c]
> [lap06:mpi_rank_0][print_backtrace]   7:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_EagerContigIsend+0xb8)
> [0x7f78a0aa2348]
> [lap06:mpi_rank_0][print_backtrace]   8:
> /malinowski/mvapich2/lib/libmpi.so.12(MPID_Isend+0x2de) [0x7f78a0aa7d5e]
> [lap06:mpi_rank_0][print_backtrace]   9:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIC_Sendrecv+0x14e) [0x7f78a0a607de]
> [lap06:mpi_rank_0][print_backtrace]  10:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIR_Intercomm_merge_impl+0x209)
> [0x7f78a0a055c9]
> [lap06:mpi_rank_0][print_backtrace]  11:
> /malinowski/mvapich2/lib/libmpi.so.12(PMPI_Intercomm_merge+0x370)
> [0x7f78a0a05c50]
> [lap06:mpi_rank_0][print_backtrace]  12: /malinowski/pmem2/worker()
> [0x400818]
> [lap06:mpi_rank_0][print_backtrace]  13:
> /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a0761ed5d]
> [lap06:mpi_rank_0][print_backtrace]  14: /malinowski/pmem2/worker()
> [0x4006f9]
> [lap06:mpi_rank_0][print_backtrace]   0:
> /malinowski/mvapich2/lib/libmpi.so.12(print_backtrace+0x1e)
> [0x7feba435af4e]
> [lap06:mpi_rank_0][print_backtrace]   1:
> /malinowski/mvapich2/lib/libmpi.so.12(error_sighandler+0x59)
> [0x7feba435b059]
> [lap06:mpi_rank_0][print_backtrace]   2: /lib64/libc.so.6() [0x3a076326a0]
> [lap06:mpi_rank_0][print_backtrace]   3:
> /malinowski/mvapich2/lib/libmpi.so.12(+0x33d770) [0x7feba434d770]
> [lap06:mpi_rank_0][print_backtrace]   4:
> /malinowski/mvapich2/lib/libmpi.so.12(+0x33ddd5) [0x7feba434ddd5]
> [lap06:mpi_rank_0][print_backtrace]   5:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3I_CM_Connect+0x17c)
> [0x7feba434e53c]
> [lap06:mpi_rank_0][print_backtrace]   6:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_iSendv+0x2ec)
> [0x7feba431554c]
> [lap06:mpi_rank_0][print_backtrace]   7:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIDI_CH3_EagerContigIsend+0xb8)
> [0x7feba4305348]
> [lap06:mpi_rank_0][print_backtrace]   8:
> /malinowski/mvapich2/lib/libmpi.so.12(MPID_Isend+0x2de) [0x7feba430ad5e]
> [lap06:mpi_rank_0][print_backtrace]   9:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIC_Sendrecv+0x14e) [0x7feba42c37de]
> [lap06:mpi_rank_0][print_backtrace]  10:
> /malinowski/mvapich2/lib/libmpi.so.12(MPIR_Intercomm_merge_impl+0x209)
> [0x7feba42685c9]
> [lap06:mpi_rank_0][print_backtrace]  11:
> /malinowski/mvapich2/lib/libmpi.so.12(PMPI_Intercomm_merge+0x370)
> [0x7feba4268c50]
> [lap06:mpi_rank_0][print_backtrace]  12: ./manager() [0x4007d9]
> [lap06:mpi_rank_0][print_backtrace]  13:
> /lib64/libc.so.6(__libc_start_main+0xfd) [0x3a0761ed5d]
> [lap06:mpi_rank_0][print_backtrace]  14: ./manager() [0x4006a9]
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 120027 RUNNING AT lap06
> =   EXIT CODE: 11
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
> [proxy:1:0 at lap06] HYDU_sock_write (utils/sock/sock.c:286): write error
> (Broken pipe)
> [proxy:1:0 at lap06] main (pm/pmiserv/pmip.c:272): unable to return exit
> status upstream
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
> (signal 11)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>