[mvapich-discuss] MPI_Comm_connect/accept segfault during MPIDI_CH3I_comm_create

Neil Spruit nrspruit at gmail.com
Tue Nov 25 18:54:26 EST 2014


Sure, here it is:

MVAPICH2 2.0 Fri Jun 20 20:00:00 EDT 2014 ch3:mrail

Compilation
CC: gcc    -DNDEBUG -DNVALGRIND -g -O2
CXX: g++   -DNDEBUG -DNVALGRIND -g
F77: gfortran -L/lib -L/lib   -g -O2
FC: gfortran   -g

Configuration
--enable-g=debug --enable-threads=multiple

Thanks,
Neil


On Tue, Nov 25, 2014 at 6:06 AM, Jian Lin <lin.2180 at osu.edu> wrote:

> Hi, Neil,
>
> We are testing with your reproducer. Will you please provide your
> detailed configuration of MV2? You can post the output of "mpiname -a"
> command. Thank you!
>
> On Fri, 21 Nov 2014 14:20:54 -0800
> Neil Spruit <nrspruit at gmail.com> wrote:
>
> > Sure, please see my attached reproducer. To build and run it, please
> > follow these steps (a minimal sketch of the pattern appears after the
> > list):
> >
> > 1) To build the test binaries, run ./build.sh.
> > 2) Once built, scp mpi_connect_accept_sink to /tmp on your target
> > remote host.
> > 3) On the remote host, go to /tmp and run "mpiexec -n 1
> > ./mpi_connect_accept_sink" (this is how the scenario is launched).
> > This binary will open a port and write the MPI port name to a file.
> > 4) From your main host, run "mpiexec -n 1 ./mpi_connect_accept
> > remote_hostname", where remote_hostname is the hostname of the system
> > that launched mpi_connect_accept_sink.
> > 5) Once launched on the host, mpi_connect_accept will wait for a key
> > press from the user, read the remote host's opened port, and then
> > attempt to connect.
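For readers without the attachment, here is a minimal sketch of the pattern the steps above describe. It is not the attached reproducer: the port-file path /tmp/mpi_port.txt and the single-source layout (compile with -DSINK for the accept side) are assumptions for illustration, and moving the port file between hosts is left out.

    /* connect_accept_sketch.c -- sketch of the port-file handshake above.
       Build the sink with -DSINK, the client without it. */
    #include <mpi.h>
    #include <stdio.h>
    #include <string.h>

    int main(int argc, char **argv)
    {
        int provided;
        char port[MPI_MAX_PORT_NAME] = {0};
        MPI_Comm inter;
        FILE *f;

        /* thread level "multiple", as discussed later in this thread */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);

    #ifdef SINK
        /* accept side: open a port, publish its name through a file */
        MPI_Open_port(MPI_INFO_NULL, port);
        f = fopen("/tmp/mpi_port.txt", "w");   /* assumed path */
        if (!f) MPI_Abort(MPI_COMM_SELF, 1);
        fprintf(f, "%s\n", port);
        fclose(f);
        /* block until the remote client connects */
        MPI_Comm_accept(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    #else
        /* connect side: wait for a key press, read the published port */
        getchar();
        f = fopen("/tmp/mpi_port.txt", "r");   /* assumed path */
        if (!f) MPI_Abort(MPI_COMM_SELF, 1);
        fgets(port, sizeof(port), f);
        fclose(f);
        port[strcspn(port, "\n")] = '\0';      /* strip trailing newline */
        MPI_Comm_connect(port, MPI_INFO_NULL, 0, MPI_COMM_SELF, &inter);
    #endif

        MPI_Comm_disconnect(&inter);
    #ifdef SINK
        MPI_Close_port(port);
    #endif
        MPI_Finalize();
        return 0;
    }

Launched as in steps 3) and 4), the sink blocks in MPI_Comm_accept and the client in MPI_Comm_connect, which is roughly where the MPIDI_CH3I_comm_create path named in the subject line is exercised.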
> >
> > My current configuration uses a direct (1-to-1) InfiniBand connection
> > between two machines running OFED 3.12 with Mellanox cards.
> >
> > So far I have tested this with both MPICH and Intel MPI, and both are
> > able to connect and exit cleanly.
> >
> > Thank you very much for looking into this issue!
> >
> > Respectfully,
> > Neil Spruit
> >
> > On Fri, Nov 21, 2014 at 1:49 PM, Jonathan Perkins <
> > perkinjo at cse.ohio-state.edu> wrote:
> >
> > > Thanks for the info, Neil. Is there a simple reproducer that you can
> > > share with us? We'll take a look at it and see what the problem may
> > > be.
> > >
> > > On Nov 21, 2014 3:47 PM, "Neil Spruit" <nrspruit at gmail.com>
> > > wrote:
> > >
> > >> Yes, I have both MV2_ENABLE_AFFINITY=0 and MV2_SUPPORT_DPM=1 set
> > >> in this case, since I am performing dynamic process creation and
> > >> using thread level "multiple". MVAPICH2 on my boxes is built
> > >> with --enable-threads=multiple.
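As a side note, a build without --enable-threads=multiple can be detected at runtime: MPI_Init_thread reports the granted thread level, which can be compared against the requested one. A minimal sketch (not from the reproducer):

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int provided;
        /* request full multithreading, as the DPM client/sink pair does */
        MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
        if (provided < MPI_THREAD_MULTIPLE)
            fprintf(stderr, "granted level %d < MPI_THREAD_MULTIPLE\n",
                    provided);
        MPI_Finalize();
        return 0;
    }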
> > >>
> > >> Thanks,
> > >> Neil
> > >>
> > >> On Fri, Nov 21, 2014 at 12:35 PM, Jonathan Perkins <
> > >> perkinjo at cse.ohio-state.edu> wrote:
> > >>
> > >>> Hi, have you tried setting MV2_SUPPORT_DPM=1? Please take a look at
> > >>> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.0.1-userguide.html#x1-22700011.73
> > >>> for more information on this runtime variable.
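One common way to set it (an illustration, not a line from the user guide) is to prefix both launch commands from the reproducer, so the variable is exported into each job's environment:

    MV2_SUPPORT_DPM=1 mpiexec -n 1 ./mpi_connect_accept_sink
    MV2_SUPPORT_DPM=1 mpiexec -n 1 ./mpi_connect_accept remote_hostname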
> > >>>
> > >>
> > >>
>
>
>
> --
> Jian Lin
> http://linjian.org
>
>

