[mvapich-discuss] MPI_Comm_connect/accept segfault during MPIDI_CH3I_comm_create

Jian Lin lin.2180 at osu.edu
Tue Nov 25 09:06:22 EST 2014


Hi, Neil,

We are testing with your reproducer. Could you please provide the
detailed configuration of your MV2 build? You can post the output of
the "mpiname -a" command. Thank you!
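
A minimal sketch of gathering that information in one pass, assuming a
POSIX shell on the machine where MV2 is installed. The helper name
report_mv2_config is hypothetical; the two MV2_* variables are the
runtime settings mentioned further down in this thread, and the values
shown are the ones reported there.

```shell
# Runtime settings reported in this thread (dynamic process management
# enabled, CPU affinity disabled).
export MV2_SUPPORT_DPM=1
export MV2_ENABLE_AFFINITY=0

# Hypothetical helper: print the MV2 build details (if mpiname is on
# PATH) followed by the runtime variables currently in effect.
report_mv2_config() {
    if command -v mpiname >/dev/null 2>&1; then
        mpiname -a || echo "mpiname -a failed"
    else
        echo "mpiname not found in PATH"
    fi
    echo "MV2_SUPPORT_DPM=${MV2_SUPPORT_DPM:-unset}"
    echo "MV2_ENABLE_AFFINITY=${MV2_ENABLE_AFFINITY:-unset}"
}

report_mv2_config
```

Running this on both hosts and posting the two outputs side by side
would show whether the builds and runtime settings match.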

On Fri, 21 Nov 2014 14:20:54 -0800
Neil Spruit <nrspruit at gmail.com> wrote:

> Sure, please see my attached reproducer, to build and run please
> follow these steps:
> 
> 1) to build the test binaries run ./build.sh
> 2) once built scp the mpi_connect_accept_sink to /tmp on your target
> remote host
> 3) from the remote host go to /tmp and run "mpiexec -n 1
> ./mpi_connect_accept_sink" (this is the method in which this scenario
> is launched). This binary will open a port and write the MPI port to
> a file
> 4) From your main Host run "mpiexec -n 1 ./mpi_connect_accept
> remote_hostname" where remote_hostname is the hostname of the system
> that launched mpi_connect_accept_sink
> 5) Once launched on the host the mpi_connect_accept will wait for a
> key press from the user to read the remote host's opened port, then
> attempt to connect.
> 
> My current configuration is using an InfiniBand 1-1 connection
> between two machines with OFED 3.12 with Mellanox cards.
> 
> So far I have tested this with both mpich and Intel MPI and both are
> able to connect and exit cleanly.
> 
> Thank you very much for looking into this issue!
> 
> Respectfully,
> Neil Spruit
> 
> On Fri, Nov 21, 2014 at 1:49 PM, Jonathan Perkins <
> perkinjo at cse.ohio-state.edu> wrote:
> 
> > Thanks for the info Neil. Is there a simple reproducer that you can
> > share with us? We'll take a look at it and see what the problem
> > may be.
> >
> > On Nov 21, 2014 3:47 PM, "Neil Spruit" <nrspruit at gmail.com>
> > wrote:
> >
> >> Yes, I have MV2_ENABLE_AFFINITY=0 and MV2_SUPPORT_DPM=1 both set
> >> in this case since I am performing dynamic process creation and
> >> using thread level "multiple". I have mvapich on my boxes built
> >> with --enable-threads=multiple.
> >>
> >> Thanks,
> >> Neil
> >>
> >> On Fri, Nov 21, 2014 at 12:35 PM, Jonathan Perkins <
> >> perkinjo at cse.ohio-state.edu> wrote:
> >>
> >>> Hi, have you tried setting MV2_SUPPORT_DPM=1? Please take a look
> >>> at
> >>>
> >>> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.0.1-userguide.html#x1-22700011.73
> >>> for more information on this runtime variable.
> >>>
> >>
> >>



-- 
Jian Lin
http://linjian.org


