[mvapich-discuss] MPI_Comm_connect/accept segfault during MPIDI_CH3I_comm_create

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Nov 21 17:47:55 EST 2014


Thank you for the reproducer and the instructions. Many of us are still
traveling (from SC14) but will look at this as soon as we can.
On Nov 21, 2014 4:21 PM, "Neil Spruit" <nrspruit at gmail.com> wrote:

> Sure, please see my attached reproducer. To build and run it, please follow
> these steps:
>
> 1) To build the test binaries, run ./build.sh
> 2) Once built, scp mpi_connect_accept_sink to /tmp on your target
> remote host
> 3) On the remote host, go to /tmp and run "mpiexec -n 1
> ./mpi_connect_accept_sink" (this is how this scenario is launched). This
> binary will open a port and write the MPI port to a file
> 4) From your main host, run "mpiexec -n 1 ./mpi_connect_accept
> remote_hostname", where remote_hostname is the hostname of the system that
> launched mpi_connect_accept_sink
> 5) Once launched on the host, mpi_connect_accept will wait for a key
> press from the user, then read the remote host's opened port and attempt to
> connect.
>
> My current configuration uses a 1-1 InfiniBand connection between two
> machines with OFED 3.12 and Mellanox cards.
>
> So far I have tested this with both MPICH and Intel MPI, and both are able
> to connect and exit cleanly.
>
> Thank you very much for looking into this issue!
>
> Respectfully,
> Neil Spruit
>
> On Fri, Nov 21, 2014 at 1:49 PM, Jonathan Perkins <
> perkinjo at cse.ohio-state.edu> wrote:
>
>> Thanks for the info, Neil. Is there a simple reproducer that you can share
>> with us? We'll take a look at it and see what the problem may be.
>> On Nov 21, 2014 3:47 PM, "Neil Spruit" <nrspruit at gmail.com> wrote:
>>
>>> Yes, I have MV2_ENABLE_AFFINITY=0 and MV2_SUPPORT_DPM=1 both set in
>>> this case, since I am performing dynamic process creation and using thread
>>> level "multiple". I have mvapich on my boxes built
>>> with --enable-threads=multiple.
>>>
>>> Thanks,
>>> Neil
>>>
>>> On Fri, Nov 21, 2014 at 12:35 PM, Jonathan Perkins <
>>> perkinjo at cse.ohio-state.edu> wrote:
>>>
>>>> Hi, have you tried setting MV2_SUPPORT_DPM=1? Please take a look at
>>>>
>>>> http://mvapich.cse.ohio-state.edu/static/media/mvapich/mvapich2-2.0.1-userguide.html#x1-22700011.73
>>>> for more information on this runtime variable.
>>>>
>>>
>>>
>
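
For reference, the runtime settings discussed in this thread can be
sketched as a small environment fragment. This is only an illustration
assembled from the values quoted above, not part of the attached
reproducer. It assumes the variables are exported in the shell on both
hosts before launching, and that mpiexec forwards the shell environment
to the MPI processes; neither assumption is stated in the thread.

```shell
# Settings quoted in the thread; apply on both hosts before launching.
# Assumption: mpiexec passes the exported environment to the MPI processes.
export MV2_SUPPORT_DPM=1      # runtime variable suggested for MPI_Comm_connect/accept
export MV2_ENABLE_AFFINITY=0  # set by the reporter alongside thread level "multiple"

# Launch commands as given in the reproducer steps (left as comments here):
#   remote host, from /tmp:  mpiexec -n 1 ./mpi_connect_accept_sink
#   main host:               mpiexec -n 1 ./mpi_connect_accept remote_hostname
```

The launch lines are left commented out because they depend on the
reproducer binaries, which are only available from the attachment.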