[mvapich-discuss] Error in MPI_Neighbor_alltoallv

Hari Subramoni subramoni.1 at osu.edu
Thu Nov 19 15:08:31 EST 2015


Hello,

This is indeed an out-of-memory situation. We are working on a patch for
it and will get back to you soon. Do you happen to have a reproducer for
the error? Could you also let us know your system configuration and the
version of MVAPICH2 you're using?

Thx,
Hari.
On Nov 18, 2015 2:17 PM, "Phanisri Pradeep Pratapa" <ppratapa at gatech.edu>
wrote:

> Hi,
>
> I am running a C++ code with MPI 3.0 through mvapich2/2.1.
>
> I use MPI_Neighbor_alltoallv in my code, and it needs to be called in
> every iteration. I have created a periodic Cartesian topology to enable
> local communication.
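>
> For reference, here is a minimal sketch of my setup (the 3-D grid, the
> per-neighbor counts, and the iteration count are simplified placeholders,
> not my actual values):
>
>   #include <mpi.h>
>   #include <vector>
>
>   int main(int argc, char **argv) {
>       MPI_Init(&argc, &argv);
>
>       int nprocs;
>       MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>
>       // Periodic 3-D Cartesian topology; MPI_Dims_create picks the grid.
>       int dims[3] = {0, 0, 0}, periods[3] = {1, 1, 1};
>       MPI_Dims_create(nprocs, 3, dims);
>       MPI_Comm cart;
>       MPI_Cart_create(MPI_COMM_WORLD, 3, dims, periods, 0, &cart);
>
>       // A Cartesian communicator has 2*ndims = 6 neighbors, so the
>       // count/displacement arrays have 6 entries (one double each here;
>       // my real code uses larger, non-uniform counts).
>       const int n = 6;
>       std::vector<int> scounts(n, 1), sdispls(n), rcounts(n, 1), rdispls(n);
>       for (int i = 0; i < n; ++i) { sdispls[i] = i; rdispls[i] = i; }
>       std::vector<double> sbuf(n, 0.0), rbuf(n, 0.0);
>
>       // The neighborhood exchange is called once per outer iteration.
>       for (int it = 0; it < 1000; ++it) {
>           MPI_Neighbor_alltoallv(sbuf.data(), scounts.data(), sdispls.data(),
>                                  MPI_DOUBLE,
>                                  rbuf.data(), rcounts.data(), rdispls.data(),
>                                  MPI_DOUBLE, cart);
>       }
>
>       MPI_Comm_free(&cart);
>       MPI_Finalize();
>       return 0;
>   }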
> I found that this function works correctly for a few iterations and
> then fails with the following error:
>
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
> [cli_187]: aborting job:
> Fatal error in PMPI_Ineighbor_alltoallv: Other MPI error, error stack:
> PMPI_Ineighbor_alltoallv(229).......: MPI_Ineighbor_alltoallv(
>     sendbuf=0x2aaac9fa5a20, sendcounts=0x2aaac81f1ab0,
>     sdispls=0x2aaac81f05d0, sendtype=MPI_DOUBLE,
>     recvbuf=0x2aaac9f96050, recvcounts=0x2aaac81f4470,
>     rdispls=0x2aaac81f2f90, recvtype=MPI_DOUBLE,
>     comm=0x84000006, request=0x7fffff...
> PMPI_Ineighbor_alltoallv(215).......:
> MPIR_Ineighbor_alltoallv_impl(112)..:
> MPIR_Ineighbor_alltoallv_default(78):
> MPID_Sched_recv(599)................:
> MPIDU_Sched_add_entry(425)..........: Out of memory
> %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
>
> This happens only when each processor is communicating with all the
> processors (or more, due to the periodic wrap-around) and the total number
> of processors is greater than or equal to 216 (4 nodes). The function
> works fine in all the other cases I have tested. The failure occurs with
> both the blocking and the non-blocking versions; the non-blocking variant
> I tested is sketched below. Moreover, I encounter this behaviour in
> roughly 8 out of 10 runs of the code (with the same inputs, commands,
> options, etc.), while the other 2 runs complete successfully. I have
> debugged the code and run memory checks, and found no memory leaks.
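>
> For completeness, the non-blocking variant looks like this (continuing
> the sketch above, so the buffers, count/displacement arrays, and the
> "cart" communicator are the same placeholders):
>
>   // Non-blocking variant: start the neighborhood collective, overlap
>   // any local work that does not touch the buffers, then complete it.
>   MPI_Request req;
>   MPI_Ineighbor_alltoallv(sbuf.data(), scounts.data(), sdispls.data(),
>                           MPI_DOUBLE,
>                           rbuf.data(), rcounts.data(), rdispls.data(),
>                           MPI_DOUBLE, cart, &req);
>   MPI_Wait(&req, MPI_STATUS_IGNORE);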
>
> I found a similar problem that someone else reported on this list, but
> there seems to have been no final resolution to it:
>
> http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2014-June/005002.html
>
> Please let me know if somebody can help.
>
> Thank you,
>
> Regards,
>
> Pradeep
>