[mvapich-discuss] mvapich-discuss Digest, Vol 95, Issue 1

Harisangam Sharvari Suhas harisangams at cdac.in
Wed Nov 13 07:07:50 EST 2013


Hello,
   Thanks for the reply. Could you tell me which source files to look at
for this request/reply mechanism in the source code? Also, in which
scenario is a packet of type MPIDI_CH3_PKT_ADDRESS used?
-Thanks
-Sharvari Harisangam
Centre for Development of Advanced Computing (CDAC), Pune

On November 3, 2013 at 10:30 PM mvapich-discuss-request at cse.ohio-state.edu
wrote:
> Send mvapich-discuss mailing list submissions to
> mvapich-discuss at cse.ohio-state.edu
>
> To subscribe or unsubscribe via the World Wide Web, visit
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> or, via email, send a message with subject or body 'help' to
> mvapich-discuss-request at cse.ohio-state.edu
>
> You can reach the person managing the list at
> mvapich-discuss-owner at cse.ohio-state.edu
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of mvapich-discuss digest..."
>
>
> Today's Topics:
>
> 1. ISend and IRecv not finishing in Multithread-MPI
> (Roshan Dathathri)
> 2. Exchange of remote addresses for RDMA operations (harisangams)
> 3. Re: Exchange of remote addresses for RDMA operations
> (Hari Subramoni)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sat, 2 Nov 2013 21:57:10 +0530
> From: Roshan Dathathri <roshan at csa.iisc.ernet.in>
> To: <mvapich-discuss at cse.ohio-state.edu>
> Subject: [mvapich-discuss] ISend and IRecv not finishing in
> Multithread-MPI
> Message-ID:
> <CAEsqPfOh96EU6bpNzmz9vuye5Lf7DvpAi22o9pmmo+0OfwuCKw at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi,
>
> I am running a multi-threaded MPI program using MVAPICH2 1.8.1
> with MV2_ENABLE_AFFINITY=0. On each MPI node there are multiple
> threads: one of them posts Irecv() to receive data from other nodes,
> while the rest may post Irsend() (ready mode) to send data to the other
> nodes. Each thread periodically checks whether its posted
> communication calls have completed using Test(). The application
> hangs because some of the posted sends and receives never complete.
> Here are the statistics collected from the debug logs (one per node)
> generated by an execution of the program:
> Across all nodes, the totals are:
> Irsend() posted: 50339
> Irecv() posted with a matching Irsend(): 50339 (ready mode requires the
> receive to be posted first; more Irecv() may have been posted)
> Irsend() completed: 48062
> Irecv() completed: 47296
> This behavior is consistent across multiple runs on the same number of
> nodes: although the absolute numbers vary a lot, the relative shortfall
> does not vary by much.
> The behavior is similar if Irsend() is replaced with Issend() or Isend().
> The return values of all MPI calls are checked for errors; none of the
> calls returned an error in the execution under consideration.
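> (Checking return codes is only informative if the communicator's error
> handler has been changed from MPI's default of MPI_ERRORS_ARE_FATAL,
> for example:
>
>     MPI_Comm_set_errhandler(MPI_COMM_WORLD, MPI_ERRORS_RETURN);
>
> otherwise a failing call aborts the job instead of returning an error.)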
>
> What could be causing this unexpected behavior? Are there any
> compiler or runtime flags that would help in debugging the issue?
>
> Machine information:
> 32-node InfiniBand cluster of dual-SMP Xeon servers. Each node on the
> cluster consists of two quad-core Intel Xeon E5430 2.66 GHz processors
> with 12 MB L2 cache and 16 GB RAM. The InfiniBand host adapter is a
> Mellanox MT25204 (InfiniHost III Lx HCA).
> The program was run on 32 nodes with 8 OpenMP threads on each node.
>
> Application information:
> A single thread on each node posts multiple anonymous (MPI_ANY_SOURCE)
> Irecv() calls in advance. Once it receives data, it can produce tasks
> that need to be computed. The rest of the threads consume/compute these
> tasks, and can produce more tasks and post multiple Irsend() calls.
> There is no wait or sleep anywhere in the program; the threads are
> spinning (busy-waiting).
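>
> For reference, here is a minimal sketch of the communication pattern
> described above. It is illustrative only, not the actual application
> code: buffer sizes, tags, the neighbor-to-neighbor traffic, and the
> one-message-per-worker assumption are all made up for the example.
>
>     /* Minimal sketch: one receiver thread polls wildcard Irecv()s with
>      * MPI_Test(); worker threads post ready-mode Irsend()s. */
>     #include <mpi.h>
>     #include <omp.h>
>     #include <stdlib.h>
>
>     #define MSG_LEN 1024
>
>     int main(int argc, char **argv)
>     {
>         int provided, rank, nprocs;
>         /* Multithreaded MPI needs MPI_THREAD_MULTIPLE; with MVAPICH2,
>          * also run with MV2_ENABLE_AFFINITY=0. */
>         MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
>         if (provided < MPI_THREAD_MULTIPLE)
>             MPI_Abort(MPI_COMM_WORLD, 1);
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>         MPI_Comm_size(MPI_COMM_WORLD, &nprocs);
>
>         int expected = omp_get_max_threads() - 1; /* one msg per worker */
>         int (*bufs)[MSG_LEN] = malloc(expected * sizeof *bufs);
>         MPI_Request *rreqs = malloc(expected * sizeof *rreqs);
>
>         /* Post wildcard receives in advance, then synchronize so every
>          * rank's receives exist before any ready-mode send starts. */
>         for (int i = 0; i < expected; i++)
>             MPI_Irecv(bufs[i], MSG_LEN, MPI_INT, MPI_ANY_SOURCE, 0,
>                       MPI_COMM_WORLD, &rreqs[i]);
>         MPI_Barrier(MPI_COMM_WORLD);
>
>         #pragma omp parallel
>         {
>             if (omp_get_thread_num() == 0) {
>                 /* Receiver thread: busy-wait, polling with MPI_Test().
>                  * A completed request becomes MPI_REQUEST_NULL, so it
>                  * is counted exactly once. */
>                 int done = 0;
>                 while (done < expected) {
>                     for (int i = 0; i < expected; i++) {
>                         int flag = 0;
>                         if (rreqs[i] == MPI_REQUEST_NULL) continue;
>                         MPI_Test(&rreqs[i], &flag, MPI_STATUS_IGNORE);
>                         if (flag) done++; /* data arrived: produce tasks */
>                     }
>                 }
>             } else {
>                 /* Worker thread: compute, then send one message to the
>                  * right neighbor in ready mode; this is legal only
>                  * because the matching receive is already posted. */
>                 int msg[MSG_LEN] = {0};
>                 MPI_Request sreq;
>                 int flag = 0;
>                 MPI_Irsend(msg, MSG_LEN, MPI_INT, (rank + 1) % nprocs, 0,
>                            MPI_COMM_WORLD, &sreq);
>                 while (!flag)             /* spin until send completes */
>                     MPI_Test(&sreq, &flag, MPI_STATUS_IGNORE);
>             }
>         }
>
>         free(bufs); free(rreqs);
>         MPI_Finalize();
>         return 0;
>     }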
>
> I can share the debug logs if required. Each log is a text file of around
> 6 MB with detailed information about the execution on that node.
> I can also share the source files if required. All the source files put
> together would be a few thousand lines of code.
>
> Please let me know if you need more information.
>
> --
> Thanks,
> Roshan
>
> ------------------------------
>
> Message: 2
> Date: Thu, 31 Oct 2013 14:15:53 +0530
> From: harisangams <harisangams at cdac.in>
> To: mvapich discuss <mvapich-discuss at cse.ohio-state.edu>
> Subject: [mvapich-discuss] Exchange of remote addresses for RDMA
> operations
> Message-ID: <40FB45EDF192438A8259463B01973FD8 at sharvari>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hello,
> I am looking at the source code of MVAPICH2-1.9. Can you tell me how the
> remote addresses of pre-registered buffers, as well as of dynamically
> allocated buffers, are exchanged in the code? Is it done through KVS?
>
> -Thanks,
> Sharvari Harisangam
> Centre for Development of Advanced Computing (CDAC), Pune
>
>
> ------------------------------
>
> Message: 3
> Date: Sun, 3 Nov 2013 09:39:22 -0500
> From: Hari Subramoni <subramoni.1 at osu.edu>
> To: <harisangams at cdac.in>
> Cc: "mvapich-discuss at cse.ohio-state.edu"
> <mvapich-discuss at cse.ohio-state.edu>
> Subject: Re: [mvapich-discuss] Exchange of remote addresses for RDMA
> operations
> Message-ID:
> <CAGUk2tF3xUgUqbd1QzF=NaUAOXpDT0qOLtcmjYxULZYYh=m0kw at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hello Sharvari,
>
> In MVAPICH2, the KVS mechanism is not used to exchange buffer information.
> Information about dynamically as well as statically allocated buffers is
> exchanged in-band using a request/reply mechanism.
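>
> For illustration only, this is roughly the information such an in-band
> request/reply has to carry at the verbs level. This is a schematic
> sketch, not MVAPICH2's actual code; the struct and helper names are
> hypothetical, and QP setup and the send/recv plumbing that carries the
> request and reply messages are omitted.
>
>     #include <infiniband/verbs.h>
>     #include <stddef.h>
>     #include <stdint.h>
>     #include <string.h>
>
>     /* Payload of the (hypothetical) address-exchange request/reply:
>      * the peer needs the buffer's virtual address and its rkey. */
>     struct rdma_addr_info {
>         uint64_t addr;   /* virtual address of the registered buffer */
>         uint32_t rkey;   /* key that authorizes remote RDMA access   */
>     };
>
>     /* Register a buffer (statically or dynamically allocated) and
>      * fill in the info the remote side needs. */
>     struct ibv_mr *export_buffer(struct ibv_pd *pd, void *buf,
>                                  size_t len, struct rdma_addr_info *out)
>     {
>         struct ibv_mr *mr = ibv_reg_mr(pd, buf, len,
>                                        IBV_ACCESS_LOCAL_WRITE |
>                                        IBV_ACCESS_REMOTE_WRITE |
>                                        IBV_ACCESS_REMOTE_READ);
>         if (!mr)
>             return NULL;
>         out->addr = (uint64_t)(uintptr_t)buf;
>         out->rkey = mr->rkey;
>         /* 'out' would now travel to the peer inside the request (or
>          * reply) message over the existing send/recv channel. */
>         return mr;
>     }
>
>     /* Once the peer's reply arrives, its addr/rkey parameterize a
>      * one-sided RDMA write. */
>     void fill_rdma_write(struct ibv_send_wr *wr, struct ibv_sge *sge,
>                          const struct rdma_addr_info *peer)
>     {
>         memset(wr, 0, sizeof *wr);
>         wr->opcode              = IBV_WR_RDMA_WRITE;
>         wr->sg_list             = sge;
>         wr->num_sge             = 1;
>         wr->send_flags          = IBV_SEND_SIGNALED;
>         wr->wr.rdma.remote_addr = peer->addr;
>         wr->wr.rdma.rkey        = peer->rkey;
>         /* then: ibv_post_send(qp, wr, &bad_wr); */
>     }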
>
> Thanks,
> Hari.
>
> -------------
>
>
> Hello,
> I am looking at the source code of MVAPICH2-1.9. Can you tell me
> how the remote addresses of pre-registered buffers, as well as of
> dynamically allocated buffers, are exchanged in the code? Is it done
> through KVS?
>
> -Thanks,
> Sharvari Harisangam
> Centre for Development of Advanced Computing (CDAC), Pune
>
> ------------------------------
>
> Subject: Digest Footer
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
> ------------------------------
>
> End of mvapich-discuss Digest, Vol 95, Issue 1
> **********************************************


