[mvapich-discuss] ISend and IRecv not finishing in Multithread-MPI

Roshan Dathathri roshan at csa.iisc.ernet.in
Sat Nov 2 12:27:10 EDT 2013


Hi,

I am running a multi-threaded MPI program using MVAPICH2 1.8.1
with MV2_ENABLE_AFFINITY=0. On each node there are multiple
threads: one of them posts Irecv() to receive data from other nodes,
while the rest may post Irsend() (ready mode) to send data to other
nodes. Each thread periodically checks whether the posted
communication calls have completed using Test(). The application
hangs because some of the posted sends and receives never complete.
Here are the statistics collected from the debug logs (one per node)
generated by a single execution of the program.
Totals across all nodes:
Irsend() posted: 50339
Irecv() posted with a matching Irsend(): 50339 (since it is ready mode)
(more Irecv() could have been posted)
Irsend() completed: 48062
Irecv() completed: 47296
Across multiple runs on the same number of nodes, this behavior is
consistent: the actual numbers vary a lot, but the relative difference
does not vary much.
The behavior is similar if Irsend() is replaced with Issend() or Isend().
The return values of all MPI calls are checked for errors. None of the
calls returned an error in the execution under consideration.
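
The gap implied by these totals can be worked out with a short script. This is
only an illustrative sketch in Python, not part of the application; the helper
name outstanding() is hypothetical, and the constants are copied from the
totals listed above.

```python
# Totals across all nodes, copied from the statistics above.
IRSEND_POSTED = 50339
IRECV_POSTED = 50339      # only the receives that have a matching Irsend()
IRSEND_COMPLETED = 48062
IRECV_COMPLETED = 47296


def outstanding(posted, completed):
    """Return (count, percent) of posted operations that did not complete."""
    pending = posted - completed
    return pending, 100.0 * pending / posted


if __name__ == "__main__":
    sends, sends_pct = outstanding(IRSEND_POSTED, IRSEND_COMPLETED)
    recvs, recvs_pct = outstanding(IRECV_POSTED, IRECV_COMPLETED)
    print(f"Irsend() outstanding: {sends} ({sends_pct:.1f}%)")
    print(f"Irecv() outstanding:  {recvs} ({recvs_pct:.1f}%)")
    # Difference between completed sends and completed receives.
    print(f"completed sends minus completed receives: "
          f"{IRSEND_COMPLETED - IRECV_COMPLETED}")
```

For this run it gives 2277 outstanding sends (about 4.5%) and 3043 outstanding
receives (about 6.0%), with 766 more sends than receives completed.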

What could be causing this unexpected behavior? Are there any
compiler or runtime flags that would help in debugging it?

Machine information:
32-node InfiniBand cluster of dual-SMP Xeon servers. Each node on the
cluster consists of two quad-core Intel Xeon E5430 2.66 GHz processors
with 12 MB L2 cache and 16 GB RAM. The InfiniBand host adapter is a
Mellanox MT25204 (InfiniHost III Lx HCA).
The program was run on 32 nodes with 8 OpenMP threads on each node.

Application information:
A single thread on each node posts multiple anonymous Irecv()
preemptively. Once it receives data, it can produce tasks which need
to be computed. The rest of the threads consume/compute these tasks,
and can produce more tasks and post multiple Irsend().
There is no wait or sleep anywhere in the program; the threads are
spinning or busy-waiting.

I can share the debug logs if required. Each log is a text file of around
6 MB with detailed information about the execution on that node.
I can also share the source files if required. All the source files put
together would be a few thousand lines of code.

Please let me know if you need more information.

-- 
Thanks,
Roshan