[mvapich-discuss] mvapich thread multiple problem
Marcin Zalewski
marcin.zalewski at gmail.com
Tue Mar 12 17:20:26 EDT 2013
Hello.
I am using mvapich 1.9b with QLogic (Intel) adapters. I configured
mvapich like this:
    ./configure --enable-fast=all,O3 --enable-thread-cs=per-object \
        --enable-refcount=lock-free --enable-handle-allocation=tls \
        --with-atomic-primitives --enable-shared --with-ch3-rank-bits=16 \
        --enable-hybrid --with-device=ch3:psm \
        --with-psm=/xyz/infinipath-psm-3.1-364.1140_open/usr
I am trying to run a simple test application with 1 thread on 2 hosts,
but it makes no progress. Upon investigation, it seems that my
application is stuck inside mvapich (trace at the end of the email). The
same test works OK with mpich. What should I do to debug this further?
Could it be a problem with my psm installation? I am able to run the
same application in thread serialized mode. I would appreciate any
pointers you could give me on what to do next.
Thank you,
Marcin
(gdb) info thread
  Id   Target Id         Frame
  2    Thread 0x7fa2d4c39700 (LWP 15802) "mpi_test_bfs_th" 0x00007fa2d63fc303 in poll () from /lib/x86_64-linux-gnu/libc.so.6
* 1    Thread 0x7fa2d8191b40 (LWP 15801) "mpi_test_bfs_th" 0x00007fa2d6ef4a62 in ?? () from /lib/x86_64-linux-gnu/libpthread.so.0
(gdb) bt
#0 0x00007fa2d6ef4a62 in ?? () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007fa2d7a11603 in psm_irecv () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#2 0x00007fa2d7a0927d in MPID_Irecv () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#3 0x00007fa2d79cef63 in MPIC_Sendrecv () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#4 0x00007fa2d79cf717 in MPIC_Sendrecv_ft () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#5 0x00007fa2d7a5fca7 in MPIR_Allreduce_pt2pt_rd_MV2 () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#6 0x00007fa2d7a62cda in MPIR_Allreduce_new_MV2 () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#7 0x00007fa2d79dff96 in MPIR_Get_contextid_sparse_group () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#8 0x00007fa2d79e08d0 in MPIR_Comm_copy () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#9 0x00007fa2d7a7d3ce in MPIR_Comm_dup_impl () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
#10 0x00007fa2d7a7d422 in PMPI_Comm_dup () from /opt/mvapich/2-1.9b/lib/libmpich.so.10
... [snip]
(gdb) thread 2
[Switching to thread 2 (Thread 0x7fa2d4c39700 (LWP 15802))]
#0 0x00007fa2d63fc303 in poll () from /lib/x86_64-linux-gnu/libc.so.6
(gdb) bt
#0 0x00007fa2d63fc303 in poll () from /lib/x86_64-linux-gnu/libc.so.6
#1 0x00007fa2d7125587 in ips_ptl_pollintr (rcvthreadc=0x1dcbae8) at ptl_rcvthread.c:322
#2 0x00007fa2d6eefe9a in start_thread () from /lib/x86_64-linux-gnu/libpthread.so.0
#3 0x00007fa2d6407cbd in clone () from /lib/x86_64-linux-gnu/libc.so.6
#4 0x0000000000000000 in ?? ()