[mvapich-discuss] problems with mvapich2-2.0rc1 and Mellanox OFED 2.1.1
Konz, Jeffrey (SSA Solution Centers)
jeffrey.konz at hp.com
Tue Apr 1 13:24:21 EDT 2014
I have a problem with a code that hangs during MPI_Finalize with mvapich2-2.0rc1 on a system running Mellanox OFED 2.1.1.
The same code runs fine with mvapich2-2.0b.
Backtrace from one of the hung processes:
(gdb) backtrace
#0 0x00002b0681440d03 in MPIDI_CH3I_SMP_read_progress ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#1 0x00002b0681438113 in MPIDI_CH3I_Progress ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#2 0x00002b06813e32e7 in MPIC_Wait ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#3 0x00002b06813e3491 in MPIC_Sendrecv ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#4 0x00002b06815002cc in MPIR_Pairwise_Barrier_MV2 ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#5 0x00002b0681500407 in MPIR_Barrier_intra_MV2 ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#6 0x00002b06815005e9 in MPIR_Barrier_MV2 ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#7 0x00002b06814a502f in MPIR_Barrier_impl ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#8 0x00002b06815cfc5b in PMPI_Finalize ()
from /usr3/konz/Apps/MVAPICH/mvapich2-2.0rc1-gnu/lib/libmpich.so.12
#9 0x0000000000401b4d in main (argc=2, argv=0x7fff72e175d8) at fft1d_mpi.c:472
(gdb) quit
RedHat 6.5, kernel 2.6.32-431.el6.x86_64
% rpm -qa | grep ofed
ofed-scripts-2.1-OFED.2.1.1.0.0.x86_64
mlnxofed-docs-2.1-1.0.0.noarch
% ibv_devinfo
hca_id: mlx4_0
    transport: InfiniBand (0)
    fw_ver: 2.30.3200
    node_guid: 24be:05ff:ffa5:0230
    sys_image_guid: 24be:05ff:ffa5:0233
    vendor_id: 0x02c9
    vendor_part_id: 4099
    hw_ver: 0x1
    board_id: HP_0230240019
    phys_port_cnt: 2
        port: 1
            state: PORT_ACTIVE (4)
            max_mtu: 4096 (5)
            active_mtu: 4096 (5)
            sm_lid: 136
            port_lid: 212
            port_lmc: 0x00
            link_layer: InfiniBand
        port: 2
            state: PORT_ACTIVE (4)
            max_mtu: 4096 (5)
            active_mtu: 1024 (3)
            sm_lid: 0
            port_lid: 0
            port_lmc: 0x00
            link_layer: Ethernet
Thanks,
-Jeff
/**********************************************************/
/* Jeff Konz jeffrey.konz at hp.com */
/* Solutions Architect HPC Benchmarking */
/* Americas Strategic Solutions Architecture (SSA) */
/* Hewlett-Packard Company */
/* Office: 248-491-7480 Mobile: 248-345-6857 */
/**********************************************************/