[mvapich-discuss] MVAPICH-0.9.8 lockup with OFED-1.1-rc3

Andrew Dobbie adobbie at cims.carleton.ca
Fri Sep 8 11:09:11 EDT 2006


I'm not sure whether this problem is caused by MVAPICH directly, but I'm
certain you will know what the cause is.

I downloaded and compiled mvapich-0.9.8, but neither my application nor
any of the benchmarks will run.  Every program locks up in the same
spot, so I've included the backtrace.

Here's the backtrace I got by attaching gdb to osu_bw:

(gdb) bt
#0  0xb7f457c3 in smpi_init ()
   from /usr/local/mvapich/lib/shared/libmpich.so.1.0
#1  0xb7f43bc2 in MPID_Init ()
   from /usr/local/mvapich/lib/shared/libmpich.so.1.0
#2  0xb7f3aea8 in MPIR_Init ()
   from /usr/local/mvapich/lib/shared/libmpich.so.1.0
#3  0xb7f3acbe in PMPI_Init ()
   from /usr/local/mvapich/lib/shared/libmpich.so.1.0
#4  0x08048794 in main (argc=1, argv=0xbfd2e8e4) at osu_bw.c:80

I managed to get everything running properly by removing -D_SMP_ and
-D_SMP_RNDV_ from the build flags.  I don't recall any warnings about
-D_SMP_ in the documentation.  Am I correct in assuming this should not
happen?
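
As a rough sketch of that workaround, assuming the flags are passed
through a CFLAGS shell variable: the flag list below is illustrative
only (it is not copied from the MVAPICH build script), and
strip_smp_flags is a hypothetical helper name.

```sh
# Illustrative flag list only; the real CFLAGS in a build script will differ.
CFLAGS="-O2 -D_SMP_ -D_SMP_RNDV_ -DEXAMPLE_FLAG"

# Hypothetical helper: drop the two SMP defines and keep every other flag.
strip_smp_flags() {
    out=""
    for flag in $1; do
        case "$flag" in
            -D_SMP_|-D_SMP_RNDV_) ;;          # skip these two defines
            *) out="${out:+$out }$flag" ;;    # keep everything else
        esac
    done
    printf '%s\n' "$out"
}

CFLAGS=$(strip_smp_flags "$CFLAGS")
echo "$CFLAGS"
```

The library has to be rebuilt from clean after changing the flags so
that no objects compiled with the old defines are left behind.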

I am using Mellanox PCI-X cards from Voltaire with 3.4.0 firmware and
the OFED mthca driver on a 2.6.17 kernel.  All my machines are dual
Opteron 248s.

Thanks in advance.



