[mvapich-discuss] MVAPICH-0.9.8 lockup with OFED-1.1-rc3

Andrew Dobbie adobbie at cims.carleton.ca
Fri Sep 8 15:48:16 EDT 2006


I don't have the 64bit RHEL installed right now.  I can install it this
afternoon and hopefully I will have time to test with it today.  I will
post my results as soon as I've done the tests.

-Andrew

On Fri, 2006-09-08 at 14:34 -0500, Sayantan Sur wrote:
> Hello Andrew,
> 
> Andrew Dobbie wrote:
> 
> >Hi Dr. Panda,
> >
> >I just tested using OFED-1.0 and I am getting the exact same results as
> >I did with 1.1-rc3 using gen2.  The systems I am using have 32bit RHEL4
> >update 3 installed.
> >
> >Is there anything else you would like me to try?
> >
> I am just wondering whether you see the same behavior with 64-bit RHEL4 
> using OFED-1.0/1.1?
> 
> Thanks,
> Sayantan.
> 
> >-Andrew
> >
> >On Fri, 2006-09-08 at 12:42 -0400, Dhabaleswar Panda wrote:
> >
> >>Andrew - Thanks for your note. Since OFED-1.1 is still being finalized
> >>and has not been released, we have not installed it on any of our
> >>systems yet.  We have tested MVAPICH 0.9.8 with OFED-1.0 and it works without
> >>any problem. On your system, do you see this problem with OFED-1.0 or
> >>only with OFED-1.1-rc3?
> >>
> >>Thanks, 
> >>
> >>DK
> >>
> >>>I'm not sure if this problem is caused by MVAPICH directly but I'm
> >>>certain you will know what the cause is.
> >>>
> >>>I downloaded and compiled mvapich-0.9.8 but my application would not run
> >>>and all the benchmarks failed to run.  Every application locks up in the
> >>>same spot so I've included the backtrace.
> >>>
> >>>Here's the backtrace I got attaching gdb to osu_bw.
> >>>
> >>>(gdb) bt
> >>>#0  0xb7f457c3 in smpi_init ()
> >>>from /usr/local/mvapich/lib/shared/libmpich.so.1.0
> >>>#1  0xb7f43bc2 in MPID_Init ()
> >>>from /usr/local/mvapich/lib/shared/libmpich.so.1.0
> >>>#2  0xb7f3aea8 in MPIR_Init ()
> >>>from /usr/local/mvapich/lib/shared/libmpich.so.1.0
> >>>#3  0xb7f3acbe in PMPI_Init ()
> >>>from /usr/local/mvapich/lib/shared/libmpich.so.1.0
> >>>#4  0x08048794 in main (argc=1, argv=0xbfd2e8e4) at osu_bw.c:80
> >>>
> >>>I managed to get everything running properly by removing -D_SMP_ and
> >>>-D_SMP_RNDV_ from the build flags.  I don't recall any warnings about
> >>>-D_SMP_ in the documentation.  Am I correct in assuming this should not
> >>>happen?
> >>>
> >>>I am using Mellanox PCI-X cards from Voltaire with 3.4.0 firmware and
> >>>OFED mthca driver on a 2.6.17 kernel.  All my machines are dual Opteron
> >>>248s.
> >>>
> >>>Thanks in advance.
> >>>
> >>>
> >>>_______________________________________________
> >>>mvapich-discuss mailing list
> >>>mvapich-discuss at mail.cse.ohio-state.edu
> >>>http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >>>
> >>>      
> >>>
> 
> 



