[mvapich-discuss] OpenFabrics
Sayantan Sur
surs at cse.ohio-state.edu
Fri Aug 4 17:16:38 EDT 2006
Hi Michael,
Di Domenico, Michael wrote:
> I just built an OpenFabric v1.0 cluster of two machines on a pair of
> Quad Proc Itanium servers. Both machines have Mellanox HCA’s with
> Mellanox firmware.
>
> I also downloaded the OSU version of mvapich from the website instead
> of using the bundled version.
>
> Everything compiles fine, simple cpi tests work okay, netpipe runs
> okay, so I’m pretty sure my fabric is okay.
>
> But when I try to run osu_latency, osu_bw, or osu_bibw tests, it just
> stalls.
>
> How can I determine where the program is stalling?
>
To see where the programs are stalling, you can rebuild MVAPICH with
debugging symbols (by adding -ggdb to the CFLAGS in
make.mvapich.gen2). Once the build is done, run the test as usual. When
it hangs, ssh to the node on which the test is running, find the
process ID of the test, and attach gdb to it with the following
commands:
$ gdb -p <PID>
(gdb) bt
This will show which function the test is hanging in.
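If several ranks are stuck, the same steps can be scripted. The snippet
below is only a rough sketch, not part of MVAPICH: dump_backtraces and
the GDB override variable are hypothetical names I made up for
illustration. It assumes gdb is installed on the node and that you are
allowed to attach to the processes.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MVAPICH): print a backtrace of
# every thread for each PID given on the command line.
# GDB is an assumed override hook, e.g. GDB=echo previews the commands
# without attaching to anything.
GDB="${GDB:-gdb}"

dump_backtraces() {
    for pid in "$@"; do
        echo "=== pid $pid ==="
        # -p attaches to the running process; -batch runs the -ex
        # command and then exits, detaching from the process.
        "$GDB" -p "$pid" -batch -ex "thread apply all bt"
    done
}
```

On a node where pgrep is available, it could be called like this for
the latency test:

$ dump_backtraces $(pgrep -x osu_latency)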
However, the tests should not hang in the first place. Can you run
other benchmarks, such as IMB? Also, did you modify make.mvapich.gen2
at all before running these tests? If so, which flags did you use?
Thanks,
Sayantan.
--
http://www.cse.ohio-state.edu/~surs