[mvapich-discuss] help getting multirail working
James T Klosowski
jklosow at us.ibm.com
Fri Apr 14 13:49:09 EDT 2006
Abhinav,
Thanks so much for your immediate response! I reran the benchmark using
the NUM_PORTS and NUM_HCAS environment variables as you suggested, and it
worked just fine.
I was a little (ok, more than a little) disappointed in the bandwidth I got, but
I'll continue working with it (trying the STRIPING_THRESHOLD environment
variable, and both ports on each HCA) to see if I can get some more... For
what it's worth, the first run maxed out at around 850 MB/s for the bandwidth
test and 1100 MB/s for the bi-directional test. Both of these values are
much less than your results for ia32 multirail (1712 and 1814 MB/s
respectively for the bw and bibw tests). (I am using EM64T machines
with PCI-X, so that's the closest number to compare to.) No doubt some
of that is because of the limited I/O bus speed in my nodes, but I'll
see what else it may be. I am using the gcc compiler (not icc like your
tests)... I'll see if I can figure out why there is such a big discrepancy.
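As a sanity check on the bus limit, a quick back-of-the-envelope calculation
(assuming 64-bit PCI-X slots; protocol overhead is not included, so real
throughput will be lower):

```shell
# Rough peak for a 64-bit / 133 MHz PCI-X bus:
# 8 bytes per cycle * 133 MHz = 1064 MB/s (overhead ignored)
echo "$((8 * 133)) MB/s"
```

That puts the 850 MB/s single-rail result within sight of what one 133 MHz
bus could deliver at all, so the bus is a plausible bottleneck.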
Thanks again! I really appreciate your help.
Best,
Jim
Abhinav Vishnu <vishnu at cse.ohio-state.edu>
04/14/2006 12:50 PM
To
James T Klosowski/Watson/IBM at IBMUS
cc
mvapich-discuss at cse.ohio-state.edu
Subject
Re: [mvapich-discuss] help getting multirail working
Hi James,
Sorry, I forgot to mention the MVAPICH user guide, which also
provides a list of configuration examples, debugging information, and
a list of environment variables that can be used.
Please refer to the user guide at:
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/mvapich_user_guide.html
In Section 7 of the user guide, there are a couple of troubleshooting
examples. Section 7.3.3 covers the case in which a user application aborts
with VAPI_RETRY_EXC_ERR.
VAPI provides a utility, vstat, which can be used to check the status
of the IB communication ports. For example:
[vishnu at e8-lustre:~] vstat
    hca_id=InfiniHost_III_Ex0
    pci_location={BUS=0x04,DEV/FUNC=0x00}
    vendor_id=0x02C9
    vendor_part_id=0x6282
    hw_ver=0xA0
    fw_ver=5.1.0
    PSID not available -- FW not installed using fail-safe mode
    num_phys_ports=2
        port=1
            port_state=PORT_ACTIVE   <-
            sm_lid=0x0069
            port_lid=0x00a9
            port_lmc=0x00
            max_mtu=2048
        port=2
            port_state=PORT_DOWN
            sm_lid=0x0000
            port_lid=0x00aa
            port_lmc=0x00
            max_mtu=2048
vstat on your machine(s) should list two HCAs. Please make sure that
the first port on both HCAs is in the PORT_ACTIVE state. If they
are in the PORT_INITIALIZE state instead, the subnet manager can be
started in the following manner:
[vishnu at e10-lustre:~] sudo opensm -o
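If you want to script the port check, a small sketch like the following
works; the here-string stands in for the output of running vstat on the
node, and the exact field layout may vary by VAPI version:

```shell
# Count PORT_ACTIVE lines in saved vstat output. The sample text below
# is an illustrative stand-in for `vstat` run on the node itself.
vstat_out='port=1
port_state=PORT_ACTIVE
port=2
port_state=PORT_DOWN'
active=$(printf '%s\n' "$vstat_out" | grep -c 'PORT_ACTIVE')
echo "active ports: $active"
```

If the count is lower than the number of rails you expect, start the
subnet manager as shown above and re-run vstat.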
>
> My current configuration is simply 2 nodes, each with 2 HCAs (MT23108). I
> downloaded the MVAPICH 0.9.7 version (for VAPI) and compiled it using the
> TopSpin stack (3.1.0-113).
>
> I'm running on RHEL 4 U1 machines. In one machine, both HCAs are on
> different PCI-X 133 buses; in the other machine, one HCA is on a 133 MHz bus
> and the other is on a 100 MHz bus.
Even though the two machines do not have exactly the same configuration, it
should not be a problem to get them running together using multirail.
>
>
> I first compiled using make.mvapich.vapi and was able to run the OSU
> benchmarks without any problems.
>
> I then compiled successfully using make.mvapich.vapi_multirail, but when I
> tried to run the OSU benchmarks, I get VAPI_RETRY_EXC_ERR midway through
> the benchmark, ... presumably when the code is finally trying to use the
> 2nd rail.
>
> Below is the output of my benchmark run. It is consistent in that it will
> always fail after the 4096 test. Again, using the version compiled
> without multirail support works just fine (without changing anything other
> than the version of mvapich I'm using).
>
In my previous email, I forgot to mention the environment variable
STRIPING_THRESHOLD. The multirail MVAPICH uses this value to determine
whether a message will be striped across the multiple available paths,
which can be a combination of multiple ports and multiple HCAs.
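To illustrate how the threshold behaves, here is a sketch of the decision
(the 8192-byte value is just an assumed example for illustration, not a
documented default):

```shell
# Illustrative striping decision: messages at or above the threshold are
# striped across the available rails; smaller ones use a single rail.
STRIPING_THRESHOLD=8192   # assumed example value
msg_size=16384
if [ "$msg_size" -ge "$STRIPING_THRESHOLD" ]; then
    echo "stripe across rails"
else
    echo "single rail"
fi
```

Lowering the threshold makes striping kick in for smaller messages, at the
cost of extra per-message overhead on each rail.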
Sections 9.4 and 9.5 of the user guide describe the environment variables
NUM_PORTS and NUM_HCAS. The two can be combined: for example, on a cluster
in which each node has 2 HCAs and 2 ports per HCA, setting NUM_PORTS=2 and
NUM_HCAS=2 allows multirail to use all ports and all HCAs.
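As a concrete sketch, an osu_bw launch using all four rails in that
configuration would look like the following (modeled on the command line
later in this thread, with the paths taken from that example):

```shell
./mpirun_rsh -rsh -np 2 -hostfile /root/hostfile \
    NUM_PORTS=2 NUM_HCAS=2 \
    /root/OSU-benchmarks/osu_bw
```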
> If you have any suggestions on what to try, I'd appreciate it. I'm not
> exactly sure how I should set up the IP addresses... so I included that
> information below too. I am using only one port on each of the two HCAs,
> and all four cables connect to the same TopSpin TS120 switch.
>
The following change to the command line should solve the problem for you:
./mpirun_rsh -rsh -np 2 -hostfile /root/hostfile
NUM_PORTS=1 NUM_HCAS=2 /root/OSU-benchmarks/osu_bw
Please let us know if the problem persists.
Thanks and best regards,
-- Abhinav
> Thanks in advance!
>
> Jim
>
>
>
> ./mpirun_rsh -rsh -np 2 -hostfile /root/hostfile
> /root/OSU-benchmarks/osu_bw
>
> # OSU MPI Bandwidth Test (Version 2.2)
> # Size Bandwidth (MB/s)
> 1 0.284546
> 2 0.645845
> 4 1.159683
> 8 2.591093
> 16 4.963886
> 32 10.483747
> 64 20.685824
> 128 36.271862
> 256 78.276241
> 512 146.724578
> 1024 237.888853
> 2048 295.633345
> 4096 347.127837
> [0] Abort: [vis460.watson.ibm.com:0] Got completion with error,
> code=VAPI_RETRY_EXC_ERR, vendor code=81
> at line 2114 in file viacheck.c
> Timeout alarm signaled
> Cleaning up all processes ...done.
>
>
> My machine file is just the 2 hostnames:
>
> cat /root/hostfile
> vis460
> vis30
>
>
>
>
> ifconfig
> eth0 Link encap:Ethernet HWaddr 00:0D:60:98:20:B8
> inet addr:9.2.12.221 Bcast:9.2.15.255 Mask:255.255.248.0
> inet6 addr: fe80::20d:60ff:fe98:20b8/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:9787508 errors:841 dropped:0 overruns:0 frame:0
> TX packets:1131808 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:926406322 (883.4 MiB) TX bytes:94330491 (89.9 MiB)
> Interrupt:185
>
> ib0 Link encap:Ethernet HWaddr 93:C9:C9:6F:5D:7C
> inet addr:10.10.5.46 Bcast:10.10.5.255 Mask:255.255.255.0
> inet6 addr: fe80::6bc9:c9ff:fe66:c15b/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
> RX packets:175 errors:0 dropped:0 overruns:0 frame:0
> TX packets:174 errors:0 dropped:18 overruns:0 carrier:0
> collisions:0 txqueuelen:128
> RX bytes:11144 (10.8 KiB) TX bytes:11638 (11.3 KiB)
>
> ib2 Link encap:Ethernet HWaddr 65:9A:4B:CF:8D:00
> inet addr:12.12.5.46 Bcast:12.12.5.255 Mask:255.255.255.0
> inet6 addr: fe80::c19a:4bff:fed2:f3a0/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:2044 Metric:1
> RX packets:257 errors:0 dropped:0 overruns:0 frame:0
> TX packets:235 errors:0 dropped:30 overruns:0 carrier:0
> collisions:0 txqueuelen:128
> RX bytes:15180 (14.8 KiB) TX bytes:15071 (14.7 KiB)
>
> lo Link encap:Local Loopback
> inet addr:127.0.0.1 Mask:255.0.0.0
> inet6 addr: ::1/128 Scope:Host
> UP LOOPBACK RUNNING MTU:16436 Metric:1
> RX packets:14817 errors:0 dropped:0 overruns:0 frame:0
> TX packets:14817 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:0
> RX bytes:7521844 (7.1 MiB) TX bytes:7521844 (7.1 MiB)
>
>
>
>