[mvapich-discuss] collectives fail under mvapich2-1.0 (fwd)
Edmund Sumbar
esumbar at ualberta.ca
Mon Oct 1 15:12:35 EDT 2007
amith rajith mamidala wrote:
> We were able to run the 12 process test for collectives on 3 nodes.
> Can you provide us some details as to how the processes were launched?
> e.g. block or cyclic or any other distribution.
Hi Amith,
I've been running the tests as batch jobs
through Torque/Maui using the mpiexec
program. All parameters are at their
defaults, as far as I know. A typical
job script is:
#!/bin/bash
#PBS -S /bin/bash
#PBS -l nodes=3:ppn=4    # 3 nodes x 4 processors per node = 12 processes
#PBS -l pvmem=1gb        # per-process virtual memory limit
#PBS -W x=QOS:test

test=coll
size=3x4

mpiexec=/usr/local/mpiexec/bin/mpiexec
skampi=/scratch/esumbar/mpi-test.d/skampi/mvapich/skampi-5.0.1-r0191/skampi

# run the SKaMPI collectives suite from the job's submission directory
cd $PBS_O_WORKDIR
$mpiexec $skampi -i ${test}.ski -o ${test}_ib-${size}.sko
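To answer the block-vs-cyclic question more
directly: I believe mpiexec takes its node list
from Torque, so block placement would put ranks
0-3 on the first node, 4-7 on the second, and
8-11 on the third, while cyclic would interleave
them. I could verify with something like the
following (assuming OSC mpiexec's --comm=none
option for launching non-MPI executables; I
haven't double-checked that flag against 0.81):

# one line of output per process; the pattern of
# hostnames shows whether placement is block or cyclic
$mpiexec --comm=none /bin/hostname

Would that be worth running?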
Mpiexec details...
$ /usr/local/mpiexec/bin/mpiexec --version
Version 0.81, configure options: '--with-pbs=/opt/torque'
'--with-default-comm=mpich2-pmi' '--prefix=/usr/local/mpiexec'
System...
$ uname -a
Linux 2.6.21-smp #1 SMP Tue Aug 7 12:45:20 MDT 2007 GNU/Linux
Routing table...
$ netstat -r
Kernel IP routing table
Destination      Gateway          Genmask          Flags  MSS  Window  irtt  Iface
255.255.255.255  *                255.255.255.255  UH     0    0       0     eth0
10.0.6.0         *                255.255.255.0    U      0    0       0     ib0
129.128.125.0    *                255.255.255.0    U      0    0       0     eth1
192.168.44.0     *                255.255.255.0    U      0    0       0     vmnet8
192.168.43.0     *                255.255.255.0    U      0    0       0     vmnet1
10.0.0.0         *                255.255.0.0      U      0    0       0     eth0
224.0.0.0        *                240.0.0.0        U      0    0       0     eth0
default          gateway.nic.ual  0.0.0.0          UG     0    0       0     eth1
Please let me know if you need further info.
Is there a diagnostic mode that can be enabled?
Could there be some MVAPICH2 parameter that
needs adjusting from its default value?
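For example, would disabling shared-memory
collectives be a worthwhile experiment? A guess
at how that would look in the job script
(assuming MV2_USE_SHMEM_COLL is the right knob
in 1.0; I haven't verified the name against the
docs):

export MV2_USE_SHMEM_COLL=0   # tentatively: fall back to non-shmem collectives
$mpiexec $skampi -i ${test}.ski -o ${test}_ib-${size}.sko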
--
Ed[mund [Sumbar]]
AICT Research Support, Univ of Alberta