[mvapich-discuss] collectives fail under mvapich2-1.0 (fwd)
Edmund Sumbar
esumbar at ualberta.ca
Thu Sep 27 10:27:49 EDT 2007
Thanks for your prompt reply, Wei...
wei huang wrote:
> We wonder how many processes you carried out the tests on. Are you using
> all 8 processors when running the coll tests and SKaMPI tests?
The tests passed for the 2- and 4-CPU cases within a node.
They stall as soon as I go between nodes, for example,
when using one CPU on each of two nodes.
Our nodes run Linux kernel 2.6.21-smp, with the
associated InfiniBand driver modules installed. Please
let me know what other information you might need to
investigate this issue.
>> I'm trying to confirm my mvapich2-1.0 installation by running
>> SKaMPI (pt2pt, coll, onesided), Intel MPI Benchmark (MPI1 only)
>> and the mvapich2 coll tests.
>>
>> Running the mvapich2 coll tests in alphabetical order, I find
>> that the suite stalls on "icallgather": there is no output after
>> several minutes, and the next test is not run. The coll and
>> onesided SKaMPI tests also stall. The IMB-MPI1 tests pass,
>> however.
>>
>> MVAPICH2-1.0 was compiled using make.mvapich2.ofa (gcc 4.2.0).
>> The tests were run under Torque using mpiexec-0.81 between two
>> nodes (dual-socket, dual-core Opterons).
>>
>> for test in allgather2 allgather3 ...; do
>> mpiexec $test >out 2>err
>> done
>>
>> No problems when run within a node.
>>
>> Any ideas?
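As a side note, since a single stalled test currently hangs the whole
loop, the quoted test loop could be made more robust by giving each
test its own log files and a timeout. The sketch below assumes GNU
coreutils `timeout` is available; the `sleep 1` is a stand-in to keep
the sketch self-contained, and would be replaced by
`mpiexec ./$test` in the actual Torque job script:

```shell
#!/bin/sh
# Hypothetical, more robust test loop: per-test logs plus a timeout,
# so one stalled collective test (e.g. icallgather) does not block
# the remaining tests. Assumes GNU coreutils `timeout`.
for test in allgather2 allgather3 icallgather; do
    # In the real job script this would be: timeout 300 mpiexec ./$test
    if timeout 300 sleep 1 >"$test.out" 2>"$test.err"; then
        echo "$test: ok"
    else
        echo "$test: failed or timed out"
    fi
done
```

With this pattern, `timeout` exits nonzero when the five-minute limit
is hit, so the loop reports the stalled test and moves on.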
--
Ed[mund [Sumbar]]
AICT Research Support, Univ of Alberta