[mvapich-discuss] Dual port HCA back-to-back woes

Dr. Dmitry Nolde nolde at nmr.ru
Wed Apr 18 12:31:58 EDT 2007


Hi,

Thanks for a very nice program.

We have used mvapich version 0.9.5 on a 24-node cluster (2 single-core 
Opterons and 1 Mellanox InfiniBand adapter per node, plus 1 Mellanox 
switch) for 2 years without any problems.

Now I am trying to build a new cluster based on nodes with two 
dual-core Opterons. I tried to build it without an IB switch, using a 
ring topology for the IB network (direct port-to-port connections 
between HCAs). I understand that in this case I can use only 2 
neighboring nodes to run one program. mvapich version 0.9.9-beta2 is 
the only program that can handle this topology (thanks to the authors 
for allowing users to specify the adapter and port explicitly). 
Unfortunately, for our program (molecular dynamics simulation) 1 port 
is sufficient for 2 processes per node (our old single-core Opteron 
cluster) but is not enough for 4 processes per node (the new cluster). 
In that case we would need to buy 2 IB switches to use both ports 
simultaneously.

Is it possible to connect 2 nodes twice, port to port? I read the 
discussion on this list but did not understand it. I do not see any 
reason why this configuration would not work. However, I tried both the 
single-rail gen2 configuration and the multi-rail gen2 configuration 
without success. Maybe I need to specify the LID routing table 
manually. Does anybody have any ideas about this?

Sincerely
Dmitry E. Nolde
Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry,
Moscow Russia


