[mvapich-discuss] Differing IB interfaces

LEI CHAI chai.15 at osu.edu
Fri Jul 27 14:39:26 EDT 2007


Hi Adam,

An alternative workaround is to plug your IB card into a different PCI slot and see which slot makes Solaris name it "ibd0".
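
To check which ibdN name each node actually ended up with, something along these lines should work; this is just a sketch, assuming Solaris's ifconfig and already-plumbed interfaces:

```shell
#!/bin/sh
# Sketch: list the plumbed IPoIB interfaces (ibdN) on this node.
# Assumes Solaris's ifconfig; run it on each node and compare the names.
/usr/sbin/ifconfig -a | grep "^ibd"
```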

We will be working on a better solution for this kind of "heterogeneous" system.

Lei


----- Original Message -----
From: Shaun Rowland <rowland at cse.ohio-state.edu>
Date: Friday, July 27, 2007 10:32 am
Subject: Re: [mvapich-discuss] Differing IB interfaces

> Lundrigan, Adam wrote:
> > We’re using MVAPICH2 with InfiniBand on a 5-node Sun/Solaris cluster
> > (Sun Fire x4100/x4200), and are having a problem with consistency in
> > the naming of our ibd interfaces.  On the x4100 nodes, the IPoIB
> > interface is ibd0.  However, on the head node (x4200), the interface
> > is ibd2.  We’ve tried everything short of wiping the machine and
> > reinstalling the OS to force one of the two HCAs to have an ibd0,
> > but thus far we have failed.  The only choices Solaris seems to use
> > are ibd2, ibd3, ibd6 and ibd7 (we have 2 cards w/ 4 ports in that
> > node).
> 
> If I understand this correctly, the head node is ibd2 and the rest are
> ibd0? In this case, you can use the machinefile argument to mpiexec to
> set up where each process number is going to run. Once you do that, you
> can use the mpiexec facility for passing specific environment variables
> to each "group" of processes. For example, I have 5 hosts:
> 
> [rowland at s8 mvapich2-0.9.8-install]$ cat hosts
> s8
> s9
> s10
> s11
> s12
> 
> Let's assume that s8 is your head node.  I start mpdboot on all hosts:
> 
> [rowland at s8 mvapich2-0.9.8-install]$ bin/mpdboot -n 5 -f hosts
> [rowland at s8 mvapich2-0.9.8-install]$ bin/mpdtrace
> s8
> s12
> s11
> s10
> s9
> 
> I've decided that I want to run 10 processes (2 on each node). So I
> create the following machinefile file:
> 
> [rowland at s8 mvapich2-0.9.8-install]$ cat machines
> s8:2
> s9:2
> s10:2
> s11:2
> s12:2
> 
> This specifies exactly how many processes run on each host, in order.
> I can use this with mpiexec in the following way:
> 
> [rowland at s8 mvapich2-0.9.8-install]$ bin/mpiexec -machinefile machines \
>     -n 2 -env MV2_DAPL_PROVIDER a ./test.sh : \
>     -n 8 -env MV2_DAPL_PROVIDER b ./test.sh
> MV2_DAPL_PROVIDER = |a| [s8]
> MV2_DAPL_PROVIDER = |b| [s12]
> MV2_DAPL_PROVIDER = |b| [s11]
> MV2_DAPL_PROVIDER = |b| [s11]
> MV2_DAPL_PROVIDER = |b| [s12]
> MV2_DAPL_PROVIDER = |b| [s10]
> MV2_DAPL_PROVIDER = |a| [s8]
> MV2_DAPL_PROVIDER = |b| [s10]
> MV2_DAPL_PROVIDER = |b| [s9]
> MV2_DAPL_PROVIDER = |b| [s9]
> 
> I told mpiexec to use that machinefile to figure out the ordering, then
> I said:
> 
> - for the first 2 processes, set MV2_DAPL_PROVIDER to "a"
> - for the next 8 processes, set MV2_DAPL_PROVIDER to "b"
> 
> For this to work, you have to have the machinefile set up correctly
> (the :2 means the number of processes to run on a host; you can leave
> it out if just running one).  You can't run more processes than the
> machinefile specifies.
> 
> You might be able to try a simple case like this just to see if it
> works (1 process per node):
> 
> [rowland at s8 mvapich2-0.9.8-install]$ cat machines
> s8
> s9
> s10
> s11
> s12
> 
> [rowland at s8 mvapich2-0.9.8-install]$ bin/mpiexec -machinefile machines \
>     -n 1 -env MV2_DAPL_PROVIDER ibd2 ./test.sh : \
>     -n 4 -env MV2_DAPL_PROVIDER ibd0 ./test.sh
> MV2_DAPL_PROVIDER = |ibd0| [s12]
> MV2_DAPL_PROVIDER = |ibd2| [s8]
> MV2_DAPL_PROVIDER = |ibd0| [s11]
> MV2_DAPL_PROVIDER = |ibd0| [s10]
> MV2_DAPL_PROVIDER = |ibd0| [s9]
> 
> Of course, you would use your own test program.  The login shell
> startup files are not useful here, because the mpiexec process manages
> passing the environment variables to the processes it launches.  By
> default, it passes all environment variables in the current
> environment when started.
> 
> -- 
> Shaun Rowland	rowland at cse.ohio-state.edu
> http://www.cse.ohio-state.edu/~rowland/
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> 
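
(The test.sh used in the examples above isn't shown in the thread; judging by the output it prints, a minimal sketch of what it presumably does would be:)

```shell
#!/bin/sh
# Hypothetical test.sh (not shown in the thread): print the
# MV2_DAPL_PROVIDER value this process received and the host it ran on,
# matching the "MV2_DAPL_PROVIDER = |...| [host]" lines above.
echo "MV2_DAPL_PROVIDER = |$MV2_DAPL_PROVIDER| [`hostname`]"
```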



