[mvapich-discuss] how mvapich2 works on mips

Wang Xiyue zerocain at gmail.com
Fri Dec 23 02:56:16 EST 2011


Hi all,
    I have run into some problems using mvapich2 on a MIPS machine.
Any suggestions are welcome.

     The CPUs are based on the MIPS architecture. Each board has 2 CPUs, and
each CPU has 4 cores.
     Each CPU uses 2 memory sticks (DDR2, 2 GB each), so each node (one CPU
with its memory) has 4 GB and the board has 8 GB in total.
     We run a 2.6.36 kernel with some modifications of our own, and the HCA
cards we use are mlx4.
     We use OFED-1.5.3-rc2 for userspace, but because our OS is Debian we
cannot install OFED with its installation script. libibverbs, librdmacm,
libibumad, libmlx4 and libibcm were taken from OFED-1.5.3-rc2; libibcommon
is version 1.1.2, and I got its source code with apt-get source.
     Oh, I almost forgot: we use mvapich2-1.6.
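
     To make the hardware numbers above explicit, here is a small sketch of
the arithmetic. The value boards=2 is an assumption (one board per host
listed in ./host); everything else is taken from the description.

```shell
#!/bin/sh
# Topology arithmetic for the boards described above.
cpus_per_board=2
cores_per_cpu=4
sticks_per_cpu=2
gb_per_stick=2
boards=2   # assumption: one board per host in ./host (inode1, inode2)

cores_per_board=$((cpus_per_board * cores_per_cpu))
gb_per_node=$((sticks_per_cpu * gb_per_stick))
gb_per_board=$((cpus_per_board * gb_per_node))
total_cores=$((boards * cores_per_board))

echo "cores per board: $cores_per_board"
echo "memory per node: ${gb_per_node} GB"
echo "memory per board: ${gb_per_board} GB"
echo "cores over $boards boards: $total_cores"
```

     Under that assumption, 16 processes would be one process per core.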

     Now here are the questions:
     1. When we use SMP, MPI works fine with mpiexec:
         mpiexec -n 16 ./cpi
         but with mpirun_rsh, cpi hangs in MPI_Init:
         mpirun_rsh -np 16 -hostfile ./host ./cpi

         cat ./host:
         inode1
         inode2

         Even when I reduce the number of processes to 2, mpirun_rsh still hangs.
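
         One thing that could be ruled out first is the launcher's access
to the hosts. The sketch below assumes that mpirun_rsh starts remote
processes over ssh and that the hostfile holds one hostname per line; the
function names list_hosts and check_hosts are made up for illustration.
It tells us nothing about the MPI_Init hang itself, only whether
non-interactive logins work.

```shell
#!/bin/sh
# list_hosts FILE: print one hostname per line, dropping '#' comments,
# surrounding whitespace and blank lines (assumed hostfile conventions).
list_hosts() {
    sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e 's/^[[:space:]]*//' "$1" \
        | grep -v '^$'
}

# check_hosts FILE: try a non-interactive ssh login to every host.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_hosts() {
    rc=0
    for h in $(list_hosts "$1"); do
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$h" true </dev/null; then
            echo "$h: ok"
        else
            echo "$h: ssh failed"
            rc=1
        fi
    done
    return $rc
}

# Usage (run on the node that launches the job):
#   check_hosts ./host
```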

     2. When we use NUMA, MPI sometimes works fine, but mostly it does
not: only about one run in thirty succeeds.
         There are 2 failure modes:
         (1) it hangs in MPI_Init with mpiexec;
         (2) it reports that polling the CQ failed:
                 rdma_ring_based_allgather: Poll CQ failed
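
         To put a number on "about one in thirty", the run can be repeated
under a time limit. This is only a sketch: it assumes the timeout command
from GNU coreutils is installed, the name run_trials is made up, and the
60-second limit in the usage line is an arbitrary choice. A hang is counted
as a failure once the limit expires.

```shell
#!/bin/sh
# run_trials N SECS CMD...: run CMD N times, each limited to SECS seconds,
# and print how many runs exited with status 0.
run_trials() {
    n=$1
    limit=$2
    shift 2
    ok=0
    i=0
    while [ "$i" -lt "$n" ]; do
        # timeout exits non-zero if CMD fails or is killed at the limit
        if timeout "$limit" "$@" >/dev/null 2>&1; then
            ok=$((ok + 1))
        fi
        i=$((i + 1))
    done
    echo "$ok/$n runs succeeded"
}

# Usage:
#   run_trials 30 60 mpiexec -n 16 ./cpi
```

         After a timed-out run, leftover cpi processes may have to be
cleaned up by hand before the next trial means anything.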

         I compiled MPI with:
         ./configure --libdir=/usr -with-rdma=gen2 -with-ib-libpath=/usr \
             --enable-g=dbg --enable-fast=none --disable-cxx \
             CFLAGS="-Wall -g"


         Any suggestions?
         Thanks,
                                                                         
                                                                 Celia.Wang

