[mvapich-discuss] mvapich2 1.6 cannot run job on many nodes

worldeb at ukr.net worldeb at ukr.net
Fri Jul 15 06:39:19 EDT 2011


Hi again,

it seems I have found the problem, or at least localized it.
There is one node which, whenever I submit a job that includes it, produces these errors.
With the cpi code it happens when this node is part of a list of more than 8 nodes.
For other codes it happens even with just two nodes (both with mvapich2 1.6 and with 1.0.3).
All codes work fine when a job runs only on this problem node.
                                                                                                                                      
I tried to test this node and the network with the OSU benchmarks, for example osu_bibw:
                                                                                                                                      
mvapich2-1.6 passed without any problems.
mvapich2-1.0.3 with the OSU mpiexec & torque showed these errors:
send desc error                                                                                                                       
[0] Abort: [] Got completion with error 9, vendor code=8a, dest rank=1                                                                
at line 519 in file ibv_channel_manager.c                                                                                             
[1] Abort: Got FATAL event 3                                                                                                          
at line 796 in file ibv_channel_manager.c                                                                                             
                                                                                                                                      
mvapich1-1.0.1 passed without any errors                                                                                              
openmpi-1.4.3 with -mca btl self,openib passed without errors.                                                                        
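
For reference, the comparison runs were roughly of this form (the hostnames here are placeholders: "badnode" is the suspect node, "goodnode" a known-good one):

# mvapich2-1.6, launched directly with mpirun_rsh, one process per node
/usr/mpi/gcc/mvapich2-1.6.0/bin/mpirun_rsh -np 2 badnode goodnode ./osu_bibw

# mvapich2-1.0.3, through torque and the OSU mpiexec, inside an interactive job
qsub -I -l nodes=badnode:ppn=1+goodnode:ppn=1
mpiexec -n 2 ./osu_bibw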
                                                                                                                                      
So, is it an IB problem? If so, why does it happen only with mvapich2?
                                                                                                                                      
I tested the IB card on this node with the standard ib_rdma/read/send_bw/lat tools and it seems to work.
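
In case it is useful, those tests were run in the usual two-sided way, e.g. (hostname again a placeholder):

# on the problem node (server side)
ib_send_bw
# on another node (client side)
ib_send_bw badnode

and the same pattern for the ib_*_lat and the read/rdma variants.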
                                                                                                                                      
Thanks,
Egor.

> I have 8 cores per node. Half of the nodes have 16GB RAM, the other half have 32GB.
> The CPUs are:
> Intel(R) Xeon(R) E5410 @ 2.33GHz
> Intel(R) Xeon(R) E5472 @ 3.00GHz
> Intel(R) Xeon(R) E5620 @ 2.40GHz
> OFED 1.3.1-rc2 and CentOS 5 with kernel 2.6.18-53.1.21.el5.
> 
> ulimit -a on all the nodes:
> core file size (blocks, -c) 0
> data seg size (kbytes, -d) unlimited
> max nice (-e) 0
> file size (blocks, -f) unlimited
> pending signals (-i) 139264
> max locked memory (kbytes, -l) unlimited
> max memory size (kbytes, -m) unlimited
> open files (-n) 1024
> pipe size (512 bytes, -p) 8
> POSIX message queues (bytes, -q) 819200
> max rt priority (-r) 0
> stack size (kbytes, -s) 10240
> cpu time (seconds, -t) unlimited
> max user processes (-u) 139264
> virtual memory (kbytes, -v) unlimited
> file locks (-x) unlimited
> 
> I have the same problem with both the gcc and the intel 10.1 compilers.
> 
> Thanks,
> Egor.
> 
> > I've used the same configuration options but I have not been
> > able to reproduce this problem. I've used a varying number of cores
> > (focusing on 321 and 512 cores), while running cpi and osu_mbw_mr with
> > mpirun_rsh and hydra (mpiexec). Perhaps there is some missing
> > information I need to reproduce this. How many cores per machine are
> > you using? Perhaps a certain machine triggers the problem. Can you
> > tell us what cpu and how much memory each machine has? Thanks in
> > advance.
> > 
> > 2011/7/14 <worldeb at ukr.net>:
> > >
> > > Hi folks,
> > >
> > > mvapich2-1.6-r4751
> > > gcc (GCC) 4.1.2 20070626 (Red Hat 4.1.2-14)
> > > InfiniBand: Mellanox Technologies MT25204
> > > torque 2.1.8
> > >
> > > ./configure --prefix=/usr/mpi/gcc/mvapich2-1.6.0 --enable-f77 --enable-f90 --enable-cxx --enable-debuginfo --enable-smpcoll --enable-async-progress --enable-threads=default --with-hwloc --with-device=ch3:nemesis:ib --enable-sharedlibs=gcc --enable-romio
> > >
> > > I cannot run jobs on many nodes (for example >320 cores), whether I use the batch system with the OSU mpiexec or the native mpiexec, or submit them directly with mpiexec.hydra or mpirun_rsh.
> > > Actually this number of 320 cores is not fixed; it changes from time to time, but mpirun_rsh does submit jobs successfully on fewer nodes.
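> > >
> > > The direct launch is of this form (same hostfile as below; the exact core count here is just an example):
> > >
> > > mpirun_rsh -np 321 -hostfile HOSTFILE ./test_mvapich2_gcc-1.6.0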
> > >
> > > I am only testing with simple codes like a "hello world" on each cpu, or even with cpi from the examples, or the osu_benchmarks.
> > >
> > > Errors are like:
> > >
> > > mpiexec.hydra -n 321 -f HOSTFILE ./test_mvapich2_gcc-1.6.0
> > >
> > > Fatal error in MPI_Init: Internal MPI error!, error stack:
> > > MPIR_Init_thread(413): Initialization failed
> > > (unknown)(): Internal MPI error!
> > > =====================================================================================
> > > = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> > > = EXIT CODE: 256
> > > = CLEANING UP REMAINING PROCESSES
> > > = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
> > > =====================================================================================
> > > [proxy:0:0 at node01] HYD_pmcd_pmip_control_cmd_cb (./pm/pmiserv/pmip_cb.c:906): assert (!closed) failed
> > > [proxy:0:0 at node01] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> > > [proxy:0:0 at node01].ac.at] main (./pm/pmiserv/pmip.c:214): demux engine error waiting for event
> > > [mpiexec at head] HYDT_bscu_wait_for_completion (./tools/bootstrap/utils/bscu_wait.c:70): one of the processes terminated badly; aborting
> > > [mpiexec at head] HYDT_bsci_wait_for_completion (./tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
> > > [mpiexec at head] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:199): launcher returned error waiting for completion
> > > [mpiexec at head] main (./ui/mpich/mpiexec.c:385): process manager error waiting for completion
> > >
> > >
> > > I have no problem with the same codes compiled with the latest openmpi with IB and run on all the nodes.
> > >
> > > Any suggestions as to what the problem is and how to solve it?

