[mvapich-discuss] help: Poll CQ failed!

Jeff Haferman jeff at haferman.com
Wed Dec 30 21:21:33 EST 2009


Well, we updated to OFED 1.4.1-4, and tried the mvapich (1.1.0) and 
openmpi (1.2.8) supplied with it.  I'm still seeing the same problems.

With mvapich I do
mpirun -np 16 -machinefile ./hostfile.ib ./cpi
which SOMETIMES bombs with
Abort signaled by rank 12: Exit code -3 signaled from compute-ib-1-1
Killing remote processes...[compute-1-1.local:12] Got error polling CQ

with openmpi I do
mpirun --mca btl openib,self -np 16 --hostfile hostfile.orte ./cpi
which SOMETIMES bombs with
[compute-ib-1-0][0,1,1][btl_openib_component.c:1357:btl_openib_component_progress]
error polling HP CQ with -2 errno says Success

These machines have Ethernet interfaces, and I can run mpich / openmpi
fine over them.  Sometimes the IB runs work: they always work if I run
on a single node (each has 8 cores).  Between 2 IB nodes a run usually
works but sometimes bombs, and with more than 2 nodes it usually does
not work.  I've noticed that my first run of the day will usually
succeed, which suggests that something is not being cleaned up between
runs, but "ipcs" shows everything to be clean.
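
For reference, here is roughly how I check for leftover state after a
failed run (a sketch; the node names are from my hostfile, so adjust
accordingly):

  # look for stale SysV shared-memory segments and orphaned cpi processes
  for h in compute-ib-1-0 compute-ib-1-1; do
      echo "=== $h ==="
      ssh $h 'ipcs -m; ps -ef | grep [c]pi'
  done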

As I said, with the new OFED I used the MPI stacks provided with it.
I'll try compiling my own tomorrow, but in the meantime any ideas would
be appreciated.

This is new IB hardware from Sun; the ibdiagnet tests don't show any
problems, but I don't quite know what to do from here.
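
In case it helps, these are the diagnostics available to me here (all
stock OFED tools; ibdiagnet is the one that comes back clean):

  ibdiagnet     # fabric-wide sweep; shows no problems
  ibstat        # port state and link rate on each HCA
  ibv_devinfo   # verbs-level view of the adapters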

Jeff


Dhabaleswar Panda wrote:
> Let us know what you observe with the mvapich 1.1 branch version.
> 
> I also notice that you are using OFED 1.3.1, which is an older version.
> Since you are using a QDR switch (with DDR adapters), you may want to update
> your system to the latest OFED 1.4.* version. The GA release of OFED 1.5
> will also be coming out soon, and you can use that as well.
> 
> DK
> 
> On Tue, 29 Dec 2009, Jeff Haferman wrote:
> 
>> Hi DK
>>
>> All of our hardware is from Sun (and to my knowledge is manufactured by
>> Mellanox):
>> Switch:  part #X2821A-Z  36-port QDR switch
>> HCAS: part #X4216A-Z dual-port DDR PCI-E IB HCA
>>
>> Network cables are definitely connected tightly, we've double-checked this.
>> OFED version is 1.3.1.
>> The Ethernet connections have their own separate NICs; they are on the 172
>> subnet, while the IB interfaces are on the 10 subnet. We've been running
>> other MPI stacks over Ethernet for a year and do most of our work over the
>> Ethernet interfaces, so I feel pretty good about that.  We've also been
>> running Lustre over IB, and it seems to be working.
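>>
>> In case it helps isolate things below the MPI layer, here is a raw
>> verbs-level test I can run between two nodes using the stock OFED
>> ibv_rc_pingpong tool (a sketch; the hostname is one of ours):
>>
>>   # on the first node, start the server side:
>>   ibv_rc_pingpong
>>   # on the second node, point the client at the first:
>>   ibv_rc_pingpong compute-ib-1-0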
>>
>> My configure line for MVAPICH2 looks like:
>> ./configure --with-rdma=gen2 --with-arch=LINUX -prefix=${PREFIX} \
>>        --enable-cxx --enable-debug \
>>        --enable-devdebug \
>>        --enable-f77 --enable-f90 \
>>        --enable-romio \
>>        --with-file-system=lustre+nfs \
>>        --with-link=DDR
>>
>> and I also have set the following in my environment:
>> export CC=pgcc
>> export CXX=pgCC
>> export F77=pgf90
>> export F90=pgf90
>> export RSHCOMMAND=ssh
>>
>>
>> Any ideas?  I will try MVAPICH 1.1.1 later today but perhaps you see
>> something obvious in my configuration.
>>
>> Jeff
>>
>>
>> On Mon, Dec 28, 2009 at 10:27 PM, Dhabaleswar Panda <
>> panda at cse.ohio-state.edu> wrote:
>>
>> > Hi Jeff,
>> >
>> > Thanks for your report. This seems to be some kind of
>> > systems-related/set-up issue. Could you let us know what kind of adapters
>> > and switch you are using? Are all the network cables connected tightly?
>> > Which OFED version is being used? How are your Ethernet connections set up
>> > for the nodes in the cluster? The mpirun_rsh job-startup framework uses
>> > TCP/IP initially to set up connections.
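>> >
>> > A quick way to sanity-check that TCP/ssh path (a sketch; substitute your
>> > own hostfile) is to confirm that every host answers without a password
>> > prompt:
>> >
>> >   for h in $(cat hostfile.16); do ssh -o BatchMode=yes $h hostname; done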
>> >
>> > Also, are you configuring mvapich and mvapich2 properly, i.e., using the
>> > OpenFabrics-Gen2 interface? The mvapich 1.0.1 stack is very old. Please use
>> > the latest MVAPICH 1.1 branch version:
>> >
>> > http://mvapich.cse.ohio-state.edu/nightly/mvapich/branches/1.1/
>> >
>> > If you have multi-core nodes (say 8/16 cores per node), you can try
>> > running mvapich1 or mvapich2 with 8/16 MPI processes on a given node
>> > first. Then you can try running the same number of MPI processes across
>> > multiple nodes (say 8 MPI processes on 8 nodes, with 1 process per
>> > node). Then you can run experiments involving multiple cores and
>> > nodes. Running such separate tests will help you to isolate the problem on
>> > your set-up and correct it.
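>> >
>> > As a concrete sketch (the hostfile names here are made up; arrange each
>> > file so the process placement matches the case being tested):
>> >
>> >   # 8 MPI processes, all on one node
>> >   mpirun_rsh -ssh -np 8 -hostfile ./hosts.one-node ./cpi
>> >   # 8 MPI processes across 8 nodes, 1 process per node
>> >   mpirun_rsh -ssh -np 8 -hostfile ./hosts.spread ./cpi
>> >   # 16 MPI processes across 2 nodes, 8 per node
>> >   mpirun_rsh -ssh -np 16 -hostfile ./hosts.2x8 ./cpi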
>> >
>> > Thanks,
>> >
>> > DK
>> >
>> >
>> > On Mon, 28 Dec 2009, Jeff Haferman wrote:
>> >
>> > >
>> > > I've built four mvapich 1.0.1 stacks (PGI, GNU, Intel, Sun) and one
>> > > MVAPICH2 1.4 stack (PGI), and I'm getting the same problem with all of
>> > > them just running the simple "cpi" test:
>> > >
>> > > With mvapich1:
>> > > mpirun -np 16 -machinefile ./hostfile.16 ./cpi
>> > > Abort signaled by rank 6: Error polling CQ
>> > > MPI process terminated unexpectedly
>> > > Signal 15 received.
>> > > DONE
>> > >
>> > > With mvapich2:
>> > > mpirun_rsh -ssh -np 3 -hostfile ./hostfile.16 ./cpi
>> > > Fatal error in MPI_Init:
>> > > Internal MPI error!, error stack:
>> > > MPIR_Init_thread(311).........: Initialization failed
>> > > MPID_Init(191)................: channel initialization failed
>> > > MPIDI_CH3_Init(163)...........:
>> > > MPIDI_CH3I_RDMA_init(190).....:
>> > > rdma_ring_based_allgather(545): Poll CQ failed!
>> > >
>> > >
>> > > The INTERESTING thing is that sometimes these run successfully!  They
>> > > almost always run with 2-4 processors, but generally fail with more than
>> > > 4 processors (and my hostfile is set up to ensure that the processes are
>> > > on physically separate nodes).  Today I've actually had a hard time
>> > > getting mvapich1 to fail with any number of processors.
>> > >
>> > > The ibdiagnet tests show no problems.
>> > >
>> > > Where do I go from here?
>> > >
>> > > Jeff
>> > >


