[mvapich-discuss] hang at large numbers of processors

Justin luitjens at cs.utah.edu
Mon Nov 3 14:17:46 EST 2008


We are running into hangs on Ranger using mvapich that are not present 
on other machines.  These hangs seem to occur only on large problems with 
large numbers of processors.  We have run into similar problems on some 
LLNL machines in the past and were able to get around them by disabling 
the shared memory optimizations.  In those cases the problem had to do 
with the fixed-size buffers used by the shared memory optimizations. 

We would like to disable shared memory on Ranger but are confused by 
all the different parameters dealing with shared memory optimizations.  
How do we know which parameters affect the run?  For example, do we use 
the parameters that begin with MV_ or VIADEV_?  From past conversations 
with support teams, I understand that the parameters that take effect 
vary with the hardware and MPI build.  What is the best way to determine 
which parameters are active?
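
Concretely, here is a minimal sketch of what we had in mind: setting both 
candidate families of variables before MPI_Init so that whichever one the 
installed library actually reads takes effect.  The exact names 
(VIADEV_USE_SHMEM_COLL and MV2_USE_SHMEM_COLL) are a guess on our part and 
would need to be checked against the user guide for the Ranger build; 
normally we would just export them in the job script rather than calling 
setenv in code.

/* Sketch only: the variable names below are assumptions, not confirmed
 * against the Ranger MVAPICH build. */
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    /* MVAPICH1-style name (assumed): disable shared-memory collectives */
    setenv("VIADEV_USE_SHMEM_COLL", "0", 1);
    /* MVAPICH2-style name (assumed): same switch, different prefix */
    setenv("MV2_USE_SHMEM_COLL", "0", 1);

    MPI_Init(&argc, &argv);
    /* ... rest of the application ... */
    MPI_Finalize();
    return 0;
}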

Also, here is a stack trace from one of our hangs:

.stack.i132-112.ranger.tacc.utexas.edu.16033
Intel(R) Debugger for applications running on Intel(R) 64, Version 
10.1-35 , Build 20080310
Attaching to program: 
/work/00975/luitjens/SCIRun/optimized/Packages/Uintah/StandAlone/sus, 
process 16033
Reading symbols from 
/work/00975/luitjens/SCIRun/optimized/Packages/Uintah/StandAlone/sus...(no 
debugging symbols found)...done.
smpi_net_lookup () at mpid_smpi.c:1381
#0  0x00002ada6b4d8510 in smpi_net_lookup () at mpid_smpi.c:1381
#1  0x00002ada6b4d8414 in MPID_SMP_Check_incoming () at mpid_smpi.c:1360
#2  0x00002ada6b4f293c in MPID_DeviceCheck (blocking=7154160) at 
viacheck.c:505
#3  0x00002ada6b4d600b in MPID_RecvComplete (request=0x6d29f0, 
status=0x10, error_code=0x4) at mpid_recv.c:106
#4  0x00002ada6b4fe2f7 in MPI_Waitall (count=7154160, 
array_of_requests=0x10, array_of_statuses=0x4) at waitall.c:190
#5  0x00002ada6b4e6d3c in MPI_Sendrecv (sendbuf=0x6d29f0, sendcount=16, 
sendtype=4, dest=14, sendtag=22045696, recvbuf=0x1506680, recvcount=1, 
recvtype=6, source=2278, recvtag=14, comm=130, status=0x7fff4385028c) at 
sendrecv.c:98
#6  0x00002ada6b4c4d2d in intra_Allreduce (sendbuf=0x6d29f0, 
recvbuf=0x10, count=4, datatype=0xe, op=22045696, comm=0x1506680) at 
intra_fns_new.c:5682
#7  0x00002ada6b4c4516 in intra_shmem_Allreduce (sendbuf=0x6d29f0, 
recvbuf=0x10, count=1, datatype=0xe, op=22045696, comm=0x1506680) at 
intra_fns_new.c:6014
#8  0x00002ada6b48f286 in MPI_Allreduce (sendbuf=0x6d29f0, recvbuf=0x10, 
count=4, datatype=14, op=22045696, comm=22046336) at allreduce.c:83
#9  0x00002ada6a67a4f8 in _ZN6Uintah12MPIScheduler7executeEii () in 
/work/00975/luitjens/SCIRun/optimized/lib/libPackages_Uintah_CCA_Components_Schedulers.so

In this case, which parameter would be the most likely one to adjust in 
order to stop the hang in MPI_Allreduce?
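
For what it is worth, here is a small standalone test we could run at the 
same core count, once as-is and once with the shared memory collective 
parameters disabled, to check whether the hang really lives in the 
shared-memory Allreduce path shown in the trace.  It is only a sketch; the 
loop count and buffer size are arbitrary and chosen to roughly match the 
small-count Allreduce in the backtrace.

#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, i;
    double in[4], out[4];

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    for (i = 0; i < 4; i++)
        in[i] = (double)(rank + i);

    /* Repeat many times: at large process counts a buffer or flow-control
     * problem in the shared-memory collective tends to show up as a hang
     * somewhere inside this loop rather than on the first call. */
    for (i = 0; i < 100000; i++)
        MPI_Allreduce(in, out, 4, MPI_DOUBLE, MPI_SUM, MPI_COMM_WORLD);

    if (rank == 0)
        printf("completed %d allreduces, out[0] = %f\n", i, out[0]);

    MPI_Finalize();
    return 0;
}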

Thanks,
Justin

