[mvapich-discuss] Bus error from mpirun_rsh
Dan Kokron
dkokron at toad.net
Sat Nov 8 20:36:40 EST 2008
Below is a minimal trace from a core file that mpirun_rsh dropped
when I tried to run an application on 120 nodes of an SGI ICE
system. The application and MVAPICH were both compiled with the
Intel-10.1.015 compiler suite. Does this look familiar to anyone?
Some system config:
Linux p4fe1 2.6.16.60-0.27schamp-nasa #1 SMP Sat Sep 13 20:37:07 UTC 2008 x86_64 x86_64 x86_64 GNU/Linux
Command line:
mpirun_rsh -ssh -np 480 -hostfile machinefile VIADEV_USE_SHMEM_COLL=0 VIADEV_CLUSTER_SIZE=MEDIUM ./Application.x
which mpirun_rsh
/u/dkokron/play/mvapich-1.1rc1/bin/mpirun_rsh
p4fe1.dkokron 269> gdb -c core.10742 /u/dkokron/play/mvapich-1.1rc1/bin/mpirun_rsh
GNU gdb 6.6
Copyright (C) 2006 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain
conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "x86_64-suse-linux"...
Using host libthread_db library "/lib64/libthread_db.so.1".
Reading symbols from /nasa/intel/cce/10.1.015/lib/libimf.so...done.
Loaded symbols for /nasa/intel/cce/10.1.015/lib/libimf.so
Reading symbols from /lib64/libm.so.6...done.
Loaded symbols for /lib64/libm.so.6
Reading symbols from /lib64/libgcc_s.so.1...done.
Loaded symbols for /lib64/libgcc_s.so.1
Reading symbols from /lib64/libc.so.6...done.
Loaded symbols for /lib64/libc.so.6
Reading symbols from /lib64/libdl.so.2...done.
Loaded symbols for /lib64/libdl.so.2
Reading symbols from /lib64/ld-linux-x86-64.so.2...done.
Loaded symbols for /lib64/ld-linux-x86-64.so.2
Core was generated by `mpirun_rsh -ssh -np 480 -hostfile machinefile VIADEV_USE_SHMEM_COLL=0 VIADEV_CL'.
Program terminated with signal 7, Bus error.
#0 0x0000000000402b15 in main ()
--
Dan Kokron