[mvapich-discuss] not being very graceful about running out of shared memory

Seger, Mark mark.seger at hp.com
Mon Dec 13 14:39:32 EST 2010


I have a very simple 'hello world' program that runs as a parallel job.  If I allocate all of /dev/shm before running it, then rather than getting some sort of graceful error message I see the following in my job log:

+ srun --mpi=none /mscf/home/dmlb2000/mpi-tests/mvapich2/1.5.1p1/pathscale/3.2/hello
srun: error: cu01n81: tasks 8-15: Bus error
srun: error: cu01n82: tasks 16-23: Bus error
srun: error: cu01n83: tasks 24-31: Bus error
srun: error: cu01n80: tasks 0-7: Bus error

Is this a known problem?  Something that will be addressed in a future release?
As you can see, the test above was with mvapich2/1.5.1p1, but the app has also been built and run against 1.6rc1 and failed the same way.
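
For context, a bus error is what Linux delivers when a process touches a page of a mapped tmpfs file and no page can be allocated for it. The sketch below is only an illustration of what a more graceful failure could look like, assuming (and this is an assumption, not something confirmed from the mvapich2 source) that the shared-memory file is sized and then mapped. The function name reserve_shared_segment and the error text are hypothetical. It reserves the space up front with posix_fallocate, so a full filesystem shows up as a reportable error at setup time rather than as a SIGBUS later on.

```python
import errno
import mmap
import os


def reserve_shared_segment(path, size):
    """Create a file of `size` bytes at `path`, reserve its space, and map it.

    Hypothetical helper: reserving the space before mapping means a full
    filesystem is reported here as an exception instead of surfacing
    later as SIGBUS on the first write to an unbacked page.
    """
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # Unlike ftruncate(), posix_fallocate() allocates the space now
        # and fails with ENOSPC if the filesystem cannot provide it.
        os.posix_fallocate(fd, 0, size)
        # mmap duplicates the descriptor, so closing fd below is safe.
        return mmap.mmap(fd, size)
    except OSError as exc:
        os.unlink(path)  # do not leave a partial segment behind
        if exc.errno == errno.ENOSPC:
            raise RuntimeError(
                "not enough space for a %d-byte shared segment at %s"
                % (size, path)
            ) from exc
        raise
    finally:
        os.close(fd)
```

Pointing the path at a file under /dev/shm would exercise the case described above; any other OSError is re-raised unchanged so it is not mistaken for an out-of-space condition.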

-mark
