[mvapich-discuss] not being very graceful about running out of shared memory

Jonathan Perkins perkinjo at cse.ohio-state.edu
Mon Dec 13 15:42:33 EST 2010


Hi Mark.  Thanks for your note.  We are aware of this current
limitation and plan to add a graceful error message in a future release.
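
Until then, a pre-flight check before launching the job may help.  A
Bus error is what one would generally expect when a process touches a
page of a mapped /dev/shm file that tmpfs has no room to back.  Below
is a minimal sketch of such a check, assuming Python is available on
the compute nodes.  The function names and the 64 MiB figure in the
note are illustrative placeholders, not values taken from MVAPICH2.

```python
import os


def free_bytes(path="/dev/shm"):
    """Bytes available to an unprivileged user on the filesystem holding `path`."""
    st = os.statvfs(path)
    # f_bavail counts free blocks usable by non-root; f_frsize is the block size.
    return st.f_bavail * st.f_frsize


def has_room(required_bytes, path="/dev/shm"):
    """True if at least `required_bytes` are free on the filesystem holding `path`."""
    return free_bytes(path) >= required_bytes
```

A job script could call has_room(64 * 1024 * 1024) on each node and
print a clear message instead of launching when it returns False.  The
check is only advisory, since another process can still fill /dev/shm
between the check and the start of the job.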

On Mon, Dec 13, 2010 at 2:39 PM, Seger, Mark <mark.seger at hp.com> wrote:
> I have a very simple ‘hello world’ program that runs as a parallel job.  If
> I allocate all of /dev/shm before running it, then rather than getting some
> sort of graceful error message I see the following in my job log:
>
> + srun --mpi=none /mscf/home/dmlb2000/mpi-tests/mvapich2/1.5.1p1/pathscale/3.2/hello
> srun: error: cu01n81: tasks 8-15: Bus error
> srun: error: cu01n82: tasks 16-23: Bus error
> srun: error: cu01n83: tasks 24-31: Bus error
> srun: error: cu01n80: tasks 0-7: Bus error
>
> Is this a known problem?  Something that will be addressed in a future
> release?
>
> As you can see, the test above was with mvapich2/1.5.1p1, but the app has
> also been built and run against 1.6rc1 and failed the same way.
>
>
>
> -mark
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


