[mvapich-discuss] mvapich2-0.9.8 blacs problems
Bas van der Vlies
basv at sara.nl
Thu Mar 22 05:55:50 EDT 2007
Forgot to attach the files :-(
Bas van der Vlies wrote:
> Hello,
>
> We have made two simpler programs that do not use scalapack/blacs
> and show the same behavior. See the attachments.
>
> Here are the results:
> mvapich 0.9.8: no problems
> mvapich 0.9.9 trunk: see below for errors
> mvapich2 0.9.8: see below for errors
>
>
> Regards, and hope this helps with diagnosing the problem.
>
> =====================================================================
> mvapich 0.9.9 trunk:
> duptest:
> ====================================================
> Running with 8 processes
> will do 100000 dups and frees
> ............................................0 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> [0] [] Aborting Program!
> 4 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 2 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 6 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> mpirun_rsh: Abort signaled from [0]
> [4] [] Aborting Program!
> [2] [] Aborting Program!
> [6] [] Aborting Program!
> done.
> ====================================================
>
> splittest:
> ====================================================
> bas at ib-r21n1:~/src/applications$ mpirun -np 8 ./a.out
>
> Running with 8 processes
> will do 100000 splits and frees
> ......................................0 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> [0] [] Aborting Program!
> 6 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 2 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 4 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> mpirun_rsh: Abort signaled from [0]
> [6] [] Aborting Program!
> [2] [] Aborting Program!
> [4] [] Aborting Program!
> done.
> ====================================================
>
>
> mvapich2 0.9.8:
>
> duptest:
> ====================================================
> bas at ib-r21n1:~/src/applications$ mpiexec -n $nprocs ./a.out
> Running with 8 processes
> will do 100000 dups and frees
> .Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=0, key=0, new_comm=0xb7f4d8a4) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=1, key=1, new_comm=0xb7f7a7bc) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=2, key=0, new_comm=0xb7f778a4) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=2, key=1, new_comm=0xb7ecf7bc) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=3, key=0, new_comm=0xb7f398a4) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=3, key=1, new_comm=0xb7f447bc) failed
> MPIR_Comm_create(90): Too many communicators
> rank 7 in job 1 ib-r21n1.irc.sara.nl_8763 caused collective abort of all ranks
> exit status of rank 7: killed by signal 9
> rank 6 in job 1 ib-r21n1.irc.sara.nl_8763 caused collective abort of all ranks
> exit status of rank 6: killed by signal 9
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=1, key=0, new_comm=0xb7f708a4) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=0, key=1, new_comm=0xb7f4d7bc) failed
> MPIR_Comm_create(90): Too many communicators
> rank 5 in job 1 ib-r21n1.irc.sara.nl_8763 caused collective abort of all ranks
> exit status of rank 5: return code 13
> rank 4 in job 1 ib-r21n1.irc.sara.nl_8763 caused collective abort of all ranks
> exit status of rank 4: killed by signal 9
> ====================================================
>
> splittest:
> ====================================================
> Running with 8 processes
> will do 100000 splits and frees
> .Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=0, key=0, new_comm=0xb7f2b8a4) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=0, key=0, new_comm=0xb7f258a4) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=1, key=0, new_comm=0xb7f168a4) failed
> MPIR_Comm_create(90): Too many communicators
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=1, key=0, new_comm=0xb7f328a4) failed
> MPIR_Comm_create(90): Too many communicators
> rank 2 in job 3 ib-r21n1.irc.sara.nl_8763 caused collective abort of all ranks
> exit status of rank 2: killed by signal 9
> rank 1 in job 3 ib-r21n1.irc.sara.nl_8763 caused collective abort of all ranks
> exit status of rank 1: killed by signal 9
> rank 0 in job 3 ib-r21n1.irc.sara.nl_8763 caused collective abort of all ranks
> exit status of rank 0: killed by signal 9
> ====================================================
--
********************************************************************
* *
* Bas van der Vlies e-mail: basv at sara.nl *
* SARA - Academic Computing Services phone: +31 20 592 8012 *
* Kruislaan 415 fax: +31 20 6683167 *
* 1098 SJ Amsterdam *
* *
********************************************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: duptest.c
Type: text/x-csrc
Size: 597 bytes
Desc: not available
URL: http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20070322/29f3a218/duptest.bin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: splittest.c
Type: text/x-csrc
Size: 615 bytes
Desc: not available
URL: http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20070322/29f3a218/splittest.bin