[mvapich-discuss] mvapich2-0.9.8 blacs problems

Bas van der Vlies basv at sara.nl
Thu Mar 22 05:55:50 EDT 2007



Forgot to attach the files :-(


Bas van der Vlies wrote:
> Hello,
> 
>  We have made two simpler programs that do not use scalapack/blacs
> and also show the same behavior. See the attachments (sketched below,
> since the archive scrubbed them).
> 
> Here are the results:
>  mvapich 0.9.8: No problems
>  mvapich 0.9.9 trunk: see below for errors
>  mvapich2 0.9.8: see below for errors
> 
> 
> Regards, and I hope this helps with diagnosing the problem.
> 
> =====================================================================
>  mvapich 0.9.9 trunk:
> duptest:
> ====================================================
> Running with 8 processes
> will do 100000 dups and frees
> ............................................0 - <NO ERROR MESSAGE> : 
> Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> [0] [] Aborting Program!
> 4 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 2 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 6 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> mpirun_rsh: Abort signaled from [0]
> [4] [] Aborting Program!
> [2] [] Aborting Program!
> [6] [] Aborting Program!
> done.
> ====================================================
> 
> splittest:
> ====================================================
> bas at ib-r21n1:~/src/applications$ mpirun -np 8 ./a.out
> 
> Running with 8 processes
> will do 100000 splits and frees
> ......................................0 - <NO ERROR MESSAGE> : Pointer 
> conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> [0] [] Aborting Program!
> 6 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 2 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> 4 - <NO ERROR MESSAGE> : Pointer conversions exhausted
> Too many MPI objects may have been passed to/from Fortran
> without being freed
> mpirun_rsh: Abort signaled from [0]
> [6] [] Aborting Program!
> [2] [] Aborting Program!
> [4] [] Aborting Program!
> done.
> ====================================================
> 
> 
> mvapich2 0.9.8:
> 
> duptest:
> ====================================================
> bas at ib-r21n1:~/src/applications$ mpiexec -n $nprocs ./a.out
> Running with 8 processes
> will do 100000 dups and frees
> .Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=0, key=0, 
> new_comm=0xb7f4d8a4) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=1, key=1, 
> new_comm=0xb7f7a7bc) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=2, key=0, 
> new_comm=0xb7f778a4) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=2, key=1, 
> new_comm=0xb7ecf7bc) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=3, key=0, 
> new_comm=0xb7f398a4) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=3, key=1, 
> new_comm=0xb7f447bc) failed
> MPIR_Comm_create(90): Too many communicatorsrank 7 in job 1 
> ib-r21n1.irc.sara.nl_8763   caused collective abort of all ranks
>   exit status of rank 7: killed by signal 9
> rank 6 in job 1  ib-r21n1.irc.sara.nl_8763   caused collective abort of 
> all ranks
>   exit status of rank 6: killed by signal 9
> Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=1, key=0, 
> new_comm=0xb7f708a4) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000001, color=0, key=1, 
> new_comm=0xb7f4d7bc) failed
> MPIR_Comm_create(90): Too many communicatorsrank 5 in job 1 
> ib-r21n1.irc.sara.nl_8763   caused collective abort of all ranks
>   exit status of rank 5: return code 13
> rank 4 in job 1  ib-r21n1.irc.sara.nl_8763   caused collective abort of 
> all ranks
>   exit status of rank 4: killed by signal 9
> ====================================================
> 
> splittest:
> ====================================================
> Running with 8 processes
> will do 100000 splits and frees
> .Fatal error in MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=0, key=0, 
> new_comm=0xb7f2b8a4) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=0, key=0, 
> new_comm=0xb7f258a4) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=1, key=0, 
> new_comm=0xb7f168a4) failed
> MPIR_Comm_create(90): Too many communicatorsFatal error in 
> MPI_Comm_split: Other MPI error, error stack:
> MPI_Comm_split(290).: MPI_Comm_split(comm=0x84000002, color=1, key=0, 
> new_comm=0xb7f328a4) failed
> MPIR_Comm_create(90): Too many communicatorsrank 2 in job 3 
> ib-r21n1.irc.sara.nl_8763   caused collective abort of all ranks
>   exit status of rank 2: killed by signal 9
> rank 1 in job 3  ib-r21n1.irc.sara.nl_8763   caused collective abort of 
> all ranks
>   exit status of rank 1: killed by signal 9
> rank 0 in job 3  ib-r21n1.irc.sara.nl_8763   caused collective abort of 
> all ranks
>   exit status of rank 0: killed by signal 9
> ====================================================


-- 
********************************************************************
*                                                                  *
*  Bas van der Vlies                     e-mail: basv at sara.nl      *
*  SARA - Academic Computing Services    phone:  +31 20 592 8012   *
*  Kruislaan 415                         fax:    +31 20 6683167    *
*  1098 SJ Amsterdam                                               *
*                                                                  *
********************************************************************
-------------- next part --------------
A non-text attachment was scrubbed...
Name: duptest.c
Type: text/x-csrc
Size: 597 bytes
Desc: not available
URL: http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20070322/29f3a218/duptest.bin
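
The attachment itself was scrubbed by the archive, so the following is only a
minimal sketch of what duptest.c plausibly contains, inferred from the printed
output above ("will do 100000 dups and frees", a dot per batch, "done.").
Variable names, the dot interval, and the exact structure are assumptions, not
the original 597-byte source:

====================================================
/* duptest.c (hypothetical reconstruction -- the real attachment may
 * differ).  Repeatedly duplicate MPI_COMM_WORLD and free the duplicate
 * immediately, so at any moment only one extra communicator exists. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, i;
    const int iters = 100000;        /* "will do 100000 dups and frees" */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        printf("Running with %d processes\n", size);
        printf("will do %d dups and frees\n", iters);
    }

    for (i = 0; i < iters; i++) {
        MPI_Comm dup;
        MPI_Comm_dup(MPI_COMM_WORLD, &dup);
        MPI_Comm_free(&dup);               /* handle released every pass */
        if (rank == 0 && i % 1000 == 0) {  /* dot interval is a guess */
            printf(".");
            fflush(stdout);
        }
    }

    if (rank == 0)
        printf("done.\n");

    MPI_Finalize();
    return 0;
}
====================================================

Built and run as in the transcripts above (mpirun -np 8 ./a.out with
mvapich, mpiexec -n 8 ./a.out with mvapich2).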
-------------- next part --------------
A non-text attachment was scrubbed...
Name: splittest.c
Type: text/x-csrc
Size: 615 bytes
Desc: not available
URL: http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20070322/29f3a218/splittest.bin
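
Likewise, a minimal sketch of what splittest.c plausibly contains, matching
the "will do 100000 splits and frees" output; the color/key arguments and the
dot interval are assumptions:

====================================================
/* splittest.c (hypothetical reconstruction -- the real attachment may
 * differ).  Repeatedly split MPI_COMM_WORLD and free the resulting
 * communicator right away. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size, i;
    const int iters = 100000;      /* "will do 100000 splits and frees" */

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    if (rank == 0) {
        printf("Running with %d processes\n", size);
        printf("will do %d splits and frees\n", iters);
    }

    for (i = 0; i < iters; i++) {
        MPI_Comm split;
        /* color/key choice is an assumption; any split that is freed
           immediately afterwards exercises the same code path */
        MPI_Comm_split(MPI_COMM_WORLD, rank % 2, rank, &split);
        MPI_Comm_free(&split);             /* handle released every pass */
        if (rank == 0 && i % 1000 == 0) {  /* dot interval is a guess */
            printf(".");
            fflush(stdout);
        }
    }

    if (rank == 0)
        printf("done.\n");

    MPI_Finalize();
    return 0;
}
====================================================

Either way, each communicator is freed before the next one is created, so
neither loop should ever hold more than one extra communicator at a time; the
"Too many communicators" / "Pointer conversions exhausted" failures above
therefore suggest that freed handles are not being reclaimed inside the
affected MPI libraries rather than a leak in the test programs.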

