[mvapich-discuss] mvapich2-0.9.8 blacs problems (Another)
amith rajith mamidala
mamidala at cse.ohio-state.edu
Mon Mar 26 15:46:37 EDT 2007
Hi Bas,
Attached is a patch that resolves this error, which concerns the number of
communicator/group handles that are created. The
program runs fine with the patch applied. We are also investigating this in more depth and
will get back to you if we find any further issues.
Thanks,
Amith
On Mon, 26 Mar 2007, Bas van der Vlies wrote:
> Hello,
>
> We still have problems with mvapich2 0.9.8 + patches and blacs. Here
> is another file attached, build/run command:
>
> {{{
> mpif90 -o pdgemr2dtest.$brand -ff2c -Wall -g pdgemr2dtest.f90
> -lscalapack -lfblacs -lcblacs -lblacs -llapack -latlas
>
> echo 310 16 1000 | mpiexec -n $nprocs <program_name>
> }}}
>
>
> The problem always occurs when we run many loops: memory consumption grows
> steadily until the program crashes with the following error:
> {{{
>
> loop n mb nprocs npcol nprow 83 310 16 8 4 2
> loop n mb nprocs npcol nprow 84 310 16 8 4 2
> loop n mb nprocs npcol nprow 85 310 16 8 4 2
> rank 7 in job 1 ib-r6n18.irc.sara.nl_7000 caused collective abort of
> all ranks
> exit status of rank 7: killed by signal 9
> rank 6 in job 1 ib-r6n18.irc.sara.nl_7000 caused collective abort of
> all ranks
> exit status of rank 6: killed by signal 9
> rank 4 in job 1 ib-r6n18.irc.sara.nl_7000 caused collective abort of
> all ranks
> exit status of rank 4: killed by signal 9
> }}}
>
>
>
> --
> ********************************************************************
> * *
> * Bas van der Vlies e-mail: basv at sara.nl *
> * SARA - Academic Computing Services phone: +31 20 592 8012 *
> * Kruislaan 415 fax: +31 20 6683167 *
> * 1098 SJ Amsterdam *
> * *
> ********************************************************************
>
-------------- next part --------------
Index: create_2level_comm.c
===================================================================
--- create_2level_comm.c (revision 1120)
+++ create_2level_comm.c (working copy)
@@ -163,6 +163,8 @@
else{
comm_ptr->shmem_coll_ok = 0;
free_2level_comm(comm_ptr);
+ MPI_Group_free(&subgroup1);
+ MPI_Group_free(&comm_group);
}