[mvapich-discuss] MVAPICH2-0.9.8 internal errors

Dhabaleswar Panda panda at cse.ohio-state.edu
Thu Mar 22 11:37:56 EDT 2007


Hi David, 

Thanks for this information. One more question: are you using the
MVAPICH2 0.9.8 `released' version or the `branch' version (with some
recent fixes)? Knowing this will help us.

Thanks, 

DK


> I have recompiled mvapich2 without using the --enable-debuginfo flag  
> and the problem has gone away.  However, I wish to have debuginfo  
> available to our TotalView users so hopefully this can be resolved.
> 
> Here is the configuration that generates the error message I saw  
> previously:
> 
> ./configure --prefix=/opt/mvapich2/mvapich2-0.9.8-gcc/ib
>   --with-openib=/usr/local/ofed --enable-romio
>   --with-file-system=ufs+nfs+panfs --enable-sharedlibs=gcc
>   --enable-debuginfo --enable-fast --with-mpe
> 
> OFED is ofed-1.1
> 
> CFLAGS=-D_X86_64_ -D_SMP_ -DUSE_HEADER_CACHING -DONE_SIDED
>   -DMPID_USE_SEQUENCE_NUMBERS -I/usr/local/ofed/include -O2
>   -D_SHMEM_COLL_
> 
> CC=/usr/bin/gcc
> CXX=/usr/bin/g++
> FC=/usr/bin/gfortran
> F77=/usr/bin/gfortran
> F90=/usr/bin/gfortran
> 
> The test was run with 32, 16, 8, and 4 processes, all of which
> generated this error message.
> 
> Thanks,
> david
> 
> 
> 
> --
> David Gunter
> HPC-4: HPC Environments: Parallel Tools Team
> Los Alamos National Laboratory
> 
> 
> On Mar 21, 2007, at 4:50 PM, wei huang wrote:
> 
> > Hi,
> >
> > Thanks for letting us know about this problem.
> >
> > We will try to reproduce the problem on our cluster. To help us look
> > into it, would you please let us know the following:
> >
> > 1) The exact CFLAGS you are using when configuring mvapich2 (are you
> > using the default scripts provided by us?)
> > 2) Any runtime environment variables you have set up?
> > 3) On how many nodes do you run the test?
> >
> > Thanks.
> >
> > Regards,
> > Wei Huang
> >
> > 774 Dreese Lab, 2015 Neil Ave,
> > Dept. of Computer Science and Engineering
> > Ohio State University
> > OH 43210
> > Tel: (614)292-8501
> >
> >
> > On Wed, 21 Mar 2007, David Gunter wrote:
> >
> >> I have built mvapich2 for an OFED-based IB Opteron cluster.  When
> >> running the Intel MPI Benchmarks (IMB3) I keep seeing the following
> >> error messages in many spots, although the tests run to completion:
> >>
> >> Internal error: communicator is already on free list
> >>
> >> What is this referring to?
> >>
> >> Thanks.
> >> --david
> >>
> >> --
> >> David Gunter
> >> HPC-4: HPC Environments: Parallel Tools Team
> >> Los Alamos National Laboratory
> >>
> >>
> >>
> >
> 
> 


