[mvapich-discuss] MVAPICH2-0.9.8 internal errors

David Gunter dog at lanl.gov
Thu Mar 22 13:20:03 EDT 2007


I am using the released 0.9.8 version.

-david

On Mar 22, 2007, at 9:37 AM, Dhabaleswar Panda wrote:

> Hi David,
>
> Thanks for this information. One more question: are you using the
> MVAPICH2 0.9.8 `released' version or the `branch' version (with some
> recent fixes)? Knowing this will help us.
>
> Thanks,
>
> DK
>
>
>> I have recompiled mvapich2 without using the --enable-debuginfo flag
>> and the problem has gone away.  However, I wish to have debuginfo
>> available to our TotalView users so hopefully this can be resolved.
>>
>> Here is the configuration that generates the error message I saw
>> previously:
>>
>> ./configure --prefix=/opt/mvapich2/mvapich2-0.9.8-gcc/ib \
>>     --with-openib=/usr/local/ofed --enable-romio \
>>     --with-file-system=ufs+nfs+panfs --enable-sharedlibs=gcc \
>>     --enable-debuginfo --enable-fast --with-mpe
>>
>> OFED is ofed-1.1
>>
>> CFLAGS=-D_X86_64_ -D_SMP_ -DUSE_HEADER_CACHING -DONE_SIDED
>> -DMPID_USE_SEQUENCE_NUMBERS -I/usr/local/ofed/include -O2
>> -D_SHMEM_COLL_
>>
>> CC=/usr/bin/gcc
>> CXX=/usr/bin/g++
>> FC=/usr/bin/gfortran
>> F77=/usr/bin/gfortran
>> F90=/usr/bin/gfortran
>>
>> The test was run on 32, 16, 8, and 4 processes, all of which
>> generated this error message.
>>
>> Thanks,
>> david
>>
>>
>>
>> --
>> David Gunter
>> HPC-4: HPC Environments: Parallel Tools Team
>> Los Alamos National Laboratory
>>
>>
>> On Mar 21, 2007, at 4:50 PM, wei huang wrote:
>>
>>> Hi,
>>>
>>> Thanks for letting us know about this problem.
>>>
>>> We will try to reproduce the problem on our cluster. To help us
>>> look into it, would you please let us know the following:
>>>
>>> 1) The exact CFLAGS you are using when configuring mvapich2 (are
>>> you using the default scripts provided by us?)
>>> 2) Any runtime environment variables you have set up?
>>> 3) On how many nodes do you run the test?
>>>
>>> Thanks.
>>>
>>> Regards,
>>> Wei Huang
>>>
>>> 774 Dreese Lab, 2015 Neil Ave,
>>> Dept. of Computer Science and Engineering
>>> Ohio State University
>>> OH 43210
>>> Tel: (614)292-8501
>>>
>>>
>>> On Wed, 21 Mar 2007, David Gunter wrote:
>>>
>>>> I have built mvapich2 for an OFED-based IB Opteron cluster.  When
>>>> running the Intel MPI Benchmarks (IMB3) I keep seeing the following
>>>> error message in many spots, although the tests run to completion:
>>>>
>>>> Internal error: communicator is already on free list
>>>>
>>>> What is this referring to?
>>>>
>>>> Thanks.
>>>> --david
>>>>
>>>> --
>>>> David Gunter
>>>> HPC-4: HPC Environments: Parallel Tools Team
>>>> Los Alamos National Laboratory
>>>>
>>>>
>>>>
>>>
>>
>>
>


