[mvapich-discuss] MVAPICH2-0.9.8 internal errors

David Gunter dog at lanl.gov
Wed May 9 13:46:54 EDT 2007


I am curious to know whether you were able to reproduce this problem
and whether it has been fixed.

Thanks,
david
--
David Gunter
HPC-4: HPC Environments: Parallel Tools Team
Los Alamos National Laboratory


On Mar 22, 2007, at 11:20 AM, David Gunter wrote:

> I am using the released 0.9.8 version.
>
> -david
>
> On Mar 22, 2007, at 9:37 AM, Dhabaleswar Panda wrote:
>
>> Hi David,
>>
>> Thanks for this information. One more question: are you using the
>> `released' version of MVAPICH2 0.9.8 or the `branch' version (with some
>> recent fixes)? Knowing this will help us.
>>
>> Thanks,
>>
>> DK
>>
>>
>>> I have recompiled mvapich2 without using the --enable-debuginfo flag
>>> and the problem has gone away.  However, I wish to have debuginfo
>>> available to our TotalView users so hopefully this can be resolved.
>>>
>>> Here is the configuration that generates the error message I saw
>>> previously:
>>>
>>> ./configure --prefix=/opt/mvapich2/mvapich2-0.9.8-gcc/ib \
>>>     --with-openib=/usr/local/ofed --enable-romio \
>>>     --with-file-system=ufs+nfs+panfs --enable-sharedlibs=gcc \
>>>     --enable-debuginfo --enable-fast --with-mpe
>>>
>>> OFED is ofed-1.1
>>>
>>> CFLAGS=-D_X86_64_ -D_SMP_ -DUSE_HEADER_CACHING -DONE_SIDED -DMPID_USE_SEQUENCE_NUMBERS -I/usr/local/ofed/include -O2 -D_SHMEM_COLL_
>>>
>>> CC=/usr/bin/gcc
>>> CXX=/usr/bin/g++
>>> FC=/usr/bin/gfortran
>>> F77=/usr/bin/gfortran
>>> F90=/usr/bin/gfortran
>>>
>>> The test was run with 32, 16, 8, and 4 processes, all of which
>>> generated this error message.
>>>
>>> Thanks,
>>> david
>>>
>>>
>>>
>>> --
>>> David Gunter
>>> HPC-4: HPC Environments: Parallel Tools Team
>>> Los Alamos National Laboratory
>>>
>>>
>>> On Mar 21, 2007, at 4:50 PM, wei huang wrote:
>>>
>>>> Hi,
>>>>
>>>> Thanks for letting us know about this problem.
>>>>
>>>> We will try to reproduce the problem on our cluster. To help us look
>>>> into it, would you please let us know the following:
>>>>
>>>> 1) The exact CFLAGS you are using when configuring mvapich2 (are
>>>> you using the default scripts provided by us?)
>>>> 2) Any runtime environment variables you have set up?
>>>> 3) On how many nodes do you run the test?
>>>>
>>>> Thanks.
>>>>
>>>> Regards,
>>>> Wei Huang
>>>>
>>>> 774 Dreese Lab, 2015 Neil Ave,
>>>> Dept. of Computer Science and Engineering
>>>> Ohio State University
>>>> OH 43210
>>>> Tel: (614)292-8501
>>>>
>>>>
>>>> On Wed, 21 Mar 2007, David Gunter wrote:
>>>>
>>>>> I have built mvapich2 for an OFED-based IB Opteron cluster.  When
>>>>> running the Intel MPI Benchmarks (IMB3) I keep seeing the following
>>>>> error message in many places, although the tests run to completion:
>>>>>
>>>>> Internal error: communicator is already on free list
>>>>>
>>>>> What is this referring to?
>>>>>
>>>>> Thanks.
>>>>> --david
>>>>>
>>>>> --
>>>>> David Gunter
>>>>> HPC-4: HPC Environments: Parallel Tools Team
>>>>> Los Alamos National Laboratory
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>>
>>> _______________________________________________
>>> mvapich-discuss mailing list
>>> mvapich-discuss at cse.ohio-state.edu
>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>
>>
>
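
A minimal shell sketch of the comparison described in this thread: the same configure arguments with and without --enable-debuginfo, plus a count of how often the error message appears in a saved benchmark log. The function names without_flag and count_comm_errors, the CONFIGURE_ARGS variable, and the log file name imb.log are hypothetical and are not part of MVAPICH2 or IMB; the argument list is copied from the report quoted above.

```shell
#!/bin/sh
# Configure arguments as reported in the thread (the variable name is an assumption).
CONFIGURE_ARGS="--prefix=/opt/mvapich2/mvapich2-0.9.8-gcc/ib --with-openib=/usr/local/ofed --enable-romio --with-file-system=ufs+nfs+panfs --enable-sharedlibs=gcc --enable-debuginfo --enable-fast --with-mpe"

# Print the remaining arguments with every exact occurrence of the
# flag given as $1 removed, joined by single spaces.
without_flag() {
    flag=$1
    shift
    out=""
    for arg in "$@"; do
        [ "$arg" = "$flag" ] && continue
        out="${out:+$out }$arg"
    done
    printf '%s\n' "$out"
}

# Print the number of lines in log file $1 that contain the reported message.
# -F matches a fixed string; -c prints the count of matching lines.
count_comm_errors() {
    grep -c -F "Internal error: communicator is already on free list" "$1"
}

# Example use (not executed here):
#   ./configure $(without_flag --enable-debuginfo $CONFIGURE_ARGS)
#   count_comm_errors imb.log
```

Building once with the full argument list and once with the output of without_flag, then running the same benchmark against each build and comparing the counts, would show whether the message is tied to that one flag, as the report suggests.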
