[mvapich-discuss] Errors in BSBR when running xCbtest and xFbtest
of the BLACS
Krishna Kandalla
kandalla at cse.ohio-state.edu
Mon May 20 16:57:39 EDT 2013
Hello Claudio,
Thanks for checking back with us. We are working on this issue. We will
be posting an update on the mvapich-discuss list soon.
Thanks,
Krishna
On Mon, May 20, 2013 at 4:07 PM, Margulis, Claudio J <
claudio-margulis at uiowa.edu> wrote:
> Dear Krishna, I very much appreciate you looking into this and passing
> it to developers. I am replying to the mailing list so that the thread does
> not remain without conclusion. Hopefully MVAPICH2 will be able to properly
> deal with scalapack and the BLACS in future releases. I think it is
> important for people compiling their codes with MVAPICH2 to know that this
> release version does not pass the BLACS tests.
>
> Is there a way we can get informed when a patch is released that will
> resolve this issue? Many quantum mechanics codes use these linear algebra
> routines.
>
> Thanks,
> cheers
> Claudio
>
> ------------------------------
> *From:* krishna.kandalla at gmail.com [krishna.kandalla at gmail.com] on behalf
> of Krishna Kandalla [kandalla at cse.ohio-state.edu]
> *Sent:* Thursday, May 16, 2013 10:19 AM
> *To:* Margulis, Claudio J
> *Cc:* MVAPICH-Core
> *Subject:* Re: [mvapich-discuss] Errors in BSBR when running xCbtest and
> xFbtest of the BLACS
>
> Hi Claudio,
> Thanks for sharing the details. We see the same error message
> with the xCbtest. We will continue working on this issue.
> (I am CC'ing our internal developer list)
>
> Thanks,
> Krishna
>
> On Thu, May 16, 2013 at 10:28 AM, Claudio J. Margulis <
> claudio-margulis at uiowa.edu> wrote:
>
>> I guess it would be useful if I also paste my SLmake.inc for ScaLAPACK:
>>
>> ############################################################################
>> #
>> # Program: ScaLAPACK
>> #
>> # Module: SLmake.inc
>> #
>> # Purpose: Top-level Definitions
>> #
>> # Creation date: February 15, 2000
>> #
>> # Modified: October 13, 2011
>> #
>> # Send bug reports, comments or suggestions to scalapack at cs.utk.edu
>> #
>> ############################################################################
>> #
>> # C preprocessor definitions: set CDEFS to one of the following:
>> #
>> # -DNoChange (fortran subprogram names are lower case without any suffix)
>> # -DUpCase (fortran subprogram names are upper case without any suffix)
>> # -DAdd_ (fortran subprogram names are lower case with "_" appended)
>>
>> CDEFS = -DAdd_
>>
>> #
>> # The fortran and C compilers, loaders, and their flags
>> #
>>
>> FC = /usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/bin/mpif90
>> CC = /usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/bin/mpicc
>> NOOPT = -O0
>> FCFLAGS = -O3
>> CCFLAGS = -O3
>> FCLOADER = $(FC)
>> CCLOADER = $(CC)
>> FCLOADFLAGS = $(FCFLAGS)
>> CCLOADFLAGS = $(CCFLAGS)
>>
>> #
>> # The archiver and the flag(s) to use when building archive (library)
>> # Also the ranlib routine. If your system has no ranlib, set RANLIB = echo
>> #
>>
>> ARCH = ar
>> ARCHFLAGS = cr
>> RANLIB = ranlib
>>
>> #
>> # The name of the ScaLAPACK library to be created
>> #
>>
>> SCALAPACKLIB = libscalapack.a
>>
>> #
>> # BLAS, LAPACK (and possibly other) libraries needed for linking test programs
>> #
>>
>> #BLASLIB =
>> LAPACKLIB =
>> LIBS = /shared/acml-4.4.0/gfortran64/lib/libacml.a
>>
>>
>>
>>
>> Claudio J. Margulis wrote:
>>
>>> Dear Krishna, I don't think there are any special options. These were the
>>> commands:
>>>
>>>
>>> gunzip mvapich2-1.9.tgz
>>> tar -xvf mvapich2-1.9.tar
>>> cd mvapich2-1.9
>>> export LD_LIBRARY_PATH=/shared/gcc-4.5.1/lib64:/shared/gcc-4.5.1/lib:/shared/mpc-0.8.2/lib:/shared/mpfr-3.0.0/lib:/shared/gmp-4.3.2/lib
>>> ./configure --prefix=/usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1 CC=/shared/gcc-4.5.1/bin/gcc CXX=/shared/gcc-4.5.1/bin/g++ F77=/shared/gcc-4.5.1/bin/gfortran FC=/shared/gcc-4.5.1/bin/gfortran
>>> make -j 16 >&make.log &
>>> make install
>>>
>>>
>>> cd scalapack-mvapich2-1.9/
>>> tar -xvf scalapack-2.0.2.tar
>>> cd scalapack-2.0.2
>>> export LD_LIBRARY_PATH=/usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/lib:$LD_LIBRARY_PATH
>>> make all
>>> cd BLACS/TESTING/
>>> /usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/bin/mpirun -np 16 ./xCbtest
>>>
>>> I don't want to paste all the errors I get but a sample follows for the
>>> BSBR section:
>>>
>>> INTEGER BSBR TESTS: BEGIN.
>>>
>>> PROCESS { 0, 1} REPORTS ERRORS IN TEST# 2161:
>>> Invalid element at A( 2, 1):
>>> Expected= -995413; Received= -2
>>> Complementory triangle overwrite at A( 1, 1):
>>> Expected= -2; Received= -1
>>> PROCESS { 0, 1} DONE ERROR REPORT FOR TEST# 2161.
>>>
>>> PROCESS { 0, 1} REPORTS ERRORS IN TEST# 2162:
>>> Invalid element at A( 2, 1):
>>> Expected= -219319; Received= -2
>>> PROCESS { 0, 1} DONE ERROR REPORT FOR TEST# 2162.
>>>
>>> PROCESS { 0, 1} REPORTS ERRORS IN TEST# 3761:
>>> Invalid element at A( 2, 1):
>>> Expected= 574430; Received= -2
>>> Complementory triangle overwrite at A( 1, 1):
>>> Expected= -2; Received= -1
>>> PROCESS { 0, 1} DONE ERROR REPORT FOR TEST# 3761.
>>>
>>> PROCESS { 0, 1} REPORTS ERRORS IN TEST# 4561:
>>> Invalid element at A( 2, 1):
>>> Expected= 716842; Received= -2
>>> Complementory triangle overwrite at A( 1, 1):
>>> Expected= -2; Received= -1
>>> PROCESS { 0, 1} DONE ERROR REPORT FOR TEST# 4561.
>>>
>>> PROCESS { 1, 0} REPORTS ERRORS IN TEST# 1361:
>>> Invalid element at A( 2, 1):
>>> Expected= 862174; Received= -2
>>> Complementory triangle overwrite at A( 1, 1):
>>> Expected= -2; Received= -1
>>> PROCESS { 1, 0} DONE ERROR REPORT FOR TEST# 1361.
>>>
>>> PROCESS { 1, 0} REPORTS ERRORS IN TEST# 2161:
>>> Invalid element at A( 2, 1):
>>> Expected= -995413; Received= -2
>>> Complementory triangle overwrite at A( 1, 1):
>>> Expected= -2; Received= -1
>>> PROCESS { 1, 0} DONE ERROR REPORT FOR TEST# 2161.
>>>
>>> PROCESS { 1, 0} REPORTS ERRORS IN TEST# 3761:
>>> Invalid element at A( 2, 1):
>>> Expected= 574430; Received= -2
>>>
>>>
>>> These errors do not occur when using the old broadcast method (i.e., with
>>> the environment variable MV2_USE_OLD_BCAST set to 1). There is also the
>>> issue of timing, but let's deal with one thing at a time.
>>>
>>> Furthermore, it seems like I am not the only one getting these errors.
>>> If you look at my original posting there is a link to:
>>> http://fpmd.ucdavis.edu/qbox-list/viewtopic.php?p=290
>>> which reports exactly the same issues.
>>>
>>> Do you have any special setting that I may not be aware of that might
>>> result in successful output in your case?
>>>
>>> Thanks for your help.
>>> Cheers,
>>> Claudio
>>>
>>> Krishna Kandalla wrote:
>>>
>>>> Hello Claudio,
>>>>
>>>> I just tried running the xdqr test with 16 processes (one node) on
>>>> the TACC Stampede cluster. The overall execution time for this test, with
>>>> or without this flag does not seem to vary much. I am seeing about 1.02 -
>>>> 1.06s as the total time. This test also completes correctly without the env
>>>> variable that we had discussed. And, if it helps, I am also seeing that
>>>> this test takes about 1.7s with Open-MPI-1.6.4.
>>>> If you are using any specific configure/run-time options for the
>>>> MVAPICH2-1.9 library, could you please share the details?
>>>>
>>>> Thanks,
>>>> Krishna
>>>>
>>>> On Wed, May 15, 2013 at 10:31 AM, Claudio J. Margulis <
>>>> claudio-margulis at uiowa.edu> wrote:
>>>>
>>>> It seems that my mail didn't go through, so I am resending it.
>>>> Please read below.
>>>> Claudio
>>>>
>>>>
>>>> Claudio J. Margulis wrote:
>>>>
>>>> Dear Krishna, thanks for responding.
>>>> Yes, with that environment variable the errors are gone.
>>>> However, the run time for the tests becomes extremely long.
>>>> As an example, a typical ScaLAPACK test,
>>>> mpirun -np 16 ./xdqr <QR.dat, which takes a second to run with
>>>> Open MPI, takes on the order of minutes with MVAPICH2.
>>>>
>>>> Claudio
>>>>
>>>>
>>>> --
>>>> Claudio J. Margulis
>>>> Associate Professor of Chemistry
>>>> The University of Iowa
>>>> Margulis Group Page
>>>> <http://www.chem.uiowa.edu/faculty/margulis/group/first.html>
>>>>
>>>>
>>>>
>>>
>> --
>> Claudio J. Margulis
>> Associate Professor of Chemistry
>> The University of Iowa
>> Margulis Group Page <http://www.chem.uiowa.edu/faculty/margulis/group/first.html>
>>
>>
>
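
For readers comparing runs with and without MV2_USE_OLD_BCAST, here is a
minimal sketch for counting failures in saved tester output. It assumes the
output was redirected to a file and that each failing test produces a
"REPORTS ERRORS IN TEST#" line as in the sample quoted above. The helper name
count_blacs_errors is hypothetical; it is not part of BLACS or MVAPICH2.

```shell
# Hypothetical helper (not part of BLACS or MVAPICH2): count the error
# reports in a saved tester log. The pattern is taken from the sample
# output quoted in this thread and is an assumption about the log format.
count_blacs_errors() {
    # grep -c prints the number of matching lines. It exits non-zero when
    # there are no matches, so "|| true" keeps the function usable in
    # scripts that run under "set -e".
    grep -c 'REPORTS ERRORS IN TEST#' "$1" || true
}
```

One possible use, assuming the launcher forwards the environment to every
rank: save the output of a default run and of a run with MV2_USE_OLD_BCAST=1
to two separate files, then call count_blacs_errors on each file and compare
the two counts.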