[mvapich-discuss] Errors in BSBR when running xCbtest and xFbtest of the BLACS

Claudio J. Margulis claudio-margulis at uiowa.edu
Thu May 16 10:28:06 EDT 2013


I guess it would be useful if I also paste my SLmake.inc for ScaLAPACK:

############################################################################
#
#  Program:         ScaLAPACK
#
#  Module:          SLmake.inc
#
#  Purpose:         Top-level Definitions
#
#  Creation date:   February 15, 2000
#
#  Modified:        October 13, 2011
#
#  Send bug reports, comments or suggestions to scalapack at cs.utk.edu
#
############################################################################
#
#  C preprocessor definitions:  set CDEFS to one of the following:
#
#     -DNoChange (fortran subprogram names are lower case without any suffix)
#     -DUpCase   (fortran subprogram names are upper case without any suffix)
#     -DAdd_     (fortran subprogram names are lower case with "_" appended)

CDEFS         = -DAdd_

#
#  The fortran and C compilers, loaders, and their flags
#

FC            = /usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/bin/mpif90
CC            = /usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/bin/mpicc
NOOPT         = -O0
FCFLAGS       = -O3
CCFLAGS       = -O3
FCLOADER      = $(FC)
CCLOADER      = $(CC)
FCLOADFLAGS   = $(FCFLAGS)
CCLOADFLAGS   = $(CCFLAGS)

#
#  The archiver and the flag(s) to use when building archive (library)
#  Also the ranlib routine.  If your system has no ranlib, set RANLIB = echo
#

ARCH          = ar
ARCHFLAGS     = cr
RANLIB        = ranlib

#
#  The name of the ScaLAPACK library to be created
#

SCALAPACKLIB  = libscalapack.a

#
#  BLAS, LAPACK (and possibly other) libraries needed for linking test programs
#

#BLASLIB       =
LAPACKLIB     =
LIBS          = /shared/acml-4.4.0/gfortran64/lib/libacml.a
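
A quick way to double-check which compiler wrappers a given SLmake.inc selects is to pull the FC and CC values out of the file. The following is only a minimal sketch: slmake_value is a hypothetical helper name (not part of ScaLAPACK), and it assumes each assignment sits on a single "NAME = value" line.

```shell
#!/bin/sh
# slmake_value NAME FILE
# Hypothetical helper (not part of ScaLAPACK): print the value assigned to
# NAME in an SLmake.inc-style file, assuming "NAME = value" is on one line.
slmake_value() {
    # The pattern requires "=" right after NAME (plus optional blanks),
    # so asking for FC does not also match FCFLAGS or FCLOADER.
    sed -n "s/^$1[[:space:]]*=[[:space:]]*//p" "$2" | head -n 1
}

# Usage, commented out because it needs the real file and the MPI install:
#   slmake_value FC SLmake.inc
#   slmake_value CC SLmake.inc
#   "$(slmake_value CC SLmake.inc)" -show
```

The last usage line relies on the -show option that MPICH-derived compiler wrappers generally accept to print the underlying compiler command; it is worth confirming that it reports the intended gcc-4.5.1 toolchain.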



Claudio J. Margulis wrote:
> Dear Krishna, I don't think there are any special options. These were 
> the commands:
>
>
> gunzip mvapich2-1.9.tgz
> tar -xvf mvapich2-1.9.tar
> cd mvapich2-1.9
> export LD_LIBRARY_PATH=/shared/gcc-4.5.1/lib64:/shared/gcc-4.5.1/lib:/shared/mpc-0.8.2/lib:/shared/mpfr-3.0.0/lib:/shared/gmp-4.3.2/lib
> ./configure --prefix=/usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1 \
>     CC=/shared/gcc-4.5.1/bin/gcc CXX=/shared/gcc-4.5.1/bin/g++ \
>     F77=/shared/gcc-4.5.1/bin/gfortran FC=/shared/gcc-4.5.1/bin/gfortran
>  make -j 16 >&make.log &
> make install
>
>
> cd scalapack-mvapich2-1.9/
> tar -xvf scalapack-2.0.2.tar
> cd scalapack-2.0.2
> export LD_LIBRARY_PATH=/usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/lib:$LD_LIBRARY_PATH
>
> make all
> cd BLACS/TESTING/
> /usr/local/chemistry_software/mvapich2-1.9/gcc-4.5.1/bin/mpirun -np 16 ./xCbtest
>
> I don't want to paste all the errors I get but a sample follows for 
> the BSBR section:
>
> INTEGER BSBR TESTS: BEGIN.
>
> PROCESS {   0,   1} REPORTS ERRORS IN TEST#  2161:
>    Invalid element at A(   2,   1):
>    Expected=     -995413; Received=          -2
>    Complementory triangle overwrite at A(   1,   1):
>    Expected=          -2; Received=          -1
> PROCESS {   0,   1} DONE ERROR REPORT FOR TEST#  2161.
>
> PROCESS {   0,   1} REPORTS ERRORS IN TEST#  2162:
>    Invalid element at A(   2,   1):
>    Expected=     -219319; Received=          -2
> PROCESS {   0,   1} DONE ERROR REPORT FOR TEST#  2162.
>
> PROCESS {   0,   1} REPORTS ERRORS IN TEST#  3761:
>    Invalid element at A(   2,   1):
>    Expected=      574430; Received=          -2
>    Complementory triangle overwrite at A(   1,   1):
>    Expected=          -2; Received=          -1
> PROCESS {   0,   1} DONE ERROR REPORT FOR TEST#  3761.
>
> PROCESS {   0,   1} REPORTS ERRORS IN TEST#  4561:
>    Invalid element at A(   2,   1):
>    Expected=      716842; Received=          -2
>    Complementory triangle overwrite at A(   1,   1):
>    Expected=          -2; Received=          -1
> PROCESS {   0,   1} DONE ERROR REPORT FOR TEST#  4561.
>
> PROCESS {   1,   0} REPORTS ERRORS IN TEST#  1361:
>    Invalid element at A(   2,   1):
>    Expected=      862174; Received=          -2
>    Complementory triangle overwrite at A(   1,   1):
>    Expected=          -2; Received=          -1
> PROCESS {   1,   0} DONE ERROR REPORT FOR TEST#  1361.
>
> PROCESS {   1,   0} REPORTS ERRORS IN TEST#  2161:
>    Invalid element at A(   2,   1):
>    Expected=     -995413; Received=          -2
>    Complementory triangle overwrite at A(   1,   1):
>    Expected=          -2; Received=          -1
> PROCESS {   1,   0} DONE ERROR REPORT FOR TEST#  2161.
>
> PROCESS {   1,   0} REPORTS ERRORS IN TEST#  3761:
>    Invalid element at A(   2,   1):
>    Expected=      574430; Received=          -2
>
>
> These errors do not occur when using the old broadcast method (i.e. 
> with the environment variable MV2_USE_OLD_BCAST set to 1). There is also 
> the issue of timing, but let's deal with one thing at a time.
>
> Furthermore, it seems like I am not the only one getting these errors. 
> If you look at my original posting there is a link to:
> http://fpmd.ucdavis.edu/qbox-list/viewtopic.php?p=290
> which reports exactly the same issues.
>
> Do you have any special setting that I may not be aware of that might 
> result in successful output in your case?
>
> Thanks for your help.
> Cheers,
> Claudio
>
> Krishna Kandalla wrote:
>> Hello Claudio,
>>
>>     I just tried running the xdqr test with 16 processes (one node) 
>> on the TACC Stampede cluster. The overall execution time for this 
>> test, with or without this flag does not seem to vary much. I am 
>> seeing about 1.02 - 1.06s as the total time. This test also completes 
>> correctly without the env variable that we had discussed. And, if it 
>> helps, I am also seeing that this test takes about 1.7s with 
>> Open-MPI-1.6.4.
>>     If you are using any specific configure/run-time options for the 
>> MVAPICH2-1.9 library, could you please share the details?
>>
>> Thanks,
>> Krishna
>>
>> On Wed, May 15, 2013 at 10:31 AM, Claudio J. Margulis 
>> <claudio-margulis at uiowa.edu> wrote:
>>
>>     It seems that my mail didn't go through so I am resending it.
>>     Please read below.
>>     Claudio
>>
>>
>>     Claudio J. Margulis wrote:
>>
>>         Dear Krishna, thanks for responding.
>>         Yes, with that environmental variable the errors are gone.
>>         However, the run time for the tests becomes extremely long.
>>         As an example, a typical ScaLAPACK test,
>>         mpirun -np 16 ./xdqr <QR.dat, which takes a second to run with
>>         openmpi, takes on the order of minutes with mvapich2.
>>
>>         Claudio
>>
>>
>>     --     Claudio J. Margulis
>>
>>     Associate Professor of Chemistry
>>     The University of Iowa
>>     Margulis Group Page
>> <http://www.chem.uiowa.edu/faculty/margulis/group/first.html>
>>
>>
>
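
The with/without comparison described in the quoted message above could be scripted roughly as follows. This is only a sketch under stated assumptions: count_blacs_errors and failing_tests are hypothetical helper names, the log file names are made up, the tester output is assumed to have been redirected to a file, and the launcher is assumed to pass MV2_USE_OLD_BCAST through to all ranks.

```shell
#!/bin/sh
# Hypothetical helpers for summarising saved BLACS tester output.

# count_blacs_errors FILE
# Count the per-process error reports in a saved tester log.
count_blacs_errors() {
    # grep -c prints 0 but exits with status 1 when nothing matches;
    # "|| true" keeps that from aborting a script run under "set -e".
    grep -c 'REPORTS ERRORS IN TEST#' "$1" || true
}

# failing_tests FILE
# List each distinct failing test number once, in numeric order.
failing_tests() {
    sed -n 's/.*REPORTS ERRORS IN TEST# *\([0-9][0-9]*\):.*/\1/p' "$1" | sort -n | uniq
}

# Usage, commented out because it needs the MPI installation and the tester:
#   mpirun -np 16 ./xCbtest > new_bcast.log 2>&1
#   MV2_USE_OLD_BCAST=1 mpirun -np 16 ./xCbtest > old_bcast.log 2>&1
#   echo "new bcast: $(count_blacs_errors new_bcast.log) error reports"
#   echo "old bcast: $(count_blacs_errors old_bcast.log) error reports"
#   failing_tests new_bcast.log
```

Listing the distinct test numbers makes it easier to compare one run against another report, since the same test number can be reported by several processes.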

-- 
Claudio J. Margulis
Associate Professor of Chemistry
The University of Iowa
Margulis Group Page 
<http://www.chem.uiowa.edu/faculty/margulis/group/first.html>


