[mvapich-discuss] xcbrd tests

Bas van der Vlies basv at sara.nl
Sat Apr 7 03:31:11 EDT 2007


Shaun,

  First of all, we use the same Fortran compilers for all the packages;
that is why we sent the results to this list. I will read your web
pages, build the libraries as you suggested, and let you know what the
result is. My question is: which version of MVAPICH2 did you use? We
have version 0.9.8 with 4 patches applied.

Also, which gfortran/gcc version are you using? Ours is 4.1.1.

Regards, and a happy Easter

On Apr 7, 2007, at 6:03 AM, Shaun Rowland wrote:

> Bas van der Vlies wrote:
>> I just tried MVAPICH1 version 0.9.9 from SVN, and it also fails the
>> xcbrd test:
>> Relative machine precision (eps) is taken to be       0.596046E-07
>> Routines pass computational tests if scaled residual is less than 10.000
>> TIME      M      N  NB     P     Q  BRD Time      MFLOPS Residual  CHECK
>> ---- ------ ------ --- ----- ----- --------- ----------- -------- ------
>> WALL      4      4   2     1     1      0.00        0.00     0.58 PASSED
>> ||A - Q*B*P|| / (||A|| * N * eps) =                       NaN
>> WALL      4      4   3     1     1      0.00        0.00      NaN FAILED
>> ||A - Q*B*P|| / (||A|| * N * eps) =                       NaN
>> WALL      4      4   4     1     1      0.00        0.00      NaN FAILED
>
> Hi Bas. I've been looking into this issue for a while. I believe I
> know what the problem is. I built the following packages:
>
> BLACS
> ATLAS
> ScaLAPACK (using the two above)
>
> I built this set of packages four times, once for each of the following
> MPI installations:
>
> MVAPICH 0.9.9 with gfortran
> MVAPICH 0.9.9 with g77
> MVAPICH2 0.9.8 with gfortran
> MVAPICH2 0.9.8 with g77
>
> The only time I had a problem was when I accidentally built the ATLAS
> package with g77 and tried to use it with the other packages that were
> built with gfortran. I already suspected a compiler mismatch was the
> problem in your case, and that accidental build let me reproduce the
> same errors you reported. I believe your problem is that not all of
> your packages, including MVAPICH/MVAPICH2, were built with the same
> Fortran compiler. The Fortran compiler needs to be common to all of
> the builds; otherwise you will run into problems, and the problems
> won't be apparent until you try to use the libraries and get strange
> results. That is exactly the kind of situation you've reported.
>
> If you make sure that ScaLAPACK is built with either g77 or gfortran,
> matching the MVAPICH/MVAPICH2 build you want to test - and likewise
> all of ScaLAPACK's dependencies - then these strange problems should
> go away. The Fortran compiler needs to be the same all around. I have
> not yet run every ScaLAPACK test, but I ran the ones you reported and
> had no issues when the Fortran compilers were uniform. Only when a
> library built with the other compiler (g77 vs. gfortran) was
> introduced did I see the same behavior.
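>
> One quick way to spot a mismatched library (assuming GNU binutils is
> available, and using libscalapack.a here only as a stand-in for
> whichever library you want to check) is to look at the Fortran runtime
> symbols it references: gfortran-built code pulls in _gfortran_*
> symbols, while g77-built code references the libg2c runtime (s_wsle,
> do_fio, ...):
>
>   # undefined runtime symbols a gfortran build needs
>   nm libscalapack.a | grep ' U _gfortran_' | sort -u | head
>   # undefined runtime symbols a g77/libg2c build needs
>   nm libscalapack.a | grep -E ' U (s_wsle|do_fio)' | sort -u | head
>
> If one library in the stack shows the other compiler's runtime
> symbols, that is the mismatch to fix.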
>
> I have notes on how I built the packages listed above:
>
> BLACS
> -----
> http://www.cse.ohio-state.edu/~rowland/work/blacs.html
>
> ATLAS
> -----
> http://www.cse.ohio-state.edu/~rowland/work/atlas.html
>
> ScaLAPACK
> ---------
> http://www.cse.ohio-state.edu/~rowland/work/scalapack.html
>
> Maybe those notes will be useful to compare with. On a side note: if
> you are using shared library builds of MVAPICH/MVAPICH2 to test, be
> sure that the path to libmpich.a does not appear in any of these
> configuration files, because the mpicc and mpif77 commands put the
> path of the shared library directly into the resulting binary, and
> statically linking libmpich.a on top of that in this odd way will
> cause problems. No one would or should normally link things that way,
> but since these configuration files have to be edited by hand, it is
> an easy mistake to make. If you are using a static library build of
> MVAPICH/MVAPICH2, this does not matter. The steps I've outlined note
> this appropriately and do the right thing.
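>
> As a rough sanity check (assuming the usual configuration file names,
> SLmake.inc for ScaLAPACK and Bmake.inc for the BLACS), you can look
> for a hard-coded libmpich.a path and see what a resulting test binary
> actually links against:
>
>   grep -n 'libmpich' SLmake.inc Bmake.inc
>   ldd ./xcbrd | grep -i mpi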
>
> Also, you do need GFORTRAN_UNBUFFERED_ALL=y set in your environment
> for the gfortran cases. For MVAPICH2, simply export that variable. For
> MVAPICH it needs to be specified on the mpirun_rsh command line, for
> example:
>
>   mpirun_rsh -np 4 host1 host2 host3 host4 GFORTRAN_UNBUFFERED_ALL=y ./test
>
> This is noted on the web pages above too.
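>
> For the MVAPICH2 case, the equivalent would be something along these
> lines (using the xcbrd test from this thread as the example program):
>
>   export GFORTRAN_UNBUFFERED_ALL=y
>   mpiexec -n 4 ./xcbrd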
> -- 
> Shaun Rowland	rowland at cse.ohio-state.edu
> http://www.cse.ohio-state.edu/~rowland/

--
Bas van der Vlies
basv at sara.nl
