[mvapich-discuss] compile charm++ and namd with mvapich 1.0.1 and/or mvapich2

Mehdi Bozzo-Rey mbozzore at platform.com
Fri Aug 15 08:34:35 EDT 2008


Hello Vlad,

 

I also have a lot of applications and libraries in Fortran, which is why I used gfortran (part of the GCC suite anyway) as the compiler for Fortran 77 and Fortran 90.

 

Please note that in that case you need to export the following environment variable at compile time: F77_GETARGDECL=" ".

 

At run time, you will also have to run the interactive Fortran examples with the following variable set: GFORTRAN_UNBUFFERED_ALL=y, as mentioned in the user guide (sections 7.1.5 and 7.1.6: http://mvapich.cse.ohio-state.edu/support/mvapich_user_guide.html ).

 

The reason is that, by default, gfortran buffers I/O (http://gcc.gnu.org/onlinedocs/gfortran/GFORTRAN_005fUNBUFFERED_005fALL.html), and without this option the example will appear to hang.
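
For reference, a minimal sketch of where the two variables go; the configure options, hostfile and example binary (fpi) are just placeholders, and mpirun_rsh takes VAR=VALUE pairs right before the executable:

  # compile time: export before running the build (configure or the make.mvapich.* script)
  export F77_GETARGDECL=" "
  ./configure --prefix=$HOME/mvapich2 --enable-f77 --enable-f90 && make && make install

  # run time: disable gfortran's I/O buffering so interactive Fortran examples do not appear to hang
  mpirun_rsh -np 2 -hostfile ./hosts GFORTRAN_UNBUFFERED_ALL=y ./fpi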

 

I also plan to use the Intel compilers and the Portland Group compilers for some applications.

 

Unfortunately, I don't have access to Debian boxes ... our cluster stack is more Red Hat (or CentOS) oriented (for now) ...

 

I also tried out 1.2rc1, configured as follows (from config.log):

-----------------------------

It was created by configure, which was generated by GNU Autoconf 2.59.
Invocation command line was

  $ ./configure --prefix=/home/mbozzore/mvapich2 --enable-f77 --enable-f90 --enable-cxx --enable-sharedlibs=gcc --with-ib-libpath=/opt/ofed/lib64 --with-ib-include=/opt/ofed/include/ --with-rdma=gen2

-----------------------------

 

Cheers,

 

Mehdi

 

Mehdi Bozzo-Rey <mailto:mbozzore at platform.com> 

Open Source Solution Developer

Platform OCS5 <http://www.platform.com/Products/platform-open-cluster-stack5> 

Platform computing

Phone: +1 905 948 4649

 

 

 

 

 

 

From: Vlad Cojocaru [mailto:Vlad.Cojocaru at eml-r.villa-bosch.de] 
Sent: August-15-08 8:04 AM
To: Mehdi Bozzo-Rey
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] compile charm++ and namd with mvapich 1.0.1 and/or mvapich2

 

Thanks, Mehdi, for all the details.

I guess you mean gcc when you say gfortran ... NAMD is not written in Fortran but in Charm++, which is an adaptation of C++ ...

Well, we have Debian here, so we used Debian packages to install the InfiniBand libs and headers (our system administrator did that). Then I tried to compile mvapich 1.0.1 and found that I needed to drastically change the make.mvapich.gen2 file in order to get it to build (the defaults for $IBHOME are very strange ... we have everything in /usr/include/infiniband and /usr/lib/infiniband). In the end I managed to get it built, but namd hangs ...
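
(For what it's worth, the idea was just to point the build script at the system locations before running it, roughly like this; IBHOME is the variable the script actually uses, while the exact paths are only an illustration and the script may still need hand-editing if it does not honor the override:)

  # Debian keeps the verbs headers under /usr/include/infiniband, so use /usr as the base
  export IBHOME=/usr
  # if the script hard-codes $IBHOME/lib64 for the libraries, edit that to point at /usr/lib instead
  ./make.mvapich.gen2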

So I decided to try mvapich2 (version 1.2rc1) and found lots of problems. Some of them I could fix, but some are very strange. For instance, the entire source tree contains lots of references to strange directories like /home/daffy ... or /home/7 ... and so on. Some of them I replaced with ${master_top_srcdir}, since I figured out that one should replace them, but for others I don't know ... Also, when I tried to build with shared libs, make was not able to build the mpiname application ... I could not figure out why ...
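
(A quick way to at least enumerate those leftover references before deciding which ones are safe to patch; the two directory names are just the ones mentioned above:)

  # list every file in the mvapich2 tree that still carries hard-coded build-host paths
  grep -rl -e '/home/daffy' -e '/home/7' .
  # review each hit individually before substituting ${master_top_srcdir} or the like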

So, lots of problems ... I'll try to figure them out ... However, the problems with mvapich2 look more like bugs in the Makefiles ... so maybe somebody would like to fix those ...

Cheers
vlad


Mehdi Bozzo-Rey wrote: 

Hi Vlad,

 

No, I did not use the Intel compilers (not yet). I used gfortran. More precisely:

 

OS: 

 

RHEL 5.1 (Kernel 2.6.18-53.el5)

 

[mbozzore at tyan04 ~]$ mpicc --version

gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)

 

[mbozzore at tyan04 ~]$ mpicxx --version

g++ (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)

 

[mbozzore at tyan04 ~]$ mpif77 --version

GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)

 

[mbozzore at tyan04 ~]$ mpif90 --version

GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)

 

Hardware: Intel quad-core nodes, with a Topspin switch and HCAs for IB.

 

 

Yes, I used OFED (1.3).

 

I did not enable sharedlibs for that build.

 

I will double-check, but if I remember correctly, everything compiled fine on the mvapich2 side. What version did you use?

 

Cheers,

 

Mehdi

 

Mehdi Bozzo-Rey
Open Source Solution Developer
Platform computing
Phone : +1 905 948 4649
E-mail : mbozzore at platform.com

 

 

 

From: Cojocaru,Vlad [mailto:vlad.cojocaru at eml-r.villa-bosch.de] 
Sent: August-14-08 4:35 PM
To: Mehdi Bozzo-Rey; mvapich-discuss at cse.ohio-state.edu
Subject: RE: [mvapich-discuss] compile charm++ and namd with mvapich 1.0.1 and/or mvapich2

 

Hi Mehdi,

Did you use Intel 10.1 as well? Did you build on OpenFabrics? What compiler flags did you pass to the mvapich build? Did you build with --enable sharedlib or without? I would be grateful if you could give me some details on how you built mvapich.
Thanks for the reply. Yes, there is something about the compilation of mvapich. As I said, I successfully compiled NAMD on a cluster that already had mvapich compiled with Intel as the default MPI lib. However, on the new cluster (quad-core AMD Opterons with Mellanox InfiniBand) I got these problems. So, it's definitely the mvapich build that fails, although I don't get any errors from make.

Any idea why the mpiname application fails to compile when compiling mvapich2?

Thanks again

Best wishes
vlad


-----Original Message-----
From: Mehdi Bozzo-Rey [mailto:mbozzore at platform.com]
Sent: Thu 8/14/2008 7:20 PM
To: Cojocaru,Vlad; mvapich-discuss at cse.ohio-state.edu
Subject: RE: [mvapich-discuss] compile charm++ and namd with mvapich 1.0.1 and/or mvapich2

Hello Vlad,


I just recompiled NAMD and it looks OK to me (output of a simple test below). I guess the problem is on the compilation side.

Best regards,

Mehdi

Mehdi Bozzo-Rey
Open Source Solution Developer
Platform computing
Phone : +1 905 948 4649
E-mail : mbozzore at platform.com


[mbozzore at tyan04 Linux-amd64-MPI]$ mpirun_rsh -np 8 -hostfile ./hosts.8 ./namd2 src/alanin
Charm++> Running on MPI version: 1.2 multi-thread support: 1/1
Charm warning> Randomization of stack pointer is turned on in Kernel, run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable it. Thread migration may not work!
Info: NAMD 2.6 for Linux-amd64-MPI
Info:
Info: Please visit http://www.ks.uiuc.edu/Research/namd/
Info: and send feedback or bug reports to namd at ks.uiuc.edu
Info:
Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
Info: in all publications reporting results obtained with NAMD.
Info:
Info: Based on Charm++/Converse 50914 for mpi-linux-x86_64-gfortran-smp-mpicxx
Info: Built Thu Aug 14 13:12:02 EDT 2008 by mbozzore on tyan04.lsf.platform.com
Info: 1 NAMD  2.6  Linux-amd64-MPI  8    compute-00-00.ocs5.org  mbozzore
Info: Running on 8 processors.
Info: 8208 kB of memory in use.
Info: Memory usage based on mallinfo
Info: Changed directory to src
Info: Configuration file is alanin
TCL: Suspending until startup complete.
Info: SIMULATION PARAMETERS:
Info: TIMESTEP               0.5
Info: NUMBER OF STEPS        9
Info: STEPS PER CYCLE        3
Info: LOAD BALANCE STRATEGY  Other
Info: LDB PERIOD             600 steps
Info: FIRST LDB TIMESTEP     15
Info: LDB BACKGROUND SCALING 1
Info: HOM BACKGROUND SCALING 1
Info: MAX SELF PARTITIONS    50
Info: MAX PAIR PARTITIONS    20
Info: SELF PARTITION ATOMS   125
Info: PAIR PARTITION ATOMS   200
Info: PAIR2 PARTITION ATOMS  400
Info: MIN ATOMS PER PATCH    100
Info: INITIAL TEMPERATURE    0
Info: CENTER OF MASS MOVING INITIALLY? NO
Info: DIELECTRIC             1
Info: EXCLUDE                SCALED ONE-FOUR
Info: 1-4 SCALE FACTOR       0.4
Info: NO DCD TRAJECTORY OUTPUT
Info: NO EXTENDED SYSTEM TRAJECTORY OUTPUT
Info: NO VELOCITY DCD OUTPUT
Info: OUTPUT FILENAME        output
Info: BINARY OUTPUT FILES WILL BE USED
Info: NO RESTART FILE
Info: SWITCHING ACTIVE
Info: SWITCHING ON           7
Info: SWITCHING OFF          8
Info: PAIRLIST DISTANCE      9
Info: PAIRLIST SHRINK RATE   0.01
Info: PAIRLIST GROW RATE     0.01
Info: PAIRLIST TRIGGER       0.3
Info: PAIRLISTS PER CYCLE    2
Info: PAIRLISTS ENABLED
Info: MARGIN                 1
Info: HYDROGEN GROUP CUTOFF  2.5
Info: PATCH DIMENSION        12.5
Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
Info: TIMING OUTPUT STEPS    15
Info: USING VERLET I (r-RESPA) MTS SCHEME.
Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
Info: RANDOM NUMBER SEED     1218734148
Info: USE HYDROGEN BONDS?    NO
Info: COORDINATE PDB         alanin.pdb
Info: STRUCTURE FILE         alanin.psf
Info: PARAMETER file: XPLOR format! (default)
Info: PARAMETERS             alanin.params
Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
Info: SUMMARY OF PARAMETERS:
Info: 61 BONDS
Info: 179 ANGLES
Info: 38 DIHEDRAL
Info: 42 IMPROPER
Info: 0 CROSSTERM
Info: 21 VDW
Info: 0 VDW_PAIRS
Info: ****************************
Info: STRUCTURE SUMMARY:
Info: 66 ATOMS
Info: 65 BONDS
Info: 96 ANGLES
Info: 31 DIHEDRALS
Info: 32 IMPROPERS
Info: 0 CROSSTERMS
Info: 0 EXCLUSIONS
Info: 195 DEGREES OF FREEDOM
Info: 55 HYDROGEN GROUPS
Info: TOTAL MASS = 783.886 amu
Info: TOTAL CHARGE = 8.19564e-08 e
Info: *****************************
Info: Entering startup phase 0 with 8208 kB of memory in use.
Info: Entering startup phase 1 with 8208 kB of memory in use.
Info: Entering startup phase 2 with 8208 kB of memory in use.
Info: Entering startup phase 3 with 8208 kB of memory in use.
Info: PATCH GRID IS 1 BY 1 BY 1
Info: REMOVING COM VELOCITY 0 0 0
Info: LARGEST PATCH (0) HAS 66 ATOMS
Info: CREATING 11 COMPUTE OBJECTS
Info: Entering startup phase 4 with 8208 kB of memory in use.
Info: Entering startup phase 5 with 8208 kB of memory in use.
Info: Entering startup phase 6 with 8208 kB of memory in use.
Measuring processor speeds... Done.
Info: Entering startup phase 7 with 8208 kB of memory in use.
Info: CREATING 11 COMPUTE OBJECTS
Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
Info: NONBONDED TABLE SIZE: 705 POINTS
Info: ABSOLUTE IMPRECISION IN FAST TABLE ENERGY: 3.38813e-21 AT 7.99609
Info: RELATIVE IMPRECISION IN FAST TABLE ENERGY: 1.27241e-16 AT 7.99609
Info: ABSOLUTE IMPRECISION IN FAST TABLE FORCE: 6.77626e-21 AT 7.99609
Info: RELATIVE IMPRECISION IN FAST TABLE FORCE: 1.1972e-16 AT 7.99609
Info: Entering startup phase 8 with 8208 kB of memory in use.
Info: Finished startup with 8208 kB of memory in use.
ETITLE:      TS           BOND          ANGLE          DIHED          IMPRP               ELECT            VDW       BOUNDARY           MISC        KINETIC               TOTAL           TEMP         TOTAL2         TOTAL3        TEMPAVG

ENERGY:       0         0.0050         0.4192         0.0368         0.4591           -210.1610         1.0506         0.0000         0.0000         0.0000           -208.1904         0.0000      -208.1877      -208.1877         0.0000

ENERGY:       1         0.0051         0.4196         0.0367         0.4585           -210.1611         1.0184         0.0000         0.0000         0.0325           -208.1905         0.1675      -208.1878      -208.1877         0.1675

ENERGY:       2         0.0058         0.4208         0.0365         0.4568           -210.1610         0.9219         0.0000         0.0000         0.1285           -208.1907         0.6632      -208.1881      -208.1877         0.6632

ENERGY:       3         0.0092         0.4232         0.0361         0.4542           -210.1599         0.7617         0.0000         0.0000         0.2845           -208.1910         1.4683      -208.1885      -208.1878         1.4683

ENERGY:       4         0.0176         0.4269         0.0356         0.4511           -210.1565         0.5386         0.0000         0.0000         0.4952           -208.1914         2.5561      -208.1890      -208.1878         2.5561

ENERGY:       5         0.0327         0.4327         0.0350         0.4480           -210.1489         0.2537         0.0000         0.0000         0.7552           -208.1917         3.8977      -208.1894      -208.1879         3.8977

ENERGY:       6         0.0552         0.4409         0.0343         0.4454           -210.1354        -0.0915         0.0000         0.0000         1.0592           -208.1920         5.4666      -208.1898      -208.1880         5.4666

ENERGY:       7         0.0839         0.4522         0.0334         0.4440           -210.1137        -0.4951         0.0000         0.0000         1.4031           -208.1922         7.2418      -208.1900      -208.1882         7.2418

ENERGY:       8         0.1162         0.4674         0.0325         0.4448           -210.0822        -0.9550         0.0000         0.0000         1.7839           -208.1923         9.2074      -208.1902      -208.1883         9.2074

ENERGY:       9         0.1492         0.4870         0.0315         0.4485           -210.0391        -1.4687         0.0000         0.0000         2.1990           -208.1925        11.3497      -208.1905      -208.1884        11.3497

WRITING EXTENDED SYSTEM TO OUTPUT FILE AT STEP 9
WRITING COORDINATES TO OUTPUT FILE AT STEP 9
WRITING VELOCITIES TO OUTPUT FILE AT STEP 9
==========================================
WallClock: 4.172574  CPUTime: 4.167367  Memory: 8208 kB
End of program





-----Original Message-----
From: mvapich-discuss-bounces at cse.ohio-state.edu [mailto:mvapich-discuss-bounces at cse.ohio-state.edu] On Behalf Of Vlad Cojocaru
Sent: August-14-08 11:32 AM
To: mvapich-discuss at cse.ohio-state.edu
Subject: [mvapich-discuss] compile charm++ and namd with mvapich 1.0.1 and/or mvapich2

Dear mvapich users,

I tried to compile mvapich 1.0.1, charm++ and namd on our new Linux-amd64
InfiniBand cluster using the Intel 10.1.015 compilers. I managed to build
mvapich 1.0.1 and tested the programs in the examples directory. Then I
built charm++ and tested it with "mpirun_rsh -n 2"; all tests passed
correctly. Then I built namd on top of mvapich 1.0.1 and charm++.
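
(For context, the sequence was roughly along these lines; the Charm++ build
target, options, test program and hostfile below are illustrative placeholders
rather than the exact commands I used:)

  # build Charm++ on top of the freshly built MPI
  ./build charm++ mpi-linux-x86_64 --with-production
  # sanity-check the Charm++ build, e.g. with its megatest suite
  cd mpi-linux-x86_64/tests/charm++/megatest
  make pgm
  mpirun_rsh -np 2 -hostfile ./hosts ./pgm
  # then point the NAMD build at this Charm++ tree and compile namd2 as usual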

Everything seemed OK, except that the namd executable hangs without error
messages. In fact, it appears as if it is still running, but it doesn't
produce any output. If I repeat exactly the same procedure with OpenMPI
instead of mvapich, everything works fine (however, I am not so happy
about the scaling of OpenMPI on InfiniBand).

Does anyone have experience with installing namd using mvapich 1.0.1? If
yes, any idea why this happens? I must say that when I did the same on
another cluster, which already had mvapich 1.0.1 compiled with the Intel
compilers, everything worked out correctly. So, it must be something
about the compilation of mvapich 1.0.1 on our new InfiniBand setup that
creates the problem.

The German in the error below simply says that the executable "mpiname" was not found.

Best wishes
vlad

----------------------------------error------------------------------------------------------------------------
I also tried mvapich2, but the compilation fails when installing the
mpiname application (see error below), which apparently fails to compile
(no executable is found in the env/mpiname dir). However, no error
messages are printed by make and the build completes without complaint.
So I am not sure why mpiname does not compile and why make install still
tries to install it ...

/usr/bin/install -c  mpiname/mpiname /sw/mcm/app/vlad/mpi/C07/mvapich2/1.2/bin/mpiname
/usr/bin/install: Aufruf von stat für „mpiname/mpiname“ nicht möglich: Datei oder Verzeichnis nicht gefunden
make[1]: *** [install] Fehler 1
make[1]: Leaving directory `/sw/mcm/app/vlad/mpi/C07/mvapich2/1.2-src/src/env'
make: *** [install] Fehler 2
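
(One way to narrow this down: check whether the binary was ever produced and,
if not, re-run make just in that directory so the real compile error is not
buried in the overall build output. Paths are relative to the build tree, as
in the log above; this assumes the Makefile in src/env covers the mpiname
subdirectory:)

  ls -l src/env/mpiname/mpiname   # does the binary exist at all?
  cd src/env && make              # rebuild just this part to surface the actual error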

--
----------------------------------------------------------------------------
Dr. Vlad Cojocaru

EML Research gGmbH
Schloss-Wolfsbrunnenweg 33
69118 Heidelberg

Tel: ++49-6221-533266
Fax: ++49-6221-533298

e-mail:Vlad.Cojocaru[at]eml-r.villa-bosch.de

http://projects.villa-bosch.de/mcm/people/cojocaru/

----------------------------------------------------------------------------
EML Research gGmbH
Amtgericht Mannheim / HRB 337446
Managing Partner: Dr. h.c. Klaus Tschira
Scientific and Managing Director: Prof. Dr.-Ing. Andreas Reuter
http://www.eml-r.org
----------------------------------------------------------------------------


_______________________________________________
mvapich-discuss mailing list
mvapich-discuss at cse.ohio-state.edu
http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss








-- 
----------------------------------------------------------------------------
Dr. Vlad Cojocaru
 
EML Research gGmbH
Schloss-Wolfsbrunnenweg 33
69118 Heidelberg
 
Tel: ++49-6221-533266
Fax: ++49-6221-533298
 
e-mail:Vlad.Cojocaru[at]eml-r.villa-bosch.de
 
http://projects.villa-bosch.de/mcm/people/cojocaru/
 
----------------------------------------------------------------------------
EML Research gGmbH
Amtgericht Mannheim / HRB 337446
Managing Partner: Dr. h.c. Klaus Tschira
Scientific and Managing Director: Prof. Dr.-Ing. Andreas Reuter
http://www.eml-r.org
----------------------------------------------------------------------------
 