[mvapich-discuss] compile charm++ and namd with mvapich 1.0.1 and/or mvapich2

Vlad Cojocaru Vlad.Cojocaru at eml-r.villa-bosch.de
Fri Aug 15 08:04:26 EDT 2008


Thanks, Mehdi, for all the details,

I guess you mean gcc when you say gfortran ... NAMD is not written in 
Fortran but in Charm++, which is an adaptation of C++ ...

Well, we have Debian here, so we used Debian packages to install the 
InfiniBand libs and headers (our sys administrator did that). Then I 
tried to compile mvapich 1.0.1 and found that I needed to change the 
make.mvapich.gen2 file drastically in order to get it to build (the 
defaults for $IBHOME are very strange ... we have everything in 
/usr/include/infiniband and /usr/lib/infiniband). In the end I managed 
to get it built, but NAMD hangs ...
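
For anyone who runs into the same thing, the change amounts to pointing 
the script at the Debian locations. A rough sketch of the values I mean 
(apart from $IBHOME, which the script really does use, the variable names 
and the install prefix below are assumptions, so check them against your 
copy of make.mvapich.gen2):

    # values set near the top of make.mvapich.gen2 for our Debian layout
    IBHOME=/usr                    # verbs headers are under /usr/include/infiniband
    IBHOME_LIB=/usr/lib            # assumed variable name; point it at wherever libibverbs lives
    PREFIX=/opt/mvapich-1.0.1      # hypothetical install prefix
    CC=icc; CXX=icpc; F77=ifort; F90=ifort   # we build with the Intel 10.1 compilers

    ./make.mvapich.gen2            # afterwards, run the script as usual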

So I decided to try mvapich2 (the 1.2rc1 version) and ran into lots of 
problems. Some of them I could fix, but some are very strange. For 
instance, the entire source tree contains lots of references to strange 
directories like /home/daffy ... or /home/7 ... and so on. Some of them 
I replaced with ${master_top_srcdir}, since I figured out that is what 
they should be, but for others I don't know ... Also, when I tried to 
build with shared libs, make is not able to build the mpiname 
application ... I could not figure out why ...
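
In case it helps the developers track them down, this is roughly how I 
located the stray references (a sketch; the path fragments are just the 
ones I noticed, and the sed line is only an illustration of the 
substitution, not something to run over the whole tree blindly):

    cd mvapich2-1.2rc1                          # unpacked source tree
    grep -rn -e '/home/daffy' -e '/home/7' .    # list every file still referring to the packager's home dirs
    # where the reference is clearly meant to be the source tree itself,
    # the fix is a substitution along these lines, checked hit by hit:
    # sed -i 's|/home/daffy[^" ]*|${master_top_srcdir}|g' path/to/offending/Makefile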

So, lots of problems ... I'll try to figure them out ... However, the 
problems with mvapich2 look more like bugs in the Makefiles, so maybe 
somebody would like to fix those ...

Cheers
vlad


Mehdi Bozzo-Rey wrote:
>
> Hi Vlad,
>
>  
>
> No, I did not use the intel compilers (not yet). I used gfortran. More 
> precisely:
>
>  
>
> OS:
>
>  
>
> RHEL 5.1 (Kernel 2.6.18-53.el5)
>
>  
>
> [mbozzore at tyan04 ~]$ mpicc --version
>
> gcc (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
>
>  
>
> [mbozzore at tyan04 ~]$ mpicxx --version
>
> g++ (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
>
>  
>
> [mbozzore at tyan04 ~]$ mpif77 --version
>
> GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
>
>  
>
> [mbozzore at tyan04 ~]$ mpif90 --version
>
> GNU Fortran (GCC) 4.1.2 20071124 (Red Hat 4.1.2-42)
>
>  
>
> Hardware: Intel quads for the nodes, Topspin switch and HCAs for IB.
>
>  
>
>  
>
> Yes, I used OFED (1.3).
>
>  
>
> I did not enable sharedlibs for that build.
>
>  
>
> I will double-check, but if I remember correctly, everything was fine 
> (compilation-wise) on the mvapich2 side. What version did you use?
>
>  
>
> Cheers,
>
>  
>
> Mehdi
>
>  
>
> Mehdi Bozzo-Rey
> Open Source Solution Developer
> Platform computing
> Phone : +1 905 948 4649
> E-mail : mbozzore at platform.com
>
>  
>
>  
>
>  
>
> *From:* Cojocaru,Vlad [mailto:vlad.cojocaru at eml-r.villa-bosch.de]
> *Sent:* August-14-08 4:35 PM
> *To:* Mehdi Bozzo-Rey; mvapich-discuss at cse.ohio-state.edu
> *Subject:* RE: [mvapich-discuss] compile charm++ and namd with mvapich 
> 1.0.1 and/or mvapich2
>
>  
>
> Hi Mehdi,
>
> Did you use Intel 10.1 as well? Did you build on OpenFabrics? What 
> compiler flags did you pass to the mvapich build? Did you build with 
> --enable-sharedlib or without? I would be grateful if you could give me 
> some details of how you built mvapich.
> Thanks for the reply. Yes, there is something about the compilation of 
> mvapich. As I said, I successfully compiled NAMD on a cluster that 
> already had mvapich compiled with Intel as the default MPI lib. However, 
> on the new cluster (quad-core AMD Opterons with Mellanox InfiniBand) 
> I got these problems. So it's definitely the mvapich build that fails, 
> although I don't get any errors from make.
>
> Any idea why the mpiname application fails to compile when building 
> mvapich2?
>
> Thanks again
>
> Best wishes
> vlad
>
>
> -----Original Message-----
> From: Mehdi Bozzo-Rey [mailto:mbozzore at platform.com]
> Sent: Thu 8/14/2008 7:20 PM
> To: Cojocaru,Vlad; mvapich-discuss at cse.ohio-state.edu
> Subject: RE: [mvapich-discuss] compile charm++ and namd with mvapich 
> 1.0.1 and/or mvapich2
>
> Hello Vlad,
>
>
> I just recompiled NAMD and it looks OK to me (output of a simple test 
> below). I guess the problem is on the compilation side.
>
> Best regards,
>
> Mehdi
>
> Mehdi Bozzo-Rey
> Open Source Solution Developer
> Platform computing
> Phone : +1 905 948 4649
> E-mail : mbozzore at platform.com
>
>
> [mbozzore at tyan04 Linux-amd64-MPI]$ mpirun_rsh -np 8 -hostfile 
> ./hosts.8 ./namd2 src/alanin
> Charm++> Running on MPI version: 1.2 multi-thread support: 1/1
> Charm warning> Randomization of stack pointer is turned on in Kernel, 
> run 'echo 0 > /proc/sys/kernel/randomize_va_space' as root to disable 
> it. Thread migration may not work!
> Info: NAMD 2.6 for Linux-amd64-MPI
> Info:
> Info: Please visit http://www.ks.uiuc.edu/Research/namd/
> Info: and send feedback or bug reports to namd at ks.uiuc.edu
> Info:
> Info: Please cite Phillips et al., J. Comp. Chem. 26:1781-1802 (2005)
> Info: in all publications reporting results obtained with NAMD.
> Info:
> Info: Based on Charm++/Converse 50914 for 
> mpi-linux-x86_64-gfortran-smp-mpicxx
> Info: Built Thu Aug 14 13:12:02 EDT 2008 by mbozzore on 
> tyan04.lsf.platform.com
> Info: 1 NAMD  2.6  Linux-amd64-MPI  8    compute-00-00.ocs5.org  mbozzore
> Info: Running on 8 processors.
> Info: 8208 kB of memory in use.
> Info: Memory usage based on mallinfo
> Info: Changed directory to src
> Info: Configuration file is alanin
> TCL: Suspending until startup complete.
> Info: SIMULATION PARAMETERS:
> Info: TIMESTEP               0.5
> Info: NUMBER OF STEPS        9
> Info: STEPS PER CYCLE        3
> Info: LOAD BALANCE STRATEGY  Other
> Info: LDB PERIOD             600 steps
> Info: FIRST LDB TIMESTEP     15
> Info: LDB BACKGROUND SCALING 1
> Info: HOM BACKGROUND SCALING 1
> Info: MAX SELF PARTITIONS    50
> Info: MAX PAIR PARTITIONS    20
> Info: SELF PARTITION ATOMS   125
> Info: PAIR PARTITION ATOMS   200
> Info: PAIR2 PARTITION ATOMS  400
> Info: MIN ATOMS PER PATCH    100
> Info: INITIAL TEMPERATURE    0
> Info: CENTER OF MASS MOVING INITIALLY? NO
> Info: DIELECTRIC             1
> Info: EXCLUDE                SCALED ONE-FOUR
> Info: 1-4 SCALE FACTOR       0.4
> Info: NO DCD TRAJECTORY OUTPUT
> Info: NO EXTENDED SYSTEM TRAJECTORY OUTPUT
> Info: NO VELOCITY DCD OUTPUT
> Info: OUTPUT FILENAME        output
> Info: BINARY OUTPUT FILES WILL BE USED
> Info: NO RESTART FILE
> Info: SWITCHING ACTIVE
> Info: SWITCHING ON           7
> Info: SWITCHING OFF          8
> Info: PAIRLIST DISTANCE      9
> Info: PAIRLIST SHRINK RATE   0.01
> Info: PAIRLIST GROW RATE     0.01
> Info: PAIRLIST TRIGGER       0.3
> Info: PAIRLISTS PER CYCLE    2
> Info: PAIRLISTS ENABLED
> Info: MARGIN                 1
> Info: HYDROGEN GROUP CUTOFF  2.5
> Info: PATCH DIMENSION        12.5
> Info: CROSSTERM ENERGY INCLUDED IN DIHEDRAL
> Info: TIMING OUTPUT STEPS    15
> Info: USING VERLET I (r-RESPA) MTS SCHEME.
> Info: C1 SPLITTING OF LONG RANGE ELECTROSTATICS
> Info: PLACING ATOMS IN PATCHES BY HYDROGEN GROUPS
> Info: RANDOM NUMBER SEED     1218734148
> Info: USE HYDROGEN BONDS?    NO
> Info: COORDINATE PDB         alanin.pdb
> Info: STRUCTURE FILE         alanin.psf
> Info: PARAMETER file: XPLOR format! (default)
> Info: PARAMETERS             alanin.params
> Info: USING ARITHMETIC MEAN TO COMBINE L-J SIGMA PARAMETERS
> Info: SUMMARY OF PARAMETERS:
> Info: 61 BONDS
> Info: 179 ANGLES
> Info: 38 DIHEDRAL
> Info: 42 IMPROPER
> Info: 0 CROSSTERM
> Info: 21 VDW
> Info: 0 VDW_PAIRS
> Info: ****************************
> Info: STRUCTURE SUMMARY:
> Info: 66 ATOMS
> Info: 65 BONDS
> Info: 96 ANGLES
> Info: 31 DIHEDRALS
> Info: 32 IMPROPERS
> Info: 0 CROSSTERMS
> Info: 0 EXCLUSIONS
> Info: 195 DEGREES OF FREEDOM
> Info: 55 HYDROGEN GROUPS
> Info: TOTAL MASS = 783.886 amu
> Info: TOTAL CHARGE = 8.19564e-08 e
> Info: *****************************
> Info: Entering startup phase 0 with 8208 kB of memory in use.
> Info: Entering startup phase 1 with 8208 kB of memory in use.
> Info: Entering startup phase 2 with 8208 kB of memory in use.
> Info: Entering startup phase 3 with 8208 kB of memory in use.
> Info: PATCH GRID IS 1 BY 1 BY 1
> Info: REMOVING COM VELOCITY 0 0 0
> Info: LARGEST PATCH (0) HAS 66 ATOMS
> Info: CREATING 11 COMPUTE OBJECTS
> Info: Entering startup phase 4 with 8208 kB of memory in use.
> Info: Entering startup phase 5 with 8208 kB of memory in use.
> Info: Entering startup phase 6 with 8208 kB of memory in use.
> Measuring processor speeds... Done.
> Info: Entering startup phase 7 with 8208 kB of memory in use.
> Info: CREATING 11 COMPUTE OBJECTS
> Info: NONBONDED TABLE R-SQUARED SPACING: 0.0625
> Info: NONBONDED TABLE SIZE: 705 POINTS
> Info: ABSOLUTE IMPRECISION IN FAST TABLE ENERGY: 3.38813e-21 AT 7.99609
> Info: RELATIVE IMPRECISION IN FAST TABLE ENERGY: 1.27241e-16 AT 7.99609
> Info: ABSOLUTE IMPRECISION IN FAST TABLE FORCE: 6.77626e-21 AT 7.99609
> Info: RELATIVE IMPRECISION IN FAST TABLE FORCE: 1.1972e-16 AT 7.99609
> Info: Entering startup phase 8 with 8208 kB of memory in use.
> Info: Finished startup with 8208 kB of memory in use.
> ETITLE:      TS           BOND          ANGLE          DIHED          
> IMPRP               ELECT            VDW       BOUNDARY           
> MISC        KINETIC               TOTAL           TEMP         
> TOTAL2         TOTAL3        TEMPAVG
>
> ENERGY:       0         0.0050         0.4192         0.0368         
> 0.4591           -210.1610         1.0506         0.0000         
> 0.0000         0.0000           -208.1904         0.0000      
> -208.1877      -208.1877         0.0000
>
> ENERGY:       1         0.0051         0.4196         0.0367         
> 0.4585           -210.1611         1.0184         0.0000         
> 0.0000         0.0325           -208.1905         0.1675      
> -208.1878      -208.1877         0.1675
>
> ENERGY:       2         0.0058         0.4208         0.0365         
> 0.4568           -210.1610         0.9219         0.0000         
> 0.0000         0.1285           -208.1907         0.6632      
> -208.1881      -208.1877         0.6632
>
> ENERGY:       3         0.0092         0.4232         0.0361         
> 0.4542           -210.1599         0.7617         0.0000         
> 0.0000         0.2845           -208.1910         1.4683      
> -208.1885      -208.1878         1.4683
>
> ENERGY:       4         0.0176         0.4269         0.0356         
> 0.4511           -210.1565         0.5386         0.0000         
> 0.0000         0.4952           -208.1914         2.5561      
> -208.1890      -208.1878         2.5561
>
> ENERGY:       5         0.0327         0.4327         0.0350         
> 0.4480           -210.1489         0.2537         0.0000         
> 0.0000         0.7552           -208.1917         3.8977      
> -208.1894      -208.1879         3.8977
>
> ENERGY:       6         0.0552         0.4409         0.0343         
> 0.4454           -210.1354        -0.0915         0.0000         
> 0.0000         1.0592           -208.1920         5.4666      
> -208.1898      -208.1880         5.4666
>
> ENERGY:       7         0.0839         0.4522         0.0334         
> 0.4440           -210.1137        -0.4951         0.0000         
> 0.0000         1.4031           -208.1922         7.2418      
> -208.1900      -208.1882         7.2418
>
> ENERGY:       8         0.1162         0.4674         0.0325         
> 0.4448           -210.0822        -0.9550         0.0000         
> 0.0000         1.7839           -208.1923         9.2074      
> -208.1902      -208.1883         9.2074
>
> ENERGY:       9         0.1492         0.4870         0.0315         
> 0.4485           -210.0391        -1.4687         0.0000         
> 0.0000         2.1990           -208.1925        11.3497      
> -208.1905      -208.1884        11.3497
>
> WRITING EXTENDED SYSTEM TO OUTPUT FILE AT STEP 9
> WRITING COORDINATES TO OUTPUT FILE AT STEP 9
> WRITING VELOCITIES TO OUTPUT FILE AT STEP 9
> ==========================================
> WallClock: 4.172574  CPUTime: 4.167367  Memory: 8208 kB
> End of program
>
>
>
>
>
> -----Original Message-----
> From: mvapich-discuss-bounces at cse.ohio-state.edu 
> [mailto:mvapich-discuss-bounces at cse.ohio-state.edu] On Behalf Of Vlad 
> Cojocaru
> Sent: August-14-08 11:32 AM
> To: mvapich-discuss at cse.ohio-state.edu
> Subject: [mvapich-discuss] compile charm++ and namd with mvapich 
> 1.0.1 and/or mvapich2
>
> Dear mvapich users,
>
> I tried to compile mvapich 1.0.1, charm++ and namd on our new Linux-amd64
> InfiniBand cluster using the Intel 10.1.015 compilers. I managed to build
> mvapich 1.0.1 and tested the programs in the /examples directory. Then I
> built charm++ and tested it with "mpirun_rsh -n 2" ... all tests passed
> correctly. Then I built namd on top of mvapich 1.0.1 and charm++.
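>
> For completeness, the sequence was roughly the following (a sketch from
> memory rather than a literal transcript: the Charm++ build options, the
> hostfile name and the paths are assumptions, and the NAMD config line
> leaves out the tcl/fftw options):
>
>    ./make.mvapich.gen2                            # after fixing the IB paths in the script
>    cd charm && ./build charm++ mpi-linux-x86_64 -O -DCMK_OPTIMIZE=1
>    cd mpi-linux-x86_64/tests/charm++/megatest && make pgm
>    mpirun_rsh -np 2 -hostfile ./hosts.2 ./pgm     # hosts.2 is a hypothetical two-host file
>    cd /path/to/NAMD_2.6_Source                    # with Make.charm pointing at the charm build above
>    ./config Linux-amd64-MPI && cd Linux-amd64-MPI && make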
>
> Everything seemed OK, except that the namd executable hangs without error
> messages. In fact, it appears as if it is still running, but it doesn't
> produce any output. If I repeat exactly the same procedure with OpenMPI
> instead of mvapich, everything works fine ... (however, I am not so happy
> about the scaling of OpenMPI on InfiniBand).
>
> Does anyone have experience with installing namd using mvapich 1.0.1? If
> yes, any idea why this happens? I must say that when I did the same on
> another cluster, which had mvapich 1.0.1 already compiled with the Intel
> compilers, everything worked out correctly. So, it must be something
> about the compilation of mvapich 1.0.1 on our new InfiniBand setup that
> creates the problem.
>
> The German in the error below simply says that the executable "mpiname" 
> was not found.
>
> Best wishes
> vlad
>
> ----------------------------------error------------------------------------------------------------------------
> I also tried mvapich2, but make install fails when installing the mpiname
> application (see error below), which apparently does not get compiled
> (no executable is found in the env/mpiname dir). However, no error messages
> are printed by make and the build appears to complete correctly, so I am
> not sure why mpiname is not built and yet make install still tries to
> install it ...
>
> /usr/bin/install -c  mpiname/mpiname
> /sw/mcm/app/vlad/mpi/C07/mvapich2/1.2/bin/mpiname
> /usr/bin/install: Aufruf von stat für „mpiname/mpiname“ nicht möglich:
> Datei oder Verzeichnis nicht gefunden
> make[1]: *** [install] Fehler 1
> make[1]: Leaving directory
> `/sw/mcm/app/vlad/mpi/C07/mvapich2/1.2-src/src/env'
> make: *** [install] Fehler 2
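>
> (A check that should narrow this down, using the directory from the log
> above: re-running make only in src/env should surface whatever error the
> top-level build swallowed, and the ls shows whether the binary was ever
> produced. I don't know whether there is a dedicated make target for
> mpiname, so this is just plain make in that directory:
>
>    cd /sw/mcm/app/vlad/mpi/C07/mvapich2/1.2-src/src/env
>    make                     # rebuild only the env tools; watch for the first real error
>    ls -l mpiname/mpiname    # does the executable actually exist?
> )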
>
> --
> ----------------------------------------------------------------------------
> Dr. Vlad Cojocaru
>
> EML Research gGmbH
> Schloss-Wolfsbrunnenweg 33
> 69118 Heidelberg
>
> Tel: ++49-6221-533266
> Fax: ++49-6221-533298
>
> e-mail:Vlad.Cojocaru[at]eml-r.villa-bosch.de
>
> http://projects.villa-bosch.de/mcm/people/cojocaru/
>
> ----------------------------------------------------------------------------
> EML Research gGmbH
> Amtgericht Mannheim / HRB 337446
> Managing Partner: Dr. h.c. Klaus Tschira
> Scientific and Managing Director: Prof. Dr.-Ing. Andreas Reuter
> http://www.eml-r.org
> ----------------------------------------------------------------------------
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>

-- 
----------------------------------------------------------------------------
Dr. Vlad Cojocaru

EML Research gGmbH
Schloss-Wolfsbrunnenweg 33
69118 Heidelberg

Tel: ++49-6221-533266
Fax: ++49-6221-533298

e-mail:Vlad.Cojocaru[at]eml-r.villa-bosch.de

http://projects.villa-bosch.de/mcm/people/cojocaru/

----------------------------------------------------------------------------
EML Research gGmbH
Amtgericht Mannheim / HRB 337446
Managing Partner: Dr. h.c. Klaus Tschira
Scientific and Managing Director: Prof. Dr.-Ing. Andreas Reuter
http://www.eml-r.org
----------------------------------------------------------------------------

