[mvapich-discuss] charm++ / namd

Kevin Ball kball at pathscale.com
Fri Jun 2 12:59:36 EDT 2006


Roland,

  You mention that you are using the PathScale compilers for this.  I
think my colleague Les has already pointed you to the instructions we
have written up for NAMD and Charm++, but as a quicker thing to try,
could you edit conv-mach.sh so that CMK_SEQ_CC compiles at -O1 or -O0?
There is a known compiler problem in one of the serial pieces of NAMD,
and lowering the optimization level works around it.  This has no effect
on performance, as the code is well outside the critical path.
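
  Something along these lines is what I have in mind; treat it as a
sketch only, since the exact variable names, compiler names, and
existing flags in your copy of conv-mach.sh may differ, and adjust it
to match your file:

    # conv-mach.sh (in mpi-linux-amd64-mvapich if you followed
    # Matthew's steps): compile the serial pieces without optimization
    CMK_SEQ_CC="pathcc -O0 "
    # if your file also defines CMK_SEQ_CXX, lowering it the same way is safe
    CMK_SEQ_CXX="pathCC -O0 "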

  Please let me know if this solves your problem.  Thanks!

Kevin Ball
Senior Engineer
SIG, QLogic (formerly PathScale).

On Fri, 2006-06-02 at 04:32, Roland Fehrenbacher wrote:
> >>>>> "Matthew" == Matthew Koop <koop at cse.ohio-state.edu> writes:
> 
> Matthew,
> 
> many thanks for your quick reply.
> 
>     Matthew> Roland, Are you able to run the OSU benchmarks, such as
>     Matthew> osu_bw and osu_latency, or do you see the problem there
>     Matthew> as well? If the problem exists there a setup issue needs
>     Matthew> to be worked out first.
> 
> Yes, everything works fine for other programs.
> 
>     Matthew> If you are able to run basic MPI programs then we'll need
>     Matthew> to make sure everything was configured properly for
>     Matthew> charm.
> 
>     Matthew> Here's what I've done in the past to prepare charm++ for
>     Matthew> NAMD:
> 
>     Matthew> cd charm-5.9
>     Matthew> cd ./src/arch
> 
>     Matthew> cp -r mpi-linux-amd64 mpi-linux-amd64-mvapich
>     Matthew> cd mpi-linux-amd64-mvapich
> 
>     Matthew> * edit conv-mach.h and change:
> 
>     Matthew> #define CMK_MALLOC_USE_GNU_MALLOC 1
>     Matthew> #define CMK_MALLOC_USE_OS_BUILTIN 0
> 
>     Matthew> to
> 
>     Matthew> #define CMK_MALLOC_USE_GNU_MALLOC 0
>     Matthew> #define CMK_MALLOC_USE_OS_BUILTIN 1
> 
> This indeed solved the problem. My hello (simplearrayhello) test works
> now. However, I still have a segmentation fault when trying to run the
> pgm test:
> 
> $ mpiexec  -comm mpich-ib ./pgm 12 10
> Megatest is running on 8 processors.
> test 0: initiated [bitvector (jbooth)]
> mpiexec: Warning: task 0 died with signal 11 (Segmentation fault).
> mpiexec: Warning: tasks 1-7 died with signal 15 (Terminated).
> 
> I compiled everything with Pathscale 2.3 (including mvapich). Do you
> have any experience with Pathscale compilers and charm++? The mvapich
> I built works fine with other stuff.
> 
> Thanks again,
> 
> Roland
> 
>     Matthew> * make sure the MVAPICH mpicc and mpiCC are first in your
>     Matthew> path. Otherwise, add the full path to the mpicc and mpiCC
>     Matthew> commands in conv-mach.sh
> 
>     Matthew> cd ../../..
> 
>     Matthew> ./build charm++ mpi-linux-amd64-mvapich --no-build-shared
> 
>     Matthew> cd tests/charm++/simplearrayhello
>     Matthew> make
> 
>     Matthew> [koop at bm1 simplearrayhello]$ mpirun_rsh -np 2 bm1 bm2 ./hello
>     Matthew> Running Hello on 2 processors for 5 elements
>     Matthew> ...
> 
>     Matthew> Please let us know if this works for you or if you have
>     Matthew> any other problems.
> 
>     Matthew> Thanks,
> 
>     Matthew> Matthew Koop
>     Matthew> Network-Based Computing Lab, Ohio State University
> 
> 
>     Matthew> On Thu, 1 Jun 2006, Roland Fehrenbacher wrote:
> 
>     >> Hi,
>     >> 
>     >> I'm trying to get charm++ (http://charm.cs.uiuc.edu/, needed for
>     >> NAMD) to work with mvapich 0.9.7. While I can compile everything
>     >> fine, when executing a test job like hello (from charm++) I get:
>     >> 
>     >> $ mpiexec -verbose -n 2 -comm mpich-ib ./hello
>     >> mpiexec: resolve_exe: using absolute exe "./hello".
>     >> mpiexec: concurrent_init: old master died, reusing his fifo as master.
>     >> mpiexec: wait_task_start: start evt 2 task 0 on beo-15.
>     >> mpiexec: wait_task_start: start evt 3 task 1 on beo-15.
>     >> mpiexec: All 2 tasks started.
>     >> read_ib_startup_ports: waiting for checkins
>     >> read_ib_startup_ports: version 3 startup
>     >> read_ib_startup_ports: rank 0 checked in, 1 left
>     >> read_ib_startup_ports: rank 1 checked in, 0 left
>     >> read_ib_startup_ports: barrier start
>     >> mpiexec: Error: read_full: EOF, only 0 of 4 bytes.
>     >> 
>     >> That is, the job never starts.
>     >> 
>     >> $ mpirun_rsh -np 2 -hostfile /var/spool/pbs/aux/84.beosrv-c
>     >> ./hello
>     >> 
>     >> doesn't give any output.
>     >> 
>     >> A run with a single CPU works fine:
>     >> 
>     >> $ mpiexec -n 1 -comm mpich-ib ./hello
>     >> Running Hello on 1 processors for 5 elements
>     >> Hello 0 created
>     >> Hello 1 created
>     >> Hello 2 created
>     >> Hello 3 created
>     >> Hello 4 created
>     >> Hi[17] from element 0
>     >> Hi[18] from element 1
>     >> Hi[19] from element 2
>     >> Hi[20] from element 3
>     >> Hi[21] from element 4
>     >> All done
>     >> End of program
>     >> 
>     >> Any ideas?
>     >> 
>     >> Thanks,
>     >> 
>     >> Roland
>     >> 
> 
> 


