[mvapich-discuss] charm++ / namd

Matthew Koop koop at cse.ohio-state.edu
Thu Jun 1 18:05:03 EDT 2006


Roland,

Are you able to run the OSU benchmarks, such as osu_bw and osu_latency,
or do you see the problem there as well? If the problem shows up there
too, a basic setup issue needs to be worked out first.

If you are able to run basic MPI programs, then we'll need to make sure
everything was configured properly for charm++.

Here's what I've done in the past to prepare charm++ for NAMD:

cd charm-5.9
cd ./src/arch

cp -r mpi-linux-amd64 mpi-linux-amd64-mvapich
cd mpi-linux-amd64-mvapich

* edit conv-mach.h and change the following (a scripted alternative is
sketched after this step):

#define CMK_MALLOC_USE_GNU_MALLOC                          1
#define CMK_MALLOC_USE_OS_BUILTIN                          0

to

#define CMK_MALLOC_USE_GNU_MALLOC                          0
#define CMK_MALLOC_USE_OS_BUILTIN                          1
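
If you prefer to script that edit, a sed invocation along these lines
should do it (assuming GNU sed; editing the file by hand works just as
well):

# rewrite the two malloc-related defines in place
sed -i \
  -e 's/^#define CMK_MALLOC_USE_GNU_MALLOC.*/#define CMK_MALLOC_USE_GNU_MALLOC 0/' \
  -e 's/^#define CMK_MALLOC_USE_OS_BUILTIN.*/#define CMK_MALLOC_USE_OS_BUILTIN 1/' \
  conv-mach.h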

* make sure the MVAPICH mpicc and mpiCC are first in your PATH. Otherwise,
add the full paths to the mpicc and mpiCC commands in conv-mach.sh, for
example as sketched below.
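
Roughly, the relevant lines in conv-mach.sh end up looking like the
following; /opt/mvapich/bin is only a placeholder for wherever your
MVAPICH installation lives, and the exact variable names can differ a
bit between charm versions, so check the file:

CMK_CC='/opt/mvapich/bin/mpicc '     # placeholder path to MVAPICH's mpicc
CMK_CXX='/opt/mvapich/bin/mpiCC '    # placeholder path to MVAPICH's mpiCC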

cd ../../..

./build charm++ mpi-linux-amd64-mvapich --no-build-shared

cd tests/charm++/simplearrayhello
make

[koop@bm1 simplearrayhello]$ mpirun_rsh -np 2 bm1 bm2 ./hello
Running Hello on 2 processors for 5 elements
Hello 0 created
Hello 2 created
Hello 4 created
Hi[17] from element 0
Hi[19] from element 2
Hello 1 created
Hello 3 created
Hi[21] from element 4
All done
Hi[18] from element 1
Hi[20] from element 3
End of program


Please let us know if this works for you or if you have any other
problems.

Thanks,

Matthew Koop
-
Network-Based Computing Lab
Ohio State University


On Thu, 1 Jun 2006, Roland Fehrenbacher wrote:

> Hi,
>
> I'm trying to get charm++ (http://charm.cs.uiuc.edu/, needed for NAMD)
> to work with mvapich 0.9.7. Everything compiles fine, but when I
> execute a test job like hello (from charm++) I get:
>
> $ mpiexec  -verbose -n 2 -comm mpich-ib ./hello
> mpiexec: resolve_exe: using absolute exe "./hello".
> mpiexec: concurrent_init: old master died, reusing his fifo as master.
> mpiexec: wait_task_start: start evt 2 task 0 on beo-15.
> mpiexec: wait_task_start: start evt 3 task 1 on beo-15.
> mpiexec: All 2 tasks started.
> read_ib_startup_ports: waiting for checkins
> read_ib_startup_ports: version 3 startup
> read_ib_startup_ports: rank 0 checked in, 1 left
> read_ib_startup_ports: rank 1 checked in, 0 left
> read_ib_startup_ports: barrier start
> mpiexec: Error: read_full: EOF, only 0 of 4 bytes.
>
> That is, the job never starts.
>
> $ mpirun_rsh  -np 2 -hostfile /var/spool/pbs/aux/84.beosrv-c ./hello
>
> doesn't give any output.
>
> A run with a single CPU works fine:
>
> $ mpiexec  -n 1 -comm mpich-ib ./hello
> Running Hello on 1 processors for 5 elements
> Hello 0 created
> Hello 1 created
> Hello 2 created
> Hello 3 created
> Hello 4 created
> Hi[17] from element 0
> Hi[18] from element 1
> Hi[19] from element 2
> Hi[20] from element 3
> Hi[21] from element 4
> All done
> End of program
>
> Any ideas?
>
> Thanks,
>
> Roland
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>







