[mvapich-discuss] Question about launch times with mpd

Craig Tierney Craig.Tierney at noaa.gov
Mon Jun 30 12:38:40 EDT 2008


Sorry to follow up on my own post, but I forgot to provide
a few bits of information:

- This happens with mvapich2 1.0 and 1.0.2p1
- I am running OFED 1.2.5.1 and Intel ifort/icc (both 9.1 and 10.1)
- Red Hat 4.4
- Nodes are quad-core.  The mpd process is launched only
once per node, but jobs are started using 4 cpus per node.

Craig

Craig Tierney wrote:
> I am trying to benchmark some applications on my system
> and I have found something I did not expect with regard
> to application launch times.
> 
> All jobs are launched through a batch system (Sun Gridengine).
> SGE is configured without tight integration.  Since mpd has
> to be set up for each user in each job, I have a wrapper script
> that does roughly the following:
> 
> -----------------------
> me=`uname -n`
> port=`mpd --ncpus=4 --echo --daemon --ifhn=$me-ib0`
> 
> # $other_nodes: the job's remaining hosts (placeholder name)
> for node in $other_nodes; do
>     ssh $node mpd --ncpus=4 -h $me -p $port --daemon --ifhn=$node-ib0 &
> done
> wait
> 
> mpiexec -machinefile $machine_file $EXE
> -----------------------
> 
> I am running an MPI program that does very little.
> It calls mpi_init, writes the hostname, then calls mpi_finalize.
> 
> I measured the time it takes to launch mpd, the time it
> takes for the program to execute (from after MPI_Init to just
> before MPI_Finalize), and the time to call MPI_Init.
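
How the timings were taken is not stated in the post. One way to
collect per-phase wall-clock times inside a wrapper script is sketched
below. The helper name time_phase is hypothetical, and the sketch
assumes GNU date (for the %N nanosecond field).

```shell
#!/bin/sh
# Hypothetical helper: run a command and report its wall-clock time.
# Assumes GNU date (%N gives nanoseconds); not part of the original wrapper.
time_phase() {
    label=$1; shift
    start=$(date +%s.%N)
    "$@"
    status=$?
    end=$(date +%s.%N)
    # Report on stderr so the command's own stdout is left untouched.
    awk -v l="$label" -v s="$start" -v e="$end" \
        'BEGIN { printf "%s: %.1f s\n", l, e - s }' >&2
    return $status
}

# Example with a stand-in command; in the wrapper this would be mpiexec.
time_phase runjob sleep 1
```

The helper passes the command's exit status through, so it can wrap the
mpiexec line without changing how the batch system sees the job result.
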
> 
> Cores    mpd   complete   runjob
> --------------------------------
>     4   ~0.0       ~0.0      0.6
>    16    0.6       ~0.0      0.8
>    64    0.7       ~0.0      3.5
>   128    1.0        0.2     11.2
>   256    1.7        0.6     47.0
>   324    2.1        0.1     76.4
>   512    3.1        0.5    202.4
> 
> - All timings are in seconds.
> 
> mpd - the time to launch the mpd processes in parallel
> complete - the time to run the application
> runjob - the time to execute mpiexec
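
The growth of the runjob column can be checked with a little
arithmetic. The sketch below uses only the numbers from the table and
computes the exponent k in T2/T1 = (N2/N1)^k between successive
doublings of the core count. The function name growth_exponent is made
up for this illustration. For the 128, 256, and 512 core runs the
exponent comes out close to 2, meaning the mpiexec time roughly
quadruples each time the core count doubles.

```shell
#!/bin/sh
# growth_exponent N1 T1 N2 T2
# Prints k such that T2/T1 = (N2/N1)^k, i.e. the log-log slope
# between two (cores, seconds) measurements.
growth_exponent() {
    awk -v n1="$1" -v t1="$2" -v n2="$3" -v t2="$4" \
        'BEGIN { printf "%.2f\n", log(t2 / t1) / log(n2 / n1) }'
}

# runjob timings from the table: cores, seconds
growth_exponent 64 3.5 128 11.2
growth_exponent 128 11.2 256 47.0
growth_exponent 256 47.0 512 202.4
```

An exponent near 1 would indicate launch time growing linearly with
the number of processes. A value near 2 suggests some per-process cost
that itself grows with job size.
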
> 
> My question is, why does mpd take so long to launch a job?
> Am I doing something wrong?  Is there something I can do
> to minimize the startup time?
> 
> Thanks,
> Craig
> 
> 
> 


-- 
Craig Tierney (craig.tierney at noaa.gov)

