[mvapich-discuss] Question about launch times with mpd

Craig Tierney Craig.Tierney at noaa.gov
Mon Jun 30 13:34:16 EDT 2008


Chris Worley wrote:
> Craig,
> 
> I've run into this before w/ authentication mechanisms... which is why
> I always use local password files.
> 
> Specifically, I was using YP (NIS), and its scaling was so bad that ~2000
> processes would cause ssh timeouts and the job would die.
> 
> You're not seeing anything that bad, and I doubt you're using YP, but
> try local password files (and matching entries in /etc/nsswitch.conf).
> 
> Also remove name-server resolution: put everything in /etc/hosts and
> use no resolv.conf.
> 
> Hope this helps,
> 

We authenticate with local files (/etc/passwd) and ssh keys.
If this were the problem, why would my own code (which replicates what
pdsh does, but runs serially) take only 2 seconds?
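
For reference, that serial test was roughly like the following sketch (the
node list file and the remote command are placeholders, not the exact script
I ran):

-----------------------
#!/bin/bash
# Rough sketch: contact every node over ssh, one at a time, and time
# the whole sweep.  NODEFILE is a placeholder for the job's machine file.
NODEFILE=${NODEFILE:-machines}

start=$(date +%s)
while read -r node; do
    ssh "$node" /bin/true      # trivial remote command, like pdsh would run
done < "$NODEFILE"
end=$(date +%s)
echo "serial ssh sweep: $((end - start)) seconds"
-----------------------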

I also did some timing around MPI_Init (which I couldn't finish because
users wanted the system back), and most of the time seems to be spent
there.  I also ran a quick test in which mpiexec launched a csh script,
and that completed very quickly.
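
That test was along these lines (a sketch with placeholder file names;
/bin/hostname stands in for the do-nothing csh script):

-----------------------
#!/bin/bash
# Sketch: compare mpiexec launch time for a non-MPI program versus the
# tiny MPI test program.  MACHINE_FILE and mpi_hello are placeholders.
MACHINE_FILE=${MACHINE_FILE:-machines}

# Non-MPI target: just print the hostname and exit.
time mpiexec -machinefile "$MACHINE_FILE" /bin/hostname

# MPI test program (MPI_Init, print hostname, MPI_Finalize).
time mpiexec -machinefile "$MACHINE_FILE" ./mpi_hello
-----------------------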

Craig




> Chris
> On Mon, Jun 30, 2008 at 10:38 AM, Craig Tierney <Craig.Tierney at noaa.gov> wrote:
>> Sorry to follow up my own post, but I forgot to provide
>> a few bits of information:
>>
>> - This happens with MVAPICH2 1.0 and 1.0.2p1
>> - I am running OFED 1.2.5.1 and Intel ifort/icc (both 9.1 and 10.1)
>> - Red Hat 4.4
>> - Nodes are quad-core.  The mpd process is launched only
>>   once per node, but jobs are started with 4 CPUs per node.
>>
>> Craig
>>
>> Craig Tierney wrote:
>>> I am trying to benchmark some applications on my system
>>> and I have found something I did not expect with regards
>>> to launch time of applications.
>>>
>>> All jobs are launched through a batch system (Sun Grid Engine).
>>> SGE is configured without tight integration.  Since mpd has
>>> to be set up for each user in each job, I have a wrapper script
>>> that does roughly the following:
>>>
>>> -----------------------
>>> me=$(uname -n)
>>> port=$(mpd --ncpus=4 --echo --daemon --ifhn=${me}-ib0)
>>>
>>> for every other node
>>>    ssh $node mpd --ncpus=4 -h $me -p $port --daemon --ifhn=${node}-ib0 &
>>> end
>>> wait
>>>
>>> mpiexec -machinefile $machine_file $EXE
>>> -----------------------
>>>
>>> I am running an MPI program that does very little:
>>> it calls MPI_Init, writes the hostname, then calls MPI_Finalize.
>>>
>>> I measured the time it takes to launch the mpd processes, the time
>>> the program itself takes to run (from just after MPI_Init to just
>>> before MPI_Finalize), and the total time to execute mpiexec.
>>>
>>> Cores  mpd   complete  runjob
>>> ------------------------------
>>> 4     ~0.0    ~0.0      0.6
>>> 16     0.6    ~0.0      0.8
>>> 64     0.7    ~0.0      3.5
>>> 128    1.0     0.2     11.2
>>> 256    1.7     0.6     47.0
>>> 324    2.1     0.1     76.4
>>> 512    3.1     0.5    202.4
>>>
>>> - All timings are in seconds.
>>>
>>> mpd - the time to launch the mpd processes in parallel
>>> complete - the time to run the application
>>> runjob - the time to execute mpiexec
>>>
>>> My question is, why does mpd take so long to launch a job?
>>> Am I doing something wrong?  Is there something I can do
>>> to minimize the startup time?
>>>
>>> Thanks,
>>> Craig
>>>
>>>
>>>
>>
>> --
>> Craig Tierney (craig.tierney at noaa.gov)


-- 
Craig Tierney (craig.tierney at noaa.gov)

