[mvapich-discuss] Problem with slow start in mpirun_rsh using mvapich1.1
Jaidev Sridhar
sridharj at cse.ohio-state.edu
Tue Dec 2 12:46:18 EST 2008
Hi Terrence,
This timeout means we failed to launch mpispawn on some node within a
reasonable amount of time. This could be due to ssh or other network
issues: some node isn't accepting ssh connections or is responding with
a large delay.
If you want a larger timeout, export MPIRUN_TIMEOUT (in seconds) before
running mpirun_rsh:
$ export MPIRUN_TIMEOUT=11111
$ mpirun_rsh ...
You can also try using rsh instead of ssh by passing the -rsh flag to
mpirun_rsh.
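To find out which node is slow or refusing connections, you can time a
trivial ssh command against each host in the hostfile. The following is
only a rough sketch, not part of mvapich: the function name check_hosts
and the 10-second default limit are made up for illustration, and it
assumes host.list has one hostname per line (first field), that
passwordless ssh is set up, and that your date supports +%s.

```shell
# Hypothetical helper (not an mvapich tool): time "ssh host true" for
# every host listed in a hostfile and report which ones fail or lag.
# Usage: check_hosts <hostfile> [connect_timeout_seconds]
check_hosts() {
    hostfile=$1
    limit=${2:-10}          # assumed default; pick what suits your cluster
    while read -r host _; do
        # Skip blank lines and comment lines.
        [ -z "$host" ] && continue
        case $host in \#*) continue ;; esac

        start=$(date +%s)
        # -n keeps ssh from swallowing the rest of the hostfile on stdin.
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if ssh -n -o BatchMode=yes -o ConnectTimeout="$limit" "$host" true 2>/dev/null; then
            status=ok
        else
            status=FAILED
        fi
        end=$(date +%s)

        printf '%s %s %ss\n' "$host" "$status" "$((end - start))"
    done < "$hostfile"
}
```

After defining the function in your shell, run it as
"check_hosts host.list 5". Hosts reported as FAILED, or with an elapsed
time much larger than the others, are the likely cause of the startup
timeout.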
-Jaidev
On Tuesday 02 December 2008 10:32 AM, Terrence.LIAO at total.com wrote:
>
> Dear mvapich,
>
> I have a slow start problem and do not know how to fix it. I am
> trying to run a very simple hello world using this typical command:
> mpirun_rsh -hostfile host.list -np 27 ./mpi_hello.exe on our IB cluster.
> From time to time it runs, but in most cases it gives me "Timeout
> during client startup". Any advice on fixing this problem? Is
> VIA_CM_TIMEOUT a parameter I should tune for this?
>
> Thank you very much.
>
> -- Terrence
> --------------------------------------------------------
> Terrence Liao, Ph.D.
> Research Computer Scientist
> TOTAL E&P RESEARCH & TECHNOLOGY USA, LLC
> 1201 Louisiana, Suite 1800, Houston, TX 77002
> Tel: 713.647.3498 Fax: 713.647.3638
> Email: terrence.liao at total.com