[mvapich-discuss] Error 'End of file reached on hostfile'

Dhabaleswar Panda panda at cse.ohio-state.edu
Wed Mar 5 21:29:37 EST 2008


Hi,

Thanks for your note. This error message is printed when more processes
are requested than there are entries in the hostfile.
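
As an illustration only, the condition can be checked before launching the
job. The sketch below is not taken from the mvapich sources: the function
names are hypothetical, and it assumes that every non-blank line in the
hostfile counts as one entry, so a hostname listed twice counts twice.

```python
def count_hostfile_entries(lines):
    """Count non-blank lines; each is assumed to be one hostfile entry."""
    return sum(1 for line in lines if line.strip())


def check_hostfile(lines, nprocs):
    """Return (ok, entries); ok is False when nprocs exceeds the entries."""
    entries = count_hostfile_entries(lines)
    return entries >= nprocs, entries


if __name__ == "__main__":
    # Hostfile contents as quoted in the report further down in this message.
    hosts = ["wn152\n", "wn152\n", "wn153\n", "wn153\n"]
    ok, entries = check_hostfile(hosts, 4)
    print("entries:", entries, "enough for 4 procs:", ok)
```

If such a check reports fewer entries than the requested number of
processes, the hostfile actually being read is probably not the one you
expect.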

How is `mpirun_rsh' invoked in the two cases? Are the invocations the same
or different?

It looks like PBS Pro 8 might not be scheduling processes when the node
name is listed multiple times in the hostfile. Is there any option in
PBS to disable this? Have you checked with the PBS folks regarding this?
Unfortunately, we do not have access to PBS Pro, so we are not able to
reproduce it.

DK

On Wed, 5 Mar 2008, Pawel Dziekonski wrote:

> hi,
>
> i'm using mvapich-0.9.9-1458 (the one that comes with OFED 1.2.5.5) and it
> emits the error 'End of file reached on hostfile at 2 of 4 hostnames' when
> the machinefile contains the same hostname more than once. this happens
> only for some applications, like CPMD or Amber; basic tests or HPL
> work ok. the machinefile is generated by the PBS Pro 8 queueing system and
> looks very simple, e.g.:
>
> wn152
> wn152
> wn153
> wn153
>
> when a job is enqueued with a hard requirement for 4 cpus on 4 different
> nodes (nodes=4:ppn=1), then the generated machinefile looks like:
>
> wn152
> wn153
> wn154
> wn155
>
> and jobs run perfectly well.
>
> is it a bug or feature? ;)
> any way to avoid this?
>
> thanks in advance, P
> --
> Pawel Dziekonski <pawel.dziekonski at wcss.pl>
> Wroclaw Centre for Networking & Supercomputing, HPC Department
> Politechnika Wr., pl. Grunwaldzki 9, bud. D2/101, 50-377 Wroclaw, POLAND
> phone: +48 71 3202043, fax: +48 71 3225797, http://www.wcss.wroc.pl


