[mvapich-discuss] error closing socket at end of mpirun_rsh
original posted Oct 11
Dhabaleswar Panda
panda at cse.ohio-state.edu
Sat Jan 12 09:24:50 EST 2008
Hi Scott,
As we discussed off-line, you have access to a solution to this problem.
Let us know how it works. This solution is also being made available in the
enhanced and strengthened mpirun_rsh of the mvapich 1.0 release.
Thanks,
DK
On Wed, 9 Jan 2008, Scott Shaw wrote:
> Hi,
> On several clusters we are experiencing the same issue originally
> posted on Oct 11, 2007 regarding "error closing socket at end of
> mpirun_rsh". Running the MPI test with one core works and generates no
> error, but running with two or more cores generates the error.
>
> Is there a patch available which addresses the "Termination socket read
> failed" error message? I have tested three different clusters and each
> cluster exhibits the same error. I also checked the "mvapich-discuss"
> archives and still did not see a resolution.
>
> I am currently running mvapich v0.9.9 which is bundled with ofed v1.2.
>
> r1i0n0 /store/sshaw> mpirun_rsh -np 1 -hostfile ./hfile ./mpi_test
> Rank=0 present and calling MPI_Finalize
> Rank=0 bailing, nicely
>
> r1i0n0 /store/sshaw> mpirun_rsh -np 2 -hostfile ./hfile ./mpi_test
> Rank=1 present and calling MPI_Finalize
> Rank=0 present and calling MPI_Finalize
> Rank=0 bailing, nicely
> Termination socket read failed: Bad file descriptor
> Rank=1 bailing, nicely
>
> r1i0n0 /store/sshaw> mpirun_rsh -np 4 -hostfile ./hfile ./mpi_test
> Rank=1 present and calling MPI_Finalize
> Rank=3 present and calling MPI_Finalize
> Rank=0 present and calling MPI_Finalize
> Rank=2 present and calling MPI_Finalize
> Rank=0 bailing, nicely
> Termination socket read failed: Bad file descriptor
> Rank=3 bailing, nicely
> Rank=1 bailing, nicely
> Rank=2 bailing, nicely
>
> Thanks,
> Scott
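
For readers unfamiliar with the message in the transcripts above: "Bad file descriptor" is the standard system text for the EBADF error, which a read returns when the descriptor it is given has already been closed or was never valid. The sketch below reproduces that condition in isolation. It is only an illustration under assumptions: the function name read_after_close is hypothetical, the "Termination socket read failed" prefix is copied from the log output, and nothing here is taken from the mpirun_rsh source, so it does not claim to show the actual code path that fails in mvapich 0.9.9.

```python
import errno
import os
import socket


def read_after_close() -> OSError:
    """Close one end of a socket pair, then read from its stale descriptor.

    Hypothetical helper for illustration only; not mpirun_rsh code.
    """
    left, right = socket.socketpair()
    fd = left.fileno()   # remember the raw descriptor number
    left.close()         # the descriptor is no longer valid after this
    try:
        os.read(fd, 1)   # reading a closed descriptor fails with EBADF
    except OSError as exc:
        return exc
    finally:
        right.close()    # always release the other end of the pair
    raise AssertionError("read on a closed descriptor unexpectedly succeeded")


err = read_after_close()
# Format the failure the same way the log lines in this thread appear.
print(f"Termination socket read failed: {os.strerror(err.errno)}")
```

If the launcher closes its termination socket before its final read on it, a read of this kind would fail in the same way, which would be consistent with the ranks still finishing cleanly while the message is printed. That ordering is an assumption, not something confirmed in this thread.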