[mvapich-discuss] mpiexec.hydra hangs
Igor Podladtchikov
igor.podladtchikov at spectraseis.com
Mon Jul 8 18:56:37 EDT 2013
So I try the prefix thing as AD user? Or as root?
Here's my program's ldd output:
ldd /spectraseis/share/applications/s6fd/current/s6fd
linux-vdso.so.1 => (0x00007fffd3bff000)
libdl.so.2 => /lib64/libdl.so.2 (0x0000003a1c400000)
libm.so.6 => /lib64/libm.so.6 (0x0000003a1d000000)
libs6fd.so => /spectraseis/share/applications/s6fd/current/libs6fd.so (0x00007f356c363000)
libcudart.so.4 => /spectraseis/share/applications/s6fd/current/lib/libcudart.so.4 (0x00007f356c108000)
libfdcoords.so => /spectraseis/share/applications/s6fd/current/lib/libfdcoords.so (0x00007f356bf03000)
libfddomain.so => /spectraseis/share/applications/s6fd/current/lib/libfddomain.so (0x00007f356bcfb000)
libfdfiles.so => /spectraseis/share/applications/s6fd/current/lib/libfdfiles.so (0x00007f356baf9000)
libfdindex.so => /spectraseis/share/applications/s6fd/current/lib/libfdindex.so (0x00007f356b8f6000)
libfdinjrec.so => /spectraseis/share/applications/s6fd/current/lib/libfdinjrec.so (0x00007f356b6f0000)
libfdlogger.so => /spectraseis/share/applications/s6fd/current/lib/libfdlogger.so (0x00007f356b4ed000)
libfdmat.so => /spectraseis/share/applications/s6fd/current/lib/libfdmat.so (0x00007f356b2e7000)
libfdmp.so => /spectraseis/share/applications/s6fd/current/lib/libfdmp.so (0x00007f356b0e2000)
libfdsolvers.so => /spectraseis/share/applications/s6fd/current/lib/libfdsolvers.so (0x00007f356aecf000)
libfdsplit.so => /spectraseis/share/applications/s6fd/current/lib/libfdsplit.so (0x00007f356acc9000)
libfdmon.so => /spectraseis/share/applications/s6fd/current/lib/libfdmon.so (0x00007f356aac1000)
libfdui.so => /spectraseis/share/applications/s6fd/current/lib/libfdui.so (0x00007f356a8a4000)
libfdtest.so => /spectraseis/share/applications/s6fd/current/lib/libfdtest.so (0x00007f356a69b000)
libfdpick.so => /spectraseis/share/applications/s6fd/current/lib/libfdpick.so (0x00007f356a495000)
libfdsolversgpu.so => /spectraseis/share/applications/s6fd/current/lib/libfdsolversgpu.so (0x00007f356a292000)
libfdsolversgpu_cuda.so => /spectraseis/share/applications/s6fd/current/lib/libfdsolversgpu_cuda.so (0x00007f356a08b000)
libfdel3dcpu.so => /spectraseis/share/applications/s6fd/current/lib/libfdel3dcpu.so (0x00007f3569e7c000)
libfdel3dgpu.so => /spectraseis/share/applications/s6fd/current/lib/libfdel3dgpu.so (0x00007f3569c76000)
libfdel3dgpu_cuda.so => /spectraseis/share/applications/s6fd/current/lib/libfdel3dgpu_cuda.so (0x00007f3569a1f000)
libfdac3dcpu.so => /spectraseis/share/applications/s6fd/current/lib/libfdac3dcpu.so (0x00007f3569819000)
libfdac3dgpu.so => /spectraseis/share/applications/s6fd/current/lib/libfdac3dgpu.so (0x00007f3569614000)
libfdac3dgpu_cuda.so => /spectraseis/share/applications/s6fd/current/lib/libfdac3dgpu_cuda.so (0x00007f35693eb000)
libspeclib.so => /spectraseis/share/applications/speclib/current/libspeclib.so (0x00007f35691ba000)
libpthread.so.0 => /lib64/libpthread.so.0 (0x0000003a1cc00000)
librt.so.1 => /lib64/librt.so.1 (0x0000003a1d800000)
libc.so.6 => /lib64/libc.so.6 (0x0000003a1c800000)
/lib64/ld-linux-x86-64.so.2 (0x0000003a1c000000)
libstdc++.so.6 => /usr/lib64/libstdc++.so.6 (0x0000003a23400000)
libgcc_s.so.1 => /lib64/libgcc_s.so.1 (0x0000003a20c00000)
Sorry about that.
Igor Podladtchikov
Spectraseis
1899 Wynkoop St, Suite 350
Denver, CO 80202
Tel. +1 303 658 9172 (direct)
Tel. +1 303 330 8296 (cell)
www.spectraseis.com
________________________________________
From: Jonathan Perkins
Sent: Monday, July 08, 2013 4:29 PM
To: Igor Podladtchikov
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] mpiexec.hydra hangs
Thanks for the reply. My responses are inline.
On Mon, Jul 08, 2013 at 10:10:34PM +0000, Igor Podladtchikov wrote:
> Hi Jonathan,
>
> thanks for your reply.
>
> which mpiexec.hydra
> /usr/local/bin/mpiexec.hydra
>
> which mpirun_rsh
> /usr/local/bin/mpirun_rsh
Okay, it looks like the mvapich2 installation is in /usr/local and
mpiexec.hydra likely comes from that installation.
> ldd /usr/local/bin/mpiexec.hydra
> <snip>
Actually can you provide the ldd output of the mpi program that you're
trying to run and not the launcher.
> On the working system, mpiexec.hydra gives the same output, but
> mpirun_rsh doesn't seem to have libstdc++.so.6 and libgcc_s.so.1..
> Also, if I do ldd mpirun_rsh as root, I don't get stdc++ and gcc_s
> either...
>
> The other thing that I noticed is, if I tell my program not to create
> a context on the GPUs, mpiexec.hydra still hangs, but doesn't produce
> all those messages if I cancel it with Ctrl+C. I am not setting
> MV2_USE_CUDA, so it shouldn't matter?
Since you're not using MV2_USE_CUDA I do not think that this will
matter.
> And the last thing you may need to know is, for some reason unknown to
> me, root on the compute node can't modify files generated by Active
> Directory users.. so I copied the tarball from the shared location
> (owned by active directory user) to root's home on the compute node,
> where I did configure; make; make install as root. It hangs for both
> root and AD user though.. on the node where everything is OK, I first
> tried to configure and make as AD user, then failed to make install
> because I can't modify /usr/... as AD user. I also failed to make
> install as root, because I couldn't modify some files in the make dir.
> So I ended up copying everything to root's home on the compute node
> and doing everything from there.. would that be a problem, you think?
Something funky could have happened here. This seems to be a gray area;
maybe you can try the process over again, this time setting the prefix to
something else, like:
./configure --prefix=/usr/local/mvapich2-1.9/ ...
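A minimal rebuild sketch along those lines (the tarball name and prefix are illustrative; adjust to your actual source directory):

```shell
# Rebuild MVAPICH2 into a separate prefix so the suspect
# /usr/local installation is left untouched.
tar xzf mvapich2-1.9.tgz
cd mvapich2-1.9
./configure --prefix=/usr/local/mvapich2-1.9
make
make install   # run as a user who can write to the prefix

# Point PATH at the new install before re-testing:
export PATH=/usr/local/mvapich2-1.9/bin:$PATH
which mpiexec.hydra   # should now report the new prefix
```

Keeping each build in its own versioned prefix also makes it easy to switch between installations by changing PATH alone.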
Then you can check to see if you have the same problem with this
installation.
> I never seem to have quite understood how mpirun_rsh works.. it gives
> me "incorrect number of arguments" when I do mpirun_rsh -n 7 <prog +
> args>.
You can try the following link as a primer:
http://mvapich.cse.ohio-state.edu/support/mvapich2-1.9-quick-start.html
Have you tried the osu-micro-benchmarks? See if you can run the
following command.
mpirun_rsh -n 2 localhost localhost /usr/local/libexec/mvapich2/osu_latency
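On the "incorrect number of arguments" error mentioned above: mpirun_rsh expects a hostname for each of the -n processes (or a hostfile) before the executable, which is likely what the 7-process invocation was missing. A sketch, with hostnames and program arguments purely illustrative:

```shell
# One hostname per process, listed before the executable:
mpirun_rsh -n 2 localhost localhost ./s6fd input.cfg

# Or supply a hostfile containing one host per line:
mpirun_rsh -n 7 -hostfile ./hosts ./s6fd input.cfg
```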
--
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo