[mvapich-discuss] mpiexec and mpirun_rsh as non Root issues
Diego Humberto Kalegari
kalegari at lactec.org.br
Wed Aug 17 12:04:12 EDT 2011
Hello Jonathan
I shared the MVAPICH2 installation from dvse-cluster to all nodes, and I added its path to $PATH on all nodes.
This is the path where MVAPICH2 is installed -> /home/MPI/bin
echo $PATH
/home/l0626/bin:/home/MPI/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/usr/lib/mit/bin:/usr/lib/mit/sbin
l0626 at n01:~>
This is the ssh hostname result to some of my nodes
l0626 at n01:~> ssh n03 hostname
n03
l0626 at n02:~> ssh n03 hostname
n03
l0626 at n02:~>
Running the following command as root works:
mpirun_rsh -np 11 -hostfile hosts ./DETest 55 1000 10000 sequence.txt 10 0.8 0
but as l0626, for example, it does not (although if I run it on only a single node, it works). Below are the logs:
Fatal error in MPI_Init:
Other MPI error
Fatal error in MPI_Init:
Other MPI error
cannot create cq
cannot create cq
Fatal error in MPI_Init:
Other MPI error
This command also works as root:
mpirun -hosts n01:24,n03:24,n04:24,n05:24,n06:24,n07:24,n08:24,n09:24,n10:24,n11:24 -np 240 ./DETest 13 2300 1000 sequence.txt 230 0.8 0
but not as l0626. Below are the logs:
Initializing MPI
Initializing MPI
Fatal error in MPI_Init:
Other MPI error
Fatal error in MPI_Init:
Other MPI error
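The "cannot create cq" message means the InfiniBand completion queue could not be created, which in MVAPICH2 setups is commonly tied to the locked-memory limit being too small for regular users (root usually has a larger limit, which would match a job that works as root but fails as l0626). A rough way to compare that limit across the nodes (node names taken from the mpirun command above; adjust for your cluster) would be:

```shell
# Print the locked-memory limit (ulimit -l) for the current user,
# locally and on each compute node used in the mpirun command above.
# InfiniBand verbs generally need this to be large or "unlimited".
echo "local: $(ulimit -l)"
for node in n01 n03 n04 n05 n06 n07 n08 n09 n10 n11; do
    echo "$node: $(ssh "$node" 'ulimit -l')"
done
```

If the user's limit is small (e.g. 64), raising the memlock limit in /etc/security/limits.conf on every node and logging in again is a common fix.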
Thanks,
I really appreciate your help.
Diego
________________________________________
From: Jonathan Perkins [perkinjo at cse.ohio-state.edu]
Sent: Wednesday, August 17, 2011 11:52 AM
To: Diego Humberto Kalegari
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] mpiexec and mpirun_rsh as non Root issues
Hello,
To determine what may be wrong I'll just ask that you double check
a few things. Are you just trying to run a 2-process job? If so, are
you able to log in, as the user running mpiexec, on the host named
dvse-cluster and then `ssh second_hostname'? If so, have you installed
mvapich2 on all machines in the same location, or are you using a
shared filesystem?
If any of these checks fail, please send back the full failure message,
including the command that caused it.
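The checks above can be sketched as a few shell commands (a rough outline; `n03` and the hostfile `hosts` are taken from elsewhere in this thread and are assumptions about the cluster):

```shell
# 1. Confirm the same MVAPICH2 install is visible locally and over ssh
#    (either a shared filesystem or an identical install path per node).
command -v mpirun_rsh
ssh n03 'command -v mpirun_rsh'

# 2. Confirm passwordless, non-interactive ssh works as this user;
#    BatchMode makes ssh fail instead of prompting for a password.
ssh -o BatchMode=yes n03 hostname

# 3. Try the smallest possible multi-node job: 2 processes running hostname.
mpirun_rsh -np 2 -hostfile hosts hostname
```

If step 3 fails while steps 1 and 2 pass, the problem is likely in the MPI/InfiniBand layer rather than in ssh or the install paths.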
On Wed, Aug 17, 2011 at 10:12 AM, Diego Humberto Kalegari
<kalegari at lactec.org.br> wrote:
> Hello All,
>
> I'm trying to set up an environment with MVAPICH2.
>
> I installed it and made all the configuration required for ssh so that it does not ask for a user-specific password. When I ssh to another system as any user, it logs in automatically.
>
> I was successfully able to run mpiexec and mpirun_rsh as the root user. But when I try to run it as another user, any other user on my system, I can't; it gives me the following:
>
> Fatal error in MPI_Init:
> Other MPI error
>
> [mpiexec at dvse-cluster] control_cb (./pm/pmiserv/pmiserv_cb.c:215): assert (!closed) failed
> [mpiexec at dvse-cluster] HYDT_dmxu_poll_wait_for_event (./tools/demux/demux_poll.c:77): callback returned error status
> [mpiexec at dvse-cluster] HYD_pmci_wait_for_completion (./pm/pmiserv/pmiserv_pmci.c:179): error waiting for event
> [mpiexec at dvse-cluster] main (./ui/mpich/mpiexec.c:397): process manager error waiting for completion
>
> Could someone please provide me with any support?
>
> Best Regards
>
> Diego
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
--
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo