[mvapich-discuss] No execution with mvapich-gen2

Abhinav Vishnu vishnu at cse.ohio-state.edu
Thu Mar 30 14:27:05 EST 2006


Hi Owen,

Thanks for sending the trace.

The first line of the trace, however, shows that you are launching
the two processes on two different nodes:

execve("/usr/local/mvapich/bin/mpirun_rsh", ["mpirun_rsh", "-np", "2",
"node-192-168-111-248", "node-192-168-111-249", "/home/cpi"], [/* 22 vars
*/]) = 0                                  ^^^

whereas the command in your mail places both processes on the same node:

[root@node-192-168-111-248 ~]# mpirun_rsh -np 2 192.168.111.248
 192.168.111.248 /home/cpi
             ^^^
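
As a rough way to make the same comparison without reading the trace by
eye, the sketch below counts the distinct hosts in an mpirun_rsh-style
host list. The helper name distinct_hosts is made up for this illustration
and is not part of MVAPICH. It only compares the strings as given, so a
hostname and an IP address for the same machine would count as two hosts.

```shell
# Hypothetical helper (not an MVAPICH tool): print how many distinct
# host strings appear in the argument list.
distinct_hosts() {
    # One host per line, drop duplicates, count what is left.
    # tr strips the padding some wc implementations add.
    printf '%s\n' "$@" | sort -u | wc -l | tr -d ' '
}

# Host list seen in the strace: two different nodes.
distinct_hosts node-192-168-111-248 node-192-168-111-249   # prints 2

# Host list from the command in the mail: the same node twice.
distinct_hosts 192.168.111.248 192.168.111.248             # prints 1
```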

Also, we were wondering whether the ibv_*_pingpong tests were run between
two processes on one machine or across two different machines.
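
Either way, it may be worth confirming that the `ulimit -l unlimited`
setting you mention applies to ssh sessions on both nodes, along the lines
of the `ssh node1 ulimit -l` check quoted further down. A minimal sketch,
assuming passwordless ssh to each host; check_memlock is an illustrative
name, not an MVAPICH tool:

```shell
# Illustrative helper (an assumption, not part of MVAPICH): report the
# locked-memory limit that a non-interactive ssh session gets on each host.
check_memlock() {
    for host in "$@"; do
        # ulimit is a shell builtin, so it is run by the remote login shell.
        limit=$(ssh "$host" 'ulimit -l' 2>/dev/null) || limit="ssh failed"
        printf '%s: %s\n' "$host" "$limit"
    done
}

# Example invocation (left commented out because it needs the real nodes):
# check_memlock node-192-168-111-248 node-192-168-111-249
```

Each line of output should read "unlimited" (or a suitably large value)
for the setup described in the quoted steps to be in effect on that node.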

Please let us know.

thanks and regards,

-- Abhinav
-------------------------------
Abhinav Vishnu,
Graduate Research Associate,
Department Of Comp. Sc. & Engg.
The Ohio State University.
-------------------------------

On Tue, 28 Mar 2006, Owen Stampflee wrote:

> Yeah, same problem w/0.9.7 as we were having previously...
>
> I've set ulimit -l to be unlimited, ibv_*_pingpong tests work...
>
> [root@node-192-168-111-248 ~]# mpirun_rsh -np 2 192.168.111.248
> 192.168.111.248 /home/cpi
> mpirun: executable version 0 does not match our version 3.
> done.
>
> but it works on a single node
> [root@node-192-168-111-248 ~]# mpirun_rsh -np 1
> 192.168.111.248 /home/cpi
> Process 0 on node-192-168-111-248
> pi is approximately 3.1416009869231254, Error is 0.0000083333333323
> wall clock time = 0.000260
>
> Here is the strace -f of the mpirun_rsh run that doesn't execute properly:
> http://cvs.terraplex.com/~owen/mpirun_rsh.strace
>
> Cheers,
> Owen
>
> On Tue, 2006-03-21 at 21:11 -0500, Abhinav Vishnu wrote:
> > Hi Owen,
> >
> > > Should 0.9.7 work properly with OpenIB 1.0rc1?
> >
> > There should be no problems running 0.9.7 with OpenIB 1.0rc1.
> > Please let us know if you face any problems.
> >
> > Thanks,
> >
> > -- Abhinav
> > >
> > > Thanks,
> > > Owen
> > >
> > > On Tue, 2006-03-21 at 18:10 -0500, Abhinav Vishnu wrote:
> > > > Owen,
> > > >
> > > > Thanks for your mail.
> > > >
> > > > We are looking forward to your experience with 0.9.7.
> > > > > Please keep us posted.
> > > >
> > > > Thanks,
> > > >
> > > > -- Abhinav
> > > >
> > > > -------------------------------
> > > > Abhinav Vishnu,
> > > > Graduate Research Associate,
> > > > Department Of Comp. Sc. & Engg.
> > > > The Ohio State University.
> > > > -------------------------------
> > > >
> > > > On Tue, 21 Mar 2006, Owen Stampflee wrote:
> > > >
> > > > > > Sorry for the lengthy delay, here's where I'm at currently:
> > > > >
> > > > > On Tue, 2006-02-21 at 23:14 -0500, Sayantan Sur wrote:
> > > > > > Owen,
> > > > > >
> > > > > > > Thanks for trying out mvapich-gen2. Sorry to hear about your problems.
> > > > > > Hopefully, we can resolve this issue quickly.
> > > > > >
> > > > > > > I get the following output running the cpi example...
> > > > > > >
> > > > > > > > [root@m2 examples]# mpirun_rsh -np 1 localhost ./cpi
> > > > > > > mpirun: executable version 0 does not match our version 2.
> > > > > > > done.
> > > > > >
> > > > > > Could you tell us what happens if you do:
> > > > > >
> > > > > > $ mpirun_rsh -np 1 m2 ./cpi
> > > > > > Same result, application doesn't run.
> > > > >
> > > > > > > > I'm using openib svn5411, and mvapich-gen2-1.0 with patches 101, 104, 105,
> > > > > > > > and 106. Oddly enough, even with the recent 5411 version of openib, patch 103
> > > > > > > > (CQ creation) doesn't compile.
> > > > > >
> > > > > > If you've been following the OpenIB mailing list, then you must be aware
> > > > > > of this. Sometime last October, the ibv_create_cq (which is the Gen2
> > > > > > interface) verb arguments changed. To work around this interface change,
> > > > > > we introduced patch #103 which uses the new verb by DEFAULT.
> > > > > >
> > > > > > So, if you are at patch level 106, then you do NOT need to specify
> > > > > > -DGEN2_OLD_CQ_VERB. Just using the default mvapich.make.gcc should be
> > > > > > enough.
> > > > > >
> > > > > > Just for clarification, can you send us the compilation failure you get
> > > > > > with patch #106? Also, if you just download the integrated tarball from
> > > > > > the mvapich-gen2 download page (instead of applying all patches by
> > > > > > hand), do you still get the same results?
> > > > > > I'll follow this up in a second email about my mvapich-0.9.7 compile results.
> > > > >
> > > > > > > I'm building everything in 32-bit mode, and using -D_IA32_ (those look
> > > > > > > fairly sane but I could have missed something).
> > > > > > >
> > > > > > > All the OpenIB pingpong tests are fine, I'm really quite stumped on
> > > > > > > where to go from here.
> > > > > >
> > > > > > Gen2 uses the lockable memory limits set by the system administrator. In
> > > > > > order to use MVAPICH, you must set this parameter to `unlimited' or to a
> > > > > > larger memory size so that MVAPICH is able to register communication
> > > > > > > buffers. This is common to all MPI and other higher-level software on
> > > > > > top of Gen2.
> > > > > >
> > > > > > There are three steps to setting up the lockable memory privileges for
> > > > > > users:
> > > > > >
> > > > > > 1) In /etc/security/limits.conf: Add a line
> > > > > >
> > > > > > *               soft    memlock         unlimited
> > > > > >
> > > > > > 2) In /etc/init.d/sshd: Add a line
> > > > > >
> > > > > > ulimit -l unlimited
> > > > > >
> > > > > > 3) Restart sshd
> > > > > >
> > > > > > /etc/init.d/sshd restart
> > > > > >
> > > > > > All subsequent SSH sessions by users should have this new lockable
> > > > > > memory limit set. To verify this, you can do:
> > > > > >
> > > > > > $ ssh node1 ulimit -l
> > > > > >
> > > > > > If this shows unlimited, then the setup was OK.
> > > > > >
> > > > > > Please let us know if this was able to resolve your problems.
> > > > > > Tried this as well, same result... application doesn't run. I'm currently
> > > > > > attempting to get OpenIB 1.0rc1 running as well as mvapich-0.9.7.
> > > > >
> > > > > Cheers,
> > > > > Owen
> > > > >
> > > > > _______________________________________________
> > > > > mvapich-discuss mailing list
> > > > > mvapich-discuss at cse.ohio-state.edu
> > > > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > > > >


