[mvapich-discuss] Solaris x86

Di Domenico, Michael mdidomenico at silverstorm.com
Tue Apr 18 13:58:40 EDT 2006


Lei,

I appreciate your help on this... thanks

bash-3.00# pwd
/root

bash-3.00# ls /root
mvapich-0.9.7         mvapich-0.9.7.tar.gz  print.pl

bash-3.00# ls /root/mvapich-0.9.7
a.out                        install-mine.log             mpich.static.dsw
acconfig.h                   installtest                  mpichconf.h
aclocal.m4                   lib                          mpichconf.h.in
aclocal_tcl.m4               LICENSE.TXT                  mpichversion.o
bin                          make-mine.log                mpid
buildmsg                     make.mvapich.def             multirail.mpd.sh
ccbugs                       make.mvapich.gen2            mvapich.mpd.sh
config-mine.log              make.mvapich.gen2_multirail  osu_benchmarks
config.log                   make.mvapich.tcp             README
config.status                make.mvapich.udapl           README_MPICH
configure                    make.mvapich.vapi            romio
configure.in                 make.mvapich.vapi_multirail  sbin
COPYRIGHT                    Makefile                     share
COPYRIGHT_MVAPICH            Makefile.in                  src
doc                          makelinks                    util
etc                          man                          www
examples                     mpe                          www.index
f90modules                   MPI-2-C++
include                      mpich.dsw

bash-3.00# /root/mvapich-0.9.7/bin/mpirun_rsh -np 2 tse41 tse42 DAPL_PROVIDER=ibd0 /opt/mvapich/examples/cpi
/usr/bin/env: No such file or directory
/usr/bin/env: No such file or directory

Changing the command line to

bash-3.00# /root/mvapich-0.9.7/bin/mpirun -np 2 -machinefile /opt/mvapich/share/machines.udapl /opt/mvapich/examples/cpi
[0] Abort: cannot open IA at line 214 in file viainit.c
mpirun: executable version 0 does not match our version 3.
done.

Applying the patch provided...

bash-3.00# /root/mvapich-0.9.7/bin/mpirun_rsh -np 2 -hostfile /opt/mvapich/share/machines.udapl /opt/mvapich/examples/cpi
/usr/bin/env: No such file or directory
sh: /root/mvapich-0.9.7: does not exist
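
As a side note for anyone hitting the same "cannot open IA" abort: the two
configuration prerequisites named further down this thread (an entry for the
IA name in /etc/dat/dat.conf, and DAPL_PROVIDER set to that name) can be
checked with a small shell sketch like the one below. The helper names are
invented for illustration and are not part of MVAPICH, and the sketch assumes
a POSIX-style sh and awk.

```shell
#!/bin/sh
# Sketch only: these helper names are hypothetical, not part of MVAPICH.

# Print the IA name (first field) of every non-comment entry in a
# dat.conf-style file.
list_ia_names() {
    awk 'NF > 0 && $1 !~ /^#/ { print $1 }' "$1"
}

# Succeed if IA name $1 has an entry in dat.conf-style file $2.
has_ia_entry() {
    for name in $(list_ia_names "$2"); do
        [ "$name" = "$1" ] && return 0
    done
    return 1
}

# Report whether provider $1 is non-empty and listed in $2
# (default: /etc/dat/dat.conf).
check_provider() {
    conf=${2:-/etc/dat/dat.conf}
    if [ -z "$1" ]; then
        echo "DAPL_PROVIDER is not set"
        return 1
    fi
    if has_ia_entry "$1" "$conf"; then
        echo "ok: $1 has an entry in $conf"
    else
        echo "no entry for $1 in $conf"
        return 1
    fi
}

# Example: check_provider "$DAPL_PROVIDER"
```

Sourcing the file and running check_provider "$DAPL_PROVIDER" on each node
prints one line saying whether the entry was found. It only checks the
configuration; it does not test whether the provider library can actually
open the adapter.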

-----Original Message-----
From: LEI CHAI [mailto:chai.15 at osu.edu] 
Sent: Tuesday, April 18, 2006 1:20 PM
To: Di Domenico, Michael
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: RE: RE: RE: [mvapich-discuss] Solaris x86

Michael,

We also have Solaris/X86 in our lab, and we have tested MVAPICH on
Solaris without seeing this problem. For the time being, could you use
$COMPILE_PATH/bin/mpirun_rsh instead of $INSTALL/bin/mpirun_rsh?
$COMPILE_PATH is the directory containing the MVAPICH source code:

$COMPILE_PATH/bin/mpirun_rsh -np 2 node1 node2 DAPL_PROVIDER=ibd0 ./cpi
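
Since the copy under $INSTALL/bin was reported as unreadable by file(1)
while the copy in the source tree at least starts, it may be worth confirming
that the install step really changed the binary. A minimal sketch, with
made-up helper names (size_of, compare_install), assuming cmp, wc and tr are
available:

```shell
#!/bin/sh
# Sketch only: size_of and compare_install are hypothetical helpers.

# Print the size of file $1 in bytes, without padding.
size_of() {
    wc -c < "$1" | tr -d ' '
}

# Compare a freshly built binary ($1) with its installed copy ($2).
# Prints "identical" and succeeds if they match byte for byte;
# otherwise prints both sizes and fails.
compare_install() {
    built=$1
    installed=$2
    if cmp -s "$built" "$installed"; then
        echo "identical"
        return 0
    fi
    echo "different: built=$(size_of "$built") bytes, installed=$(size_of "$installed") bytes"
    return 1
}

# Example:
#   compare_install "$COMPILE_PATH/bin/mpirun_rsh" "$INSTALL/bin/mpirun_rsh"
```

If the two differ, the sizes give a first hint as to whether the installed
file was truncated or replaced by something else entirely.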

If you still see the "cannot open IA" problem, could you apply the patch
below and let us know the output? The patch only makes the error message
print the IAname.

Thanks.
Lei

-------------------------------------------------
--- viainit.c.orig      Tue Apr 18 13:04:10 2006
+++ viainit.c.new       Tue Apr 18 13:05:26 2006
@@ -211,7 +211,7 @@
                        &async_evd_handle, &viadev.nic);
     if (ret != DAT_SUCCESS)
       {
-          udapl_error_abort (GEN_EXIT_ERR, "cannot open IA");
+          udapl_error_abort (GEN_EXIT_ERR, "cannot open IA: %s", dapl_provider);
       }

     viadev.maxtransfersize = viadev_max_rdma_size;

-----------------------------------------------
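
One way to apply a diff like the one above is sketched here.
apply_patch_with_backup is a hypothetical helper, and the location of
viainit.c inside the source tree is not stated in this thread, so that path
has to be filled in by hand. If the diff is saved out of a mail client, each
added or removed line needs to sit on a single line, since a wrapped line
makes the hunk malformed.

```shell
#!/bin/sh
# Sketch only: apply_patch_with_backup is a hypothetical helper.

# Apply unified diff $2 to file $1, keeping the original as $1.orig.
apply_patch_with_backup() {
    target=$1
    patchfile=$2
    # Keep a copy so the change is easy to undo.
    cp "$target" "$target.orig" || return 1
    # -i names the patch file; giving the target explicitly means the
    # file names in the diff header are not used to locate it.
    patch -i "$patchfile" "$target"
}

# Example (the path to viainit.c is an assumption to be adjusted):
#   apply_patch_with_backup path/to/viainit.c ia.patch
```

After patching, the library has to be rebuilt (the thread mentions
make.mvapich.udapl for that) before the new message shows up.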

----- Original Message -----
From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
Date: Tuesday, April 18, 2006 12:29 pm
Subject: RE: RE: RE: [mvapich-discuss] Solaris x86

> Lei,
> 
> Something must not be getting moved correctly during the install step of
> the make script, and it is corrupting the executable... More than likely,
> I suspect that a tool you're using to move the files behaves differently
> on Solaris than it does on Linux...
> 
> I've also added DAPL_PROVIDER to the ~/.bashrc and ~/.profile files. If
> I ssh from one machine to another it does get set, as evidenced by
> echo $DAPL_PROVIDER...
> 
> 
> ...output truncated....
> installed MPICH in /opt/mvapich
> /opt/mvapich/sbin/mpiuninstall may be used to remove the installation.
> Congratulations on successfully building MVAPICH. Please send your
> feedback to mvapich-help at cse.ohio-state.edu.
> bash-3.00# /opt/mvapich/bin/mpirun_rsh
> bash: /opt/mvapich/bin/mpirun_rsh: Invalid argument
> bash-3.00# file /opt/mvapich/bin/mpirun_rsh
> can't read ELF header
> /opt/mvapich/bin/mpirun_rsh:
> bash-3.00#
> 
> -----Original Message-----
> From: LEI CHAI [chai.15 at osu.edu] 
> Sent: Tuesday, April 18, 2006 12:01 PM
> To: Di Domenico, Michael
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: RE: RE: [mvapich-discuss] Solaris x86
> 
> Michael,
> 
> One small thing: please make sure to export DAPL_PROVIDER in the
> .bashrc file instead of exporting it in the current shell. Exporting it
> in the current shell does not help, and we are taking a look at that.
> 
> Also, we do not understand why you need to copy mpirun_rsh from
> mpid/udapl/process. If you run mvapich-0.9.7/make.mvapich.udapl to
> rebuild mvapich, mpirun_rsh should be generated automatically in your
> $INSTALL/bin directory. Could you just run $INSTALL/bin/mpirun_rsh
> without any arguments and let us know the result?
> 
> Thanks.
> Lei
> 
> 
> ----- Original Message -----
> From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
> Date: Tuesday, April 18, 2006 11:20 am
> Subject: RE: RE: [mvapich-discuss] Solaris x86
> 
> > Lei,
> > 
> > I had a feeling you were going to say that... See my outputs below. The
> > IB card is definitely up, it's detected successfully by the kernel, and
> > I can run mvapich using IP over IB with no issues...
> > 
> > bash-3.00# tail /etc/dat/dat.conf
> > ....output truncated....
> > # IAname version threadsafe default library-path provider-version \
> > #       instance-data platform-information
> > #
> > ibd0  u1.2  nonthreadsafe  default  udapl_tavor.so.1  SUNW.1.0  " "  "driver_name=tavor"
> > 
> > bash-3.00# ifconfig ibd0
> > ibd0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 2044 index 3
> >        inet 192.168.101.41 netmask ffffff00 broadcast 192.168.101.255
> >        ipib 0:0:4:6:fe:80:0:0:0:0:0:0:0:6:6a:0:a0:0:3:a1
> > 
> > bash-3.00# ping 192.168.101.41
> > 192.168.101.41 is alive (tse41-ib)
> > bash-3.00# ping 192.168.101.42
> > 192.168.101.42 is alive (tse42-ib)
> > 
> > bash-3.00# echo $DAPL_PROVIDER
> > ibd0
> > bash-3.00#
> > 
> > -----Original Message-----
> > From: LEI CHAI [chai.15 at osu.edu] 
> > Sent: Tuesday, April 18, 2006 11:13 AM
> > To: Di Domenico, Michael
> > Cc: mvapich-discuss at cse.ohio-state.edu
> > Subject: Re: RE: [mvapich-discuss] Solaris x86
> > 
> > Michael,
> > 
> > There are several possible reasons that you see this error:
> > 
> > 1. There is no valid entry in /etc/dat/dat.conf
> > 
> > 2. There is no "export DAPL_PROVIDER=ibd0" in your .bashrc file, or
> > "source ~/.bashrc" was not done if you were already in the shell.
> > 
> > 3. InfiniBand on the node is not working properly.
> > 
> > I guess you have taken care of 1 and 2. For 3, could you do an
> > "ifconfig ibd0" and let us know the output?
> > 
> > Thanks.
> > Lei
> > 
> > 
> > ----- Original Message -----
> > From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
> > Date: Tuesday, April 18, 2006 9:24 am
> > Subject: RE: [mvapich-discuss] Solaris x86
> > 
> > > Lei,
> > > 
> > > 
> > > 
> > > I copied mpirun_rsh from the mpid/udapl/process directory, which seems
> > > to be a valid executable, and now I get
> > > 
> > > 
> > > 
> > > bash-3.00# ./mpirun -hostfile ../share/machines.udapl ./cpi
> > > 
> > > [0] Abort: cannot open IA at line 214 in file viainit.c
> > > 
> > > mpirun: executable version 0 does not match our version 3.
> > > 
> > > done.
> > > 
> > > 
> > > 
> > > ________________________________
> > > 
> > > From: lei chai [chai.15 at osu.edu] 
> > > Sent: Monday, April 17, 2006 10:14 PM
> > > To: Di Domenico, Michael; mvapich-discuss at cse.ohio-state.edu
> > > Subject: Re: [mvapich-discuss] Solaris x86
> > > 
> > > 
> > > 
> > > Michael,
> > > 
> > > 
> > > 
> > > Thanks for reporting the mpirun problem. We have now fixed it. Please go
> > > to your mvapich-0.9.7/mpid/udapl directory and rename the files
> > > mpirun.vapi.args and mpirun.vapi.in to mpirun.udapl.args and
> > > mpirun.udapl.in. Then replace "vapi" with "udapl" in
> > > mvapich-0.9.7/mpid/udapl/mpirun.lst. You also need to add
> > > "export DAPL_PROVIDER=ibd0" to your .bashrc file. After rebuilding, you
> > > can run a program:
> > > 
> > > 
> > > 
> > > mpirun -n 2 -machinefile my-machine-file ./cpi
> > > 
> > > 
> > > 
> > > where my-machine-file contains host names.
> > > 
> > > 
> > > 
> > > We have never had a problem with mpirun_rsh before. Please follow
> > > Matt's suggestion and let us know the result.
> > > 
> > > 
> > > 
> > > Thanks.
> > > 
> > > Lei
> > > 
> > > 
> > > 
> > > 	----- Original Message ----- 
> > > 
> > > 	From: Di Domenico, Michael <mdidomenico at silverstorm.com>
> > > 
> > > 
> > > 	To: lei chai <chai.15 at osu.edu>; mvapich-discuss at cse.ohio-state.edu
> > > 
> > > 	Sent: Monday, April 17, 2006 5:00 PM
> > > 
> > > 	Subject: RE: [mvapich-discuss] Solaris x86
> > > 
> > >         
> > > 
> > > 	Lei,
> > > 
> > >         
> > > 
> > > 	Thanks for the reply, but it still doesn't work...
> > > 
> > >         
> > > 
> > > 	--- first try with mpirun_rsh 
> > > 
> > >         
> > > 
> > > 	bash-3.00# /opt/mvapich/bin/mpirun_rsh -np 2 tse41-ib tse42-ib DAPL_PROVIDER="ibd0" ./cpi
> > > 
> > > 	bash: /opt/mvapich/bin/mpirun_rsh: Invalid argument
> > > 
> > >         
> > > 
> > > 	--- second try with mpirun (just to see what happens)
> > > 
> > >         
> > > 
> > > 	bash-3.00# /opt/mvapich/bin/mpirun -np 2 tse41-ib tse42-ib DAPL_PROVIDER="ibd0" ./cpi
> > > 
> > > 	Warning: Command line arguments for program should be given
> > > 
> > > 	after the program name.  Assuming that tse42-ib is a
> > > 
> > > 	command line argument for the program.
> > > 
> > > 	Warning: Command line arguments for program should be given
> > > 
> > > 	after the program name.  Assuming that DAPL_PROVIDER=ibd0 is a
> > > 
> > > 	command line argument for the program.
> > > 
> > > 	Unrecognized argument tse41-ib ignored.
> > > 
> > > 	Cannot find MPIRUN machine file for machine udapl
> > > 
> > > 	and architecture solaris86 .
> > > 
> > > 	(No device specified.)
> > > 
> > > 	bash-3.00#
> > > 
> > >         
> > > 
> > > 	
> > > ________________________________
> > > 
> > > 
> > > 	From: lei chai [chai.15 at osu.edu] 
> > > 	Sent: Monday, April 17, 2006 4:50 PM
> > > 	To: Di Domenico, Michael; mvapich-discuss at cse.ohio-state.edu
> > > 	Subject: Re: [mvapich-discuss] Solaris x86
> > > 
> > >         
> > > 
> > > 	Hi,
> > > 
> > >         
> > > 
> > > 	Thank you for trying out MVAPICH-0.9.7. Please use mpirun_rsh
> > > instead of mpirun. To use the uDAPL device, please specify an
> > > IAname, e.g.
> > > 
> > >         
> > > 
> > > 	/opt/mvapich/bin/mpirun_rsh -np 2 node1 node2 DAPL_PROVIDER="IAname" ./cpi
> > > 
> > >         
> > > 
> > > 	The IAname can be found in /etc/dat/dat.conf; it is the first
> > > field.
> > > 
> > >         
> > > 
> > > 	Hope this helps.
> > > 
> > >         
> > > 
> > > 	Regards,
> > > 
> > > 	Lei
> > > 
> > >         
> > > 
> > >        	----- Original Message ----- 
> > > 
> > >        	From: Di Domenico, Michael <mdidomenico at silverstorm.com>
> > > 
> > >        	To: mvapich-discuss at cse.ohio-state.edu 
> > > 
> > >        	Sent: Monday, April 17, 2006 4:06 PM
> > > 
> > >        	Subject: [mvapich-discuss] Solaris x86
> > > 
> > >                 
> > > 
> > >        	I'm trying to get MVAPICH 0.9.7 to compile and run on
> > > Solaris 10 1/06 x86 using the GNU toolset downloaded from
> > > sunfreeware.com...
> > > 
> > >                 
> > > 
> > >        	I'm attaching the outputs from ./make.mvapich.udapl.
> > > 
> > >                 
> > > 
> > >        	Everything seems to compile, but I never seem to
> > > get an mpirun.udapl file... Any clues that I missed in the make
> > > outputs?
> > > 
> > >                 
> > > 
> > >        	bash-3.00# cd /opt/mvapich/examples/
> > > 
> > >        	bash-3.00# ls
> > > 
> > >        	cpi          cpi.o        cpip.c       Makefile     MPI-2-C++    README
> > > 
> > >        	cpi.c        cpilog.c     hello++.cc   Makefile.in  mpirun       simpleio.c
> > > 
> > >        	bash-3.00# ./mpirun ./cpi
> > > 
> > >        	Cannot find MPIRUN machine file for machine udapl
> > > 
> > >        	and architecture solaris86 .
> > > 
> > >        	(No device specified.)
> > > 
> > >        	bash-3.00# sh -x ./mpirun ./cpi
> > > 
> > >        	....output truncated....
> > > 
> > >        	+ [ -x /opt/mvapich/bin/mpirun.udapl ] 
> > > 
> > >        	+ echo Cannot find MPIRUN machine file for machine udapl
> > > 
> > > 
> > >        	Cannot find MPIRUN machine file for machine udapl
> > > 
> > >        	+ echo and architecture solaris86 . 
> > > 
> > >        	and architecture solaris86 .
> > > 
> > >        	+ [ -n  ] 
> > > 
> > >        	+ echo (No device specified.) 
> > > 
> > >        	(No device specified.)
> > > 
> > >        	+ exit 1 
> > > 
> > >        	bash-3.00# ls /opt/mvapich/bin
> > > 
> > >        	mpiCC                 mpiman                mpirun.args           mpirun_dbg.ddd        mpirun_dbg.xxgdb
> > > 
> > >        	mpicc                 mpireconfig           mpirun.vapi           mpirun_dbg.gdb        mpirun_rsh
> > > 
> > >        	mpichversion          mpireconfig.dat       mpirun.vapi.args      mpirun_dbg.ladebug    tarch
> > > 
> > >        	mpicxx                mpirun                mpirun_dbg.dbx        mpirun_dbg.totalview  tdevice
> > > 
> > >        	
> > > ________________________________
> > > 
> > > 
> > >        	_______________________________________________
> > >        	mvapich-discuss mailing list
> > >        	mvapich-discuss at cse.ohio-state.edu
> > > 	
> > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > > 
> > > 
> > 
> 
> 



