[mvapich-discuss] Solaris x86

Di Domenico, Michael mdidomenico at silverstorm.com
Tue Apr 18 12:29:13 EDT 2006


Lei,

Something must not be getting moved correctly during the install step of
the make script, and it is corrupting the executable. More than likely, I
suspect a tool you're using to move the files behaves differently on
Solaris than it does on Linux.
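
One way to narrow this down would be to compare the copy left in the
build tree with the installed one. This is a rough sketch only: is_elf
and check_install are helper names made up for illustration, and it
assumes od, tr and cmp behave the POSIX way on both platforms.

```shell
# Hypothetical helpers, not part of MVAPICH.

# Succeeds if the file starts with the 4-byte ELF magic (0x7f 'E' 'L' 'F').
is_elf() {
    magic=$(od -A n -t x1 -N 4 "$1" 2>/dev/null | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Compares the binary in the build tree with the installed copy.
# Returns 0 if identical, 1 if the installed copy is not ELF,
# 2 if the built copy is not ELF, 3 if both are ELF but differ.
check_install() {
    built=$1
    installed=$2
    if ! is_elf "$built"; then
        echo "built copy is not ELF: $built"
        return 2
    fi
    if ! is_elf "$installed"; then
        echo "installed copy is not ELF: $installed"
        return 1
    fi
    if cmp -s "$built" "$installed"; then
        echo "identical"
        return 0
    fi
    echo "differ"
    return 3
}
```

For example: check_install mpid/udapl/process/mpirun_rsh
/opt/mvapich/bin/mpirun_rsh. An install step that strips binaries would
also make the two files differ, so a "differ" result on its own is not
proof of corruption.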

I've also added DAPL_PROVIDER to the ~/.bashrc and ~/.profile files.  If
I ssh from one machine to another it does get set, as evidenced by echo
$DAPL_PROVIDER...
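
That check is an interactive login, though. Below is a sketch for seeing
what a non-interactive remote command gets, which may be closer to how
the launcher starts processes (that part is an assumption on my side).
remote_var and the REMOTE_SHELL override are names made up for this
example, not MVAPICH settings.

```shell
# Hypothetical helper: print the value of a variable as seen by a
# non-interactive command on a remote host.
# REMOTE_SHELL is an illustrative override (e.g. rsh); defaults to ssh.
remote_var() {
    host=$1
    name=$2
    # The escaped $ is passed through literally, so the variable is
    # expanded by the remote shell rather than the local one.
    ${REMOTE_SHELL:-ssh} "$host" "echo \"\${$name}\""
}
```

For example: remote_var tse42-ib DAPL_PROVIDER. An empty line back would
suggest the startup file that sets the variable is not read for
non-interactive commands.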


...output truncated....
installed MPICH in /opt/mvapich
/opt/mvapich/sbin/mpiuninstall may be used to remove the installation.
Congratulations on successfully building MVAPICH. Please send your
feedback to mvapich-help at cse.ohio-state.edu.
bash-3.00# /opt/mvapich/bin/mpirun_rsh
bash: /opt/mvapich/bin/mpirun_rsh: Invalid argument
bash-3.00# file /opt/mvapich/bin/mpirun_rsh
can't read ELF header
/opt/mvapich/bin/mpirun_rsh:
bash-3.00#

-----Original Message-----
From: LEI CHAI [mailto:chai.15 at osu.edu] 
Sent: Tuesday, April 18, 2006 12:01 PM
To: Di Domenico, Michael
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: RE: RE: [mvapich-discuss] Solaris x86

Michael,

One small thing: please make sure to export DAPL_PROVIDER in the .bashrc
file instead of exporting it in the current shell. Exporting it in the
current shell does not help, and we are taking a look at that.
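
A minimal sketch of adding the line without duplicating it on repeated
runs. ensure_export is a hypothetical helper, not part of MVAPICH, and
grep output is redirected instead of using -q in case the local grep
lacks that option.

```shell
# Hypothetical helper: append "export NAME=VALUE" to an rc file
# unless a line starting with "export NAME=" is already there.
ensure_export() {
    rcfile=$1
    name=$2
    value=$3
    if grep "^export $name=" "$rcfile" >/dev/null 2>&1; then
        return 0    # already present, leave the file untouched
    fi
    echo "export $name=$value" >> "$rcfile"
}
```

For example: ensure_export ~/.bashrc DAPL_PROVIDER ibd0, followed by
"source ~/.bashrc" in any shell that is already open.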

Also, we do not understand why you need to copy mpirun_rsh from
mpid/udapl/process. If you run mvapich-0.9.7/make.mvapich.udapl to
rebuild mvapich, mpirun_rsh should be generated automatically in your
$INSTALL/bin directory. Could you just run $INSTALL/bin/mpirun_rsh
without any arguments and let us know the result?

Thanks.
Lei


----- Original Message -----
From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
Date: Tuesday, April 18, 2006 11:20 am
Subject: RE: RE: [mvapich-discuss] Solaris x86

> Lei,
> 
> I had a feeling you were going to say that... See my outputs below. The
> IB card is definitely up, it's detected successfully by the kernel, and I
> can run mvapich using IP over IB with no issues...
> 
> bash-3.00# tail /etc/dat/dat.conf
> ....output truncated....
> # IAname version threadsafe default library-path provider-version \
> #       instance-data platform-information
> #
> ibd0  u1.2  nonthreadsafe  default  udapl_tavor.so.1  SUNW.1.0  " "
> "driver_name=tavor"
> 
> bash-3.00# ifconfig ibd0
> ibd0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 2044 index 3
>        inet 192.168.101.41 netmask ffffff00 broadcast 192.168.101.255
>        ipib 0:0:4:6:fe:80:0:0:0:0:0:0:0:6:6a:0:a0:0:3:a1
> 
> bash-3.00# ping 192.168.101.41
> 192.168.101.41 is alive (tse41-ib)
> bash-3.00# ping 192.168.101.42
> 192.168.101.42 is alive (tse42-ib)
> 
> bash-3.00# echo $DAPL_PROVIDER
> ibd0
> bash-3.00#
> 
> -----Original Message-----
> From: LEI CHAI [chai.15 at osu.edu] 
> Sent: Tuesday, April 18, 2006 11:13 AM
> To: Di Domenico, Michael
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: RE: [mvapich-discuss] Solaris x86
> 
> Michael,
> 
> There are several possible reasons that you see this error:
> 
> 1. There is no valid entry in /etc/dat/dat.conf
> 
> 2. There is no "export DAPL_PROVIDER=ibd0" in your .bashrc file, or
> "source ~/.bashrc" was not done if you were already in the shell.
> 
> 3. InfiniBand on the node is not working properly.
> 
> I guess you have taken care of 1 and 2. For 3, could you run
> "ifconfig ibd0" and let us know the output?
> 
> Thanks.
> Lei
> 
> 
> ----- Original Message -----
> From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
> Date: Tuesday, April 18, 2006 9:24 am
> Subject: RE: [mvapich-discuss] Solaris x86
> 
> > Lei,
> > 
> > 
> > 
> > I copied mpirun_rsh from the mpid/udapl/process directory, which seems to
> > be a valid executable, and now I get
> > 
> > 
> > 
> > bash-3.00# ./mpirun -hostfile ../share/machines.udapl ./cpi
> > 
> > [0] Abort: cannot open IA at line 214 in file viainit.c
> > 
> > mpirun: executable version 0 does not match our version 3.
> > 
> > done.
> > 
> > 
> > 
> > ________________________________
> > 
> > From: lei chai [chai.15 at osu.edu] 
> > Sent: Monday, April 17, 2006 10:14 PM
> > To: Di Domenico, Michael; mvapich-discuss at cse.ohio-state.edu
> > Subject: Re: [mvapich-discuss] Solaris x86
> > 
> > 
> > 
> > Michael,
> > 
> > 
> > 
> > Thanks for reporting the mpirun problem. We have now fixed it. Please go
> > to your mvapich-0.9.7/mpid/udapl directory and rename the files
> > mpirun.vapi.args and mpirun.vapi.in to mpirun.udapl.args and
> > mpirun.udapl.in. Then replace "vapi" with "udapl" in
> > mvapich-0.9.7/mpid/udapl/mpirun.lst. You also need to add
> > "export DAPL_PROVIDER=ibd0" to your .bashrc file. After rebuilding, you
> > can run a program:
> > 
> > 
> > 
> > mpirun -n 2 -machinefile my-machine-file ./cpi
> > 
> > 
> > 
> > where my-machine-file contains host names.
> > 
> > 
> > 
> > We have never had a problem with mpirun_rsh before. Please follow
> > Matt's suggestion and let us know the result.
> > 
> > 
> > 
> > Thanks.
> > 
> > Lei
> > 
> > 
> > 
> > 	----- Original Message ----- 
> > 
> > 	From: Di Domenico, Michael <mdidomenico at silverstorm.com>
> > 
> > 
> > 	To: lei chai <chai.15 at osu.edu>;
> > mvapich-discuss at cse.ohio-state.edu
> > 
> > 	Sent: Monday, April 17, 2006 5:00 PM
> > 
> > 	Subject: RE: [mvapich-discuss] Solaris x86
> > 
> >         
> > 
> > 	Lei,
> > 
> >         
> > 
> > 	Thanks for the reply, but it still doesn't work...
> > 
> >         
> > 
> > 	--- first try with mpirun_rsh 
> > 
> >         
> > 
> > 	bash-3.00# /opt/mvapich/bin/mpirun_rsh -np 2 tse41-ib tse42-ib
> > DAPL_PROVIDER="ibd0" ./cpi
> > 
> > 	bash: /opt/mvapich/bin/mpirun_rsh: Invalid argument
> > 
> >         
> > 
> > 	--- second try with mpirun (just to see what happens)
> > 
> >         
> > 
> > 	bash-3.00# /opt/mvapich/bin/mpirun -np 2 tse41-ib tse42-ib
> > DAPL_PROVIDER="ibd0" ./cpi
> > 
> > 	Warning: Command line arguments for program should be given
> > 
> > 	after the program name.  Assuming that tse42-ib is a
> > 
> > 	command line argument for the program.
> > 
> > 	Warning: Command line arguments for program should be given
> > 
> > 	after the program name.  Assuming that DAPL_PROVIDER=ibd0 is a
> > 
> > 	command line argument for the program.
> > 
> > 	Unrecognized argument tse41-ib ignored.
> > 
> > 	Cannot find MPIRUN machine file for machine udapl
> > 
> > 	and architecture solaris86 .
> > 
> > 	(No device specified.)
> > 
> > 	bash-3.00#
> > 
> >         
> > 
> > 	
> > ________________________________
> > 
> > 
> > 	From: lei chai [chai.15 at osu.edu] 
> > 	Sent: Monday, April 17, 2006 4:50 PM
> > 	To: Di Domenico, Michael; mvapich-discuss at cse.ohio-state.edu
> > 	Subject: Re: [mvapich-discuss] Solaris x86
> > 
> >         
> > 
> > 	Hi,
> > 
> >         
> > 
> > 	Thank you for trying out MVAPICH-0.9.7. Please use mpirun_rsh
> > instead of mpirun. And for using the uDAPL device, please specify an
> > IAname, e.g.
> > 
> >         
> > 
> > 	/opt/mvapich/bin/mpirun_rsh -np 2 node1 node2
> > DAPL_PROVIDER="IAname" ./cpi
> > 
> >         
> > 
> > 	The IAname can be found in /etc/dat/dat.conf; it is the first
> > field.
> > 
> >         
> > 
> > 	Hope this helps.
> > 
> >         
> > 
> > 	Regards,
> > 
> > 	Lei
> > 
> >         
> > 
> >        	----- Original Message ----- 
> > 
> >        	From: Di Domenico, Michael
> > <mdidomenico at silverstorm.com>
> > 
> >        	To: mvapich-discuss at cse.ohio-state.edu 
> > 
> >        	Sent: Monday, April 17, 2006 4:06 PM
> > 
> >        	Subject: [mvapich-discuss] Solaris x86
> > 
> >                 
> > 
> >        	I'm trying to get Mvapich 0.9.7 to compile and run on
> > Solaris 10 1/06 x86 using the GNU toolset downloaded from
> > sunfreeware.com...
> > 
> >                 
> > 
> >        	I'm attaching the outputs from ./make.mvapich.udapl.
> > 
> >                 
> > 
> >        	Everything seems to compile, but I don't ever seem to
> > get a mpirun.udapl file...  Any clues that I missed from the make
> > outputs?
> > 
> >                 
> > 
> >        	bash-3.00# cd /opt/mvapich/examples/
> > 
> >        	bash-3.00# ls
> > 
> >        	cpi          cpi.o        cpip.c       Makefile
> > MPI-2-C++    README
> > 
> >        	cpi.c        cpilog.c     hello++.cc   Makefile.in
> > mpirun       simpleio.c
> > 
> >        	bash-3.00# ./mpirun ./cpi
> > 
> >        	Cannot find MPIRUN machine file for machine udapl
> > 
> >        	and architecture solaris86 .
> > 
> >        	(No device specified.)
> > 
> >        	bash-3.00# sh -x ./mpirun ./cpi
> > 
> >        	....output truncated....
> > 
> >        	+ [ -x /opt/mvapich/bin/mpirun.udapl ] 
> > 
> >        	+ echo Cannot find MPIRUN machine file for machine udapl
> > 
> > 
> >        	Cannot find MPIRUN machine file for machine udapl
> > 
> >        	+ echo and architecture solaris86 . 
> > 
> >        	and architecture solaris86 .
> > 
> >        	+ [ -n  ] 
> > 
> >        	+ echo (No device specified.) 
> > 
> >        	(No device specified.)
> > 
> >        	+ exit 1 
> > 
> >        	bash-3.00# ls /opt/mvapich/bin
> > 
> >        	mpiCC                 mpiman                mpirun.args
> > mpirun_dbg.ddd        mpirun_dbg.xxgdb
> > 
> >        	mpicc                 mpireconfig           mpirun.vapi
> > mpirun_dbg.gdb        mpirun_rsh
> > 
> >        	mpichversion          mpireconfig.dat
> > mpirun.vapi.args      mpirun_dbg.ladebug    tarch
> > 
> >        	mpicxx                mpirun
> > mpirun_dbg.dbx        mpirun_dbg.totalview  tdevice
> > 
> >        	
> > ________________________________
> > 
> > 
> >        	_______________________________________________
> >        	mvapich-discuss mailing list
> >        	mvapich-discuss at cse.ohio-state.edu
> > 	
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > 
> > 
> 



