[mvapich-discuss] Solaris x86

LEI CHAI chai.15 at osu.edu
Tue Apr 18 13:20:26 EDT 2006


Michael,

We also have Solaris/X86 in our lab, and we have tested MVAPICH on Solaris and didn't have this problem. For the time being, could you use $COMPILE_PATH/bin/mpirun_rsh instead of $INSTALL/bin/mpirun_rsh? $COMPILE_PATH is the directory of MVAPICH source code:

$COMPILE_PATH/bin/mpirun_rsh -np 2 node1 node2 DAPL_PROVIDER=ibd0 ./cpi
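
To check whether the install step really changes the file, a small sketch like the following may help. It assumes only a POSIX cmp; the function name same_binary is made up for illustration and is not part of MVAPICH.

```shell
# Sketch: report whether two copies of a binary are byte-identical.
# same_binary is a hypothetical helper, not an MVAPICH tool.
same_binary() {
    built=$1
    installed=$2
    # Both files must exist before they can be compared.
    if [ ! -f "$built" ] || [ ! -f "$installed" ]; then
        echo "missing"
        return 2
    fi
    # cmp -s is silent and only sets the exit status.
    if cmp -s "$built" "$installed"; then
        echo "identical"
        return 0
    fi
    echo "different"
    return 1
}
```

For example: same_binary "$COMPILE_PATH/bin/mpirun_rsh" "$INSTALL/bin/mpirun_rsh". If it prints "different", the copy made during installation is the likely culprit.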

If you still see the "cannot open IA" problem, could you apply the patch below and let us know the output? The patch is just to print out the IAname.

Thanks.
Lei

-------------------------------------------------
--- viainit.c.orig      Tue Apr 18 13:04:10 2006
+++ viainit.c.new       Tue Apr 18 13:05:26 2006
@@ -211,7 +211,7 @@
                        &async_evd_handle, &viadev.nic);
     if (ret != DAT_SUCCESS)
       {
-          udapl_error_abort (GEN_EXIT_ERR, "cannot open IA");
+          udapl_error_abort (GEN_EXIT_ERR, "cannot open IA: %s", dapl_provider);
       }

     viadev.maxtransfersize = viadev_max_rdma_size;

-----------------------------------------------
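
Separately from the patch, it may be worth confirming that the value of DAPL_PROVIDER really appears as an IAname (the first field) in /etc/dat/dat.conf. The following is a rough bash sketch: ia_in_dat_conf is a hypothetical helper name, and the parsing assumes '#' comment lines and whitespace-separated fields, as in the dat.conf excerpt quoted further down.

```shell
# Sketch: succeed if the given IA name is the first field of a
# non-comment line in a dat.conf-style file.
# ia_in_dat_conf is a hypothetical helper, not part of uDAPL or MVAPICH.
ia_in_dat_conf() {
    ia=$1
    conf=${2:-/etc/dat/dat.conf}
    # An unreadable file is reported separately from "not found".
    [ -r "$conf" ] || return 2
    while read -r first rest || [ -n "$first" ]; do
        case $first in
            ''|'#'*) continue ;;   # skip blank and comment lines
        esac
        if [ "$first" = "$ia" ]; then
            return 0
        fi
    done < "$conf"
    return 1
}
```

For example: ia_in_dat_conf "$DAPL_PROVIDER" && echo "IA name found in dat.conf".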

----- Original Message -----
From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
Date: Tuesday, April 18, 2006 12:29 pm
Subject: RE: RE: RE: [mvapich-discuss] Solaris x86

> Lei,
> 
> Something must not be moved correctly during the install step of the
> make script, and it is corrupting the executable... More than likely,
> I would suspect that a tool you're using to move the files behaves
> differently on Solaris than it does on Linux...
> 
> I've also added DAPL_PROVIDER to the ~/.bashrc and ~/.profile files.
> If I ssh from one machine to another it does get set, as evidenced by
> echo $DAPL_PROVIDER...
> 
> 
> ...output truncated....
> installed MPICH in /opt/mvapich
> /opt/mvapich/sbin/mpiuninstall may be used to remove the installation.
> Congratulations on successfully building MVAPICH. Please send your
> feedback to mvapich-help at cse.ohio-state.edu.
> bash-3.00# /opt/mvapich/bin/mpirun_rsh
> bash: /opt/mvapich/bin/mpirun_rsh: Invalid argument
> bash-3.00# file /opt/mvapich/bin/mpirun_rsh
> can't read ELF header
> /opt/mvapich/bin/mpirun_rsh:
> bash-3.00#
> 
> -----Original Message-----
> From: LEI CHAI [chai.15 at osu.edu] 
> Sent: Tuesday, April 18, 2006 12:01 PM
> To: Di Domenico, Michael
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: RE: RE: [mvapich-discuss] Solaris x86
> 
> Michael,
> 
> One small thing: please make sure to export DAPL_PROVIDER in the
> .bashrc file instead of exporting it in the current shell. Exporting it
> in the current shell does not help, and we are taking a look at it.
> 
> Also, we do not understand why you need to copy mpirun_rsh from
> mpid/udapl/process. If you run mvapich-0.9.7/make.mvapich.udapl to
> rebuild mvapich, mpirun_rsh should be generated automatically in your
> $INSTALL/bin directory. Could you just run $INSTALL/bin/mpirun_rsh
> without any argument and let us know the result?
> 
> Thanks.
> Lei
> 
> 
> ----- Original Message -----
> From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
> Date: Tuesday, April 18, 2006 11:20 am
> Subject: RE: RE: [mvapich-discuss] Solaris x86
> 
> > Lei,
> > 
> > I had a feeling you were going to say that... See my outputs below.
> > The IB card is definitely up, it's detected successfully by the
> > kernel, and I can run mvapich using IP over IB with no issues...
> > 
> > bash-3.00# tail /etc/dat/dat.conf
> > ....output truncated....
> > # IAname version threadsafe default library-path provider-version \
> > #       instance-data platform-information
> > #
> > ibd0  u1.2  nonthreadsafe  default  udapl_tavor.so.1  SUNW.1.0  " "
> > "driver_name=tavor"
> > 
> > bash-3.00# ifconfig ibd0
> > ibd0: flags=1000843<UP,BROADCAST,RUNNING,MULTICAST,IPv4> mtu 2044 index 3
> >        inet 192.168.101.41 netmask ffffff00 broadcast 192.168.101.255
> >        ipib 0:0:4:6:fe:80:0:0:0:0:0:0:0:6:6a:0:a0:0:3:a1
> > 
> > bash-3.00# ping 192.168.101.41
> > 192.168.101.41 is alive (tse41-ib)
> > bash-3.00# ping 192.168.101.42
> > 192.168.101.42 is alive (tse42-ib)
> > 
> > bash-3.00# echo $DAPL_PROVIDER
> > ibd0
> > bash-3.00#
> > 
> > -----Original Message-----
> > From: LEI CHAI [chai.15 at osu.edu] 
> > Sent: Tuesday, April 18, 2006 11:13 AM
> > To: Di Domenico, Michael
> > Cc: mvapich-discuss at cse.ohio-state.edu
> > Subject: Re: RE: [mvapich-discuss] Solaris x86
> > 
> > Michael,
> > 
> > There are several possible reasons that you see this error:
> > 
> > 1. There is no valid entry in /etc/dat/dat.conf
> > 
> > 2. There is no "export DAPL_PROVIDER=ibd0" in your .bashrc file, or
> > "source ~/.bashrc" was not done if you were already in the shell.
> > 
> > 3. InfiniBand on the node is not working properly.
> > 
> > I guess you have taken care of 1 and 2. For 3, could you do an
> > "ifconfig ibd0" and let us know the output?
> > 
> > Thanks.
> > Lei
> > 
> > 
> > ----- Original Message -----
> > From: "Di Domenico, Michael" <mdidomenico at silverstorm.com>
> > Date: Tuesday, April 18, 2006 9:24 am
> > Subject: RE: [mvapich-discuss] Solaris x86
> > 
> > > Lei,
> > > 
> > > 
> > > 
> > > I copied mpirun_rsh from the mpid/udapl/process directory, which
> > > seems to be a valid executable, and now I get
> > > 
> > > 
> > > 
> > > bash-3.00# ./mpirun -hostfile ../share/machines.udapl ./cpi
> > > 
> > > [0] Abort: cannot open IA at line 214 in file viainit.c
> > > 
> > > mpirun: executable version 0 does not match our version 3.
> > > 
> > > done.
> > > 
> > > 
> > > 
> > > ________________________________
> > > 
> > > From: lei chai [chai.15 at osu.edu] 
> > > Sent: Monday, April 17, 2006 10:14 PM
> > > To: Di Domenico, Michael; mvapich-discuss at cse.ohio-state.edu
> > > Subject: Re: [mvapich-discuss] Solaris x86
> > > 
> > > 
> > > 
> > > Michael,
> > > 
> > > 
> > > 
> > > Thanks for reporting the mpirun problem. We have now fixed it.
> > > Please go to your mvapich-0.9.7/mpid/udapl directory and rename the
> > > files mpirun.vapi.args and mpirun.vapi.in to mpirun.udapl.args and
> > > mpirun.udapl.in. Then replace "vapi" in
> > > mvapich-0.9.7/mpid/udapl/mpirun.lst with "udapl". You also need to
> > > add "export DAPL_PROVIDER=ibd0" to your .bashrc file. After
> > > rebuilding, you can run a program:
> > > 
> > > 
> > > 
> > > mpirun -n 2 -machinefile my-machine-file ./cpi
> > > 
> > > 
> > > 
> > > where my-machine-file contains host names.
> > > 
> > > 
> > > 
> > > We have never had a problem with mpirun_rsh before. Please follow
> > > Matt's suggestion and let us know the result.
> > > 
> > > 
> > > 
> > > Thanks.
> > > 
> > > Lei
> > > 
> > > 
> > > 
> > > 	----- Original Message ----- 
> > > 
> > > 	From: Di Domenico, Michael <mdidomenico at silverstorm.com>
> > > 
> > > 
> > > 	To: lei chai <chai.15 at osu.edu>;
> > > mvapich-discuss at cse.ohio-state.edu 
> > > 
> > > 	Sent: Monday, April 17, 2006 5:00 PM
> > > 
> > > 	Subject: RE: [mvapich-discuss] Solaris x86
> > > 
> > >         
> > > 
> > > 	Lei,
> > > 
> > >         
> > > 
> > > 	Thanks for the reply, but it still doesn't work...
> > > 
> > >         
> > > 
> > > 	--- first try with mpirun_rsh 
> > > 
> > >         
> > > 
> > > 	bash-3.00# /opt/mvapich/bin/mpirun_rsh -np 2 tse41-ib tse42-ib
> > > DAPL_PROVIDER="ibd0" ./cpi
> > > 
> > > 	bash: /opt/mvapich/bin/mpirun_rsh: Invalid argument
> > > 
> > >         
> > > 
> > > 	--- second try with mpirun (just to see what happens)
> > > 
> > >         
> > > 
> > > 	bash-3.00# /opt/mvapich/bin/mpirun -np 2 tse41-ib tse42-ib
> > > DAPL_PROVIDER="ibd0" ./cpi
> > > 
> > > 	Warning: Command line arguments for program should be given
> > > 
> > > 	after the program name.  Assuming that tse42-ib is a
> > > 
> > > 	command line argument for the program.
> > > 
> > > 	Warning: Command line arguments for program should be given
> > > 
> > > 	after the program name.  Assuming that DAPL_PROVIDER=ibd0 is a
> > > 
> > > 	command line argument for the program.
> > > 
> > > 	Unrecognized argument tse41-ib ignored.
> > > 
> > > 	Cannot find MPIRUN machine file for machine udapl
> > > 
> > > 	and architecture solaris86 .
> > > 
> > > 	(No device specified.)
> > > 
> > > 	bash-3.00#
> > > 
> > >         
> > > 
> > > 	
> > > ________________________________
> > > 
> > > 
> > > 	From: lei chai [chai.15 at osu.edu] 
> > > 	Sent: Monday, April 17, 2006 4:50 PM
> > > 	To: Di Domenico, Michael; mvapich-discuss at cse.ohio-state.edu
> > > 	Subject: Re: [mvapich-discuss] Solaris x86
> > > 
> > >         
> > > 
> > > 	Hi,
> > > 
> > >         
> > > 
> > > 	Thank you for trying out MVAPICH-0.9.7. Please use mpirun_rsh
> > > instead of mpirun. To use the uDAPL device, please specify an
> > > IAname, e.g.
> > > 
> > >         
> > > 
> > > 	/opt/mvapich/bin/mpirun_rsh -np 2 node1 node2
> > > DAPL_PROVIDER="IAname" ./cpi
> > > 
> > >         
> > > 
> > > 	The IAname can be found in /etc/dat/dat.conf; it is the first
> > > field.
> > > 
> > >         
> > > 
> > > 	Hope this helps.
> > > 
> > >         
> > > 
> > > 	Regards,
> > > 
> > > 	Lei
> > > 
> > >         
> > > 
> > >        	----- Original Message ----- 
> > > 
> > >        	From: Di Domenico, Michael
> > > <mdidomenico at silverstorm.com>
> > > 
> > >        	To: mvapich-discuss at cse.ohio-state.edu 
> > > 
> > >        	Sent: Monday, April 17, 2006 4:06 PM
> > > 
> > >        	Subject: [mvapich-discuss] Solaris x86
> > > 
> > >                 
> > > 
> > >        	I'm trying to get Mvapich 0.9.7 to compile and run on
> > > Solaris 10 1/06 x86 using the GNU toolset downloaded from
> > > sunfreeware.com...
> > > 
> > >                 
> > > 
> > >        	I'm attaching the outputs from ./make.mvapich.udapl.
> > > 
> > >                 
> > > 
> > >        	Everything seems to compile, but I don't ever seem to
> > > get an mpirun.udapl file...  Any clues that I missed from the make
> > > outputs?
> > > 
> > >                 
> > > 
> > >        	bash-3.00# cd /opt/mvapich/examples/
> > > 
> > >        	bash-3.00# ls
> > > 
> > >        	cpi          cpi.o        cpip.c       Makefile
> > > MPI-2-C++    README
> > > 
> > >        	cpi.c        cpilog.c     hello++.cc   Makefile.in
> > > mpirun       simpleio.c
> > > 
> > >        	bash-3.00# ./mpirun ./cpi
> > > 
> > >        	Cannot find MPIRUN machine file for machine udapl
> > > 
> > >        	and architecture solaris86 .
> > > 
> > >        	(No device specified.)
> > > 
> > >        	bash-3.00# sh -x ./mpirun ./cpi
> > > 
> > >        	....output truncated....
> > > 
> > >        	+ [ -x /opt/mvapich/bin/mpirun.udapl ] 
> > > 
> > >        	+ echo Cannot find MPIRUN machine file for machine udapl
> > > 
> > > 
> > >        	Cannot find MPIRUN machine file for machine udapl
> > > 
> > >        	+ echo and architecture solaris86 . 
> > > 
> > >        	and architecture solaris86 .
> > > 
> > >        	+ [ -n  ] 
> > > 
> > >        	+ echo (No device specified.) 
> > > 
> > >        	(No device specified.)
> > > 
> > >        	+ exit 1 
> > > 
> > >        	bash-3.00# ls /opt/mvapich/bin
> > > 
> > >        	mpiCC                 mpiman                mpirun.args
> > > mpirun_dbg.ddd        mpirun_dbg.xxgdb
> > > 
> > >        	mpicc                 mpireconfig           mpirun.vapi
> > > mpirun_dbg.gdb        mpirun_rsh
> > > 
> > >        	mpichversion          mpireconfig.dat
> > > mpirun.vapi.args      mpirun_dbg.ladebug    tarch
> > > 
> > >        	mpicxx                mpirun
> > > mpirun_dbg.dbx        mpirun_dbg.totalview  tdevice
> > > 
> > >        	
> > > ________________________________
> > > 
> > > 
> > >        	_______________________________________________
> > >        	mvapich-discuss mailing list
> > >        	mvapich-discuss at cse.ohio-state.edu
> > > 	
> > > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> > > 
> > > 
> > 
> 
> 


