[mvapich-discuss] mpirun runtime problems with ch_gen2

Guido Passet guido.passet at clustervision.com
Sat Apr 15 13:53:30 EDT 2006


Hi Sayantan and list,

On Fri, Apr 14, 2006 at 02:05:05PM -0400, Sayantan Sur wrote:
> 
> * On Apr,2 Sayantan Sur<surs at cse.ohio-state.edu> wrote :
> > Normally, mpirun too should
> > work for running programs over Gen2. We will investigate this issue and
> > get back to you. 
> 
> > * On Apr,1 Guido Passet<guido.passet at clustervision.com> wrote :
> > > I seem to be getting some strange error with mpirun in combination with 
> > > the ch_gen2 device. I am using the OSU MVAPICH 0.9.7 (03/14/06) 
> > > (Integrated MVAPICH tarball with MPICH 1.2.7 + MVICH), although I get the 
> > > exact same thing with the latest SVN checkout.
> 
> Thanks for reporting this. There was a file name change required in one
> of the setup files to enable the default `mpirun' to find the correct
> startup program.
> 
> I have checked in the fix; if you do an update on your copy, you should
> be able to use mpirun too.

I had already spotted this bug myself and changed the script involved,
but without any useful results.

I just checked out SVN revision 85, and although the path issue has been
resolved, I still run into a problem. Please see my paste below, which is an
attempt to run a simple PingPong test (Intel MPI Benchmarks).

$ mpirun -nolocal -machinefile nodes -np 2 ~/BenchMarks/imb/2.3/IMB-MPI1 PingPong

This produces the following output:

[1] Abort: Error creating CQ
 at line 225 in file viainit.c
mpirun: executable version 1 does not match our version 3.
done.

Or, when giving the full path to my binary:

[0] Abort: Error creating CQ
 at line 225 in file viainit.c
mpirun: executable version 0 does not match our version 3.
done.

When I try to use mpirun_rsh, I run into this:

mpirun_rsh -np 2 node001 node002 ~/BenchMarks/imb/2.3/IMB-MPI1 PingPong
[0] Abort: Error creating CQ
 at line 225 in file viainit.c
mpirun: executable version 1 does not match our version 3.
done.

mpirun_rsh -ssh -np 2 node001 node002 /home/guido/BenchMarks/imb/2.3/IMB-MPI1 PingPong
[0] Abort: Error creating CQ
 at line 225 in file viainit.c
mpirun: executable version 0 does not match our version 3.
done.

I get similar results with mpirun.ch_gen2. :(


I have a feeling we are running into a bug in viainit.c, but I can't
really see it. Any advice would be welcome.


With best regards,
-- 
Guido Passet            Email: guido.passet at clustervision.com

ClusterVision BV        Email support: support at clustervision.com
Nieuw-Zeelandweg 15B    Web: http://www.clustervision.com
1045 AL Amsterdam       Tel: +31 20 407 7550
The Netherlands         Fax: +31 84 759 8389

