[mvapich-discuss] mvapich-0.9.8 Bus error

Sayantan Sur surs at cse.ohio-state.edu
Thu Jan 18 17:29:00 EST 2007


Hello Rene,

> I am very new to mvapich and I am having problems getting it to run. I
> compiled mvapich-0.9.8 on our cluster, which runs the Voltaire version of the
> OFED stack. To install, I simply ran the "make.mvapich.gen2" script after
> setting IBHOME to the /usr/local/ofed directory.

Thanks for reporting this error to the list. Could you verify/answer a
couple of things for us?

1. Is the version of OFED you compiled MVAPICH against the same as the
one installed on the machines?

2. What application are you running in the 8-process job? Can you
verify whether cpi runs fine with 8 processes?
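
Since the failure seems to depend on the process count, it may also help
to confirm how many processes each node receives. The sketch below is
only an illustration in Python and is not part of MVAPICH. The helper
name slots_per_host is hypothetical, and the script assumes that each
hostfile line corresponds to one process and that lines starting with
'#' are comments.

```python
from collections import Counter


def slots_per_host(hostfile_text):
    """Count how many hostfile lines name each host.

    Assumption: one line corresponds to one process slot.
    """
    hosts = [line.strip() for line in hostfile_text.splitlines()]
    # Skip blank lines and '#' comments (assumed comment syntax).
    hosts = [h for h in hosts if h and not h.startswith("#")]
    return Counter(hosts)


# Contents of nodelist.txt exactly as posted in the report.
NODELIST = """\
compute-01-01-ib
compute-01-01-ib
compute-01-01-ib
compute-01-01-ib
compute-01-02-ib
compute-01-02-ib
compute-01-02-ib
compute-01-02-ib
"""

if __name__ == "__main__":
    for host, slots in slots_per_host(NODELIST).items():
        print(f"{host}: {slots}")
```

For the posted nodelist.txt this reports 4 entries for each of the two
hosts, which matches the "2 nodes with 4 CPUs per node" description.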

Thanks,
Sayantan.

> 
> I can run a 2 or 4 CPU job on two nodes just fine. The problem appears when
> I try to run a job with 6 or more CPUs: I get a "Bus error" message. Here is
> an 8-CPU run on 2 nodes with 4 CPUs per node.
> 
> rsalmon@login-02-01 177> mpirun_rsh -np 8 -hostfile nodelist.txt ./a.out
> /usr/X11R6/bin/xauth:  error in locking authority file
> /u00/rsalmon/.Xauthority
> /usr/X11R6/bin/xauth:  error in locking authority file
> /u00/rsalmon/.Xauthority
> /usr/X11R6/bin/xauth:  error in locking authority file
> /u00/rsalmon/.Xauthority
> /usr/X11R6/bin/xauth:  error in locking authority file
> /u00/rsalmon/.Xauthority
> /usr/X11R6/bin/xauth:  error in locking authority file
> /u00/rsalmon/.Xauthority
> /usr/X11R6/bin/xauth:  error in locking authority file
> /u00/rsalmon/.Xauthority
> /usr/X11R6/bin/xauth:  error in locking authority file
> /u00/rsalmon/.Xauthority
> Bus error
> Bus error
> Bus error
> Bus error
> Bus error
> Bus error
> Bus error
> Bus error
> 
> 
> rsalmon@login-02-01 178> cat nodelist.txt
> compute-01-01-ib
> compute-01-01-ib
> compute-01-01-ib
> compute-01-01-ib
> compute-01-02-ib
> compute-01-02-ib
> compute-01-02-ib
> compute-01-02-ib
> 
> Any ideas as to what might be wrong?
> Thank you 
> Rene
> 
> -- 
>         Rene Salmon
>         Tulane University
>         Center for Computational Science
>         http://www.ccs.tulane.edu
>         rsalmon at tulane.edu
>         Tel 504-862-8393
>         Fax 504-862-8392
> 
> 
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss

-- 
http://www.cse.ohio-state.edu/~surs

