[mvapich-discuss] mvapich-0.9.6-121 problems

Sayantan Sur surs at cse.ohio-state.edu
Fri Feb 3 13:27:15 EST 2006


Hello Michael,

* On Feb,3 Michael Li<mli at deform.com> wrote :
> I've run my application a few times and, so far,
> I have not encountered the problem. Hopefully
> that fixed it.

Very glad to know that you are now able to run MVAPICH successfully.

> Our goal is to get the best possible performance out of
> our cluster. If anything can speed it up, we would love to do it.
> Do you have any numbers showing the speed gain from
> upgrading from 1.7.0 to 1.8.1?

Since IBGD-1.7.0 is somewhat outdated, we no longer have it
installed on our systems. However, I do recall that, due to software
driver and firmware updates, the latency of short messages was reduced
by about 1.2 microseconds. So, if your applications are
latency-sensitive, you should see better performance.

Another motivation for the upgrade is Mellanox's bug fixes to the
software and firmware. You can take a look at the document titled
"IBGD_release_notes_1_8_1.pdf" in the latest IBGD release.

> Has anyone successfully integrated IBGD 1.8.1 with
> mvapich-0.9.6-121 without any problems?

May I ask you to upgrade to mvapich-0.9.6-122? I installed IBGD 1.8.1
and compiled/ran the OSU benchmarks. There were no issues.

Thanks,
Sayantan.

> 
> Best regards.
> Michael Li
> 
> 
> Sayantan Sur wrote:
> >Hi,
> >
> >* On Feb,1 Michael Li<mli at deform.com> wrote :
> >
> >>Hi,
> >>I am new to mvapich.
> >>I've downloaded mvapich-0.9.6-121 and built
> >>mvapich-0.9.6-121 with PCI-Express and SDR settings
> >>based on the spec from our hardware vendor Microway.
> >>
> >>I made some changes on make.mvapich.vapi and
> >>used it to build  mvapich-0.9.6-121, please see
> >>attached make.mvapich.vapi.
> >>
> >>The driver is
> >>Mellanox IB Gold Distribution (IBGD) v1.7.0 for Linux
> >>which comes with the box.
> >
> >
> >Thanks for your interest in MVAPICH! As you pointed out, there is a
> >later version of IBGD available (1.8.1) along with firmware updates for
> >your Mellanox cards. For optimal performance of your cluster, it's highly
> >recommended to upgrade to the latest version of IBGD.
> >
> >However, from our analysis, that's not the main source of the current
> >problem you are facing.
> >
> >
> >>When I ran my applications, I got the following problems:
> >>
> >>(1)
> >>$DEFORM3_DIR/mvapich/bin/mpirun -np 2 -hostfile
> >>$DEFORM3_DIR/mvapich/share/machines/hosts.list
> >>$DEFORM3_DIR/EXE/DEF_SIM_P4P_INFINIBAND.EXE
> >>
> >>[0] Abort: Null value returned by gethostbyname at line 1549 in file viainit.c
> >>mpirun: executable version 0 does not match our version 3.
> >
> >
> >Could you please tell us whether the hostnames in the `hosts.list' file
> >match the entries in the /etc/hosts file on each of the target nodes? In
> >addition, does the output of the `hostname' command on each node match
> >its entry in `hosts.list'?  We suspect there might be a networking setup
> >issue of that kind.
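
The check asked for above can be scripted. What follows is only a sketch,
not part of MVAPICH. It assumes the host file lists one hostname per line
(a trailing ":N" suffix and "#" comments are tolerated as a guess at the
format), and the function names read_hosts and unresolved_hosts are made up
for illustration. It reports every name the local resolver cannot look up,
which is what a failing gethostbyname call would point to.

```python
import socket


def read_hosts(path):
    """Return hostnames from a host file (assumed: one per line, '#' comments)."""
    hosts = []
    with open(path) as f:
        for line in f:
            line = line.split("#", 1)[0].strip()
            if line:
                # Keep the first token and drop an optional ":N" suffix
                # (assumed format, not confirmed by the thread).
                hosts.append(line.split()[0].split(":", 1)[0])
    return hosts


def unresolved_hosts(hosts, resolve=socket.gethostbyname):
    """Return (host, error message) pairs for names that fail to resolve."""
    failures = []
    for host in hosts:
        try:
            resolve(host)
        except OSError as exc:  # socket.gaierror and socket.herror derive from OSError
            failures.append((host, str(exc)))
    return failures


# Example use, to be run on every node in the job:
#   names = read_hosts("hosts.list") + [socket.gethostname()]
#   for host, err in unresolved_hosts(names):
#       print(host, err)
```

An empty result on every node would suggest that name resolution is not the
culprit; any reported name is a candidate for a missing /etc/hosts entry.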
> >
> >
> >>(2)
> >>$DEFORM3_DIR/mvapich/bin/mpirun -np 5 -hostfile 
> >>$DEFORM3_DIR/mvapich/share/machines/hosts.list 
> >>$DEFORM3_DIR/EXE/DEF_SIM_P4P_INFINIBAND.EXE
> >>
> >>gethostbyname: No address associated with name
> >
> >
> >This might be the same problem as the one above.
> >
> >
> >>(3)
> >>$DEFORM3_DIR/mvapich/bin/mpirun -np 6 -hostfile 
> >>$DEFORM3_DIR/mvapich/share/machines/hosts.list 
> >>$DEFORM3_DIR/EXE/DEF_SIM_P4P_INFINIBAND.EXE
> >>
> >>Permission denied.
> >
> >
> >I'm not sure why you are getting a "Permission denied" message. It is
> >not caused by MVAPICH as such. Could you please check the permissions on
> >the executable?
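
One way to inspect those permissions is sketched below. This is an
illustration, not an MVAPICH tool. The helper name exec_bits is
hypothetical, and the path in the usage comment is simply the one from the
mpirun command line above.

```python
import os
import stat


def exec_bits(path):
    """Return (mode string, True if any execute bit is set) for a file."""
    mode = os.stat(path).st_mode
    return stat.filemode(mode), bool(mode & 0o111)


# Example use on a compute node (path taken from the mpirun command above):
#   exe = os.path.expandvars("$DEFORM3_DIR/EXE/DEF_SIM_P4P_INFINIBAND.EXE")
#   print(exec_bits(exe), os.access(exe, os.X_OK))
```

The os.access call in the usage comment additionally reports whether the
current user may execute the file, which the mode bits alone do not show.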
> >
> >
> >>I know that IBGD-1.8.1 is available, but I am not sure
> >>whether I should update to it.
> >
> >
> >If you can, that would be best.
> >
> >Thanks,
> >Sayantan.
> >
> >
> 

-- 
http://www.cse.ohio-state.edu/~surs

