[mvapich-discuss] mvapich-0.9.6-121 problems

Michael Li mli at deform.com
Fri Feb 3 10:34:16 EST 2006


Hi, Sayantan

Thank you very much for your reply.

I've checked /etc/hosts on all nodes and
found some inconsistencies. Our hardware vendor,
Microway, also pointed to the same problem.

I've fixed /etc/hosts, copied it to all nodes,
and rebooted them.

I've run my application a few times since then and
have not encountered the problem again. Hopefully
that fixed it.
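
For anyone hitting the same gethostbyname errors, the check comes down to confirming that every name in hosts.list resolves on every node. Below is a minimal Python sketch of that check, not anything shipped with MVAPICH. It assumes one host per line in hosts.list, optionally written as "host:ncpus" and optionally followed by extra fields or "#" comments. The helper names parse_hostfile and unresolved_hosts are made up for illustration.

```python
import socket
import sys


def parse_hostfile(text):
    """Return the hostnames listed in a hostfile, one per line.

    Assumption: the first whitespace-separated field of each line is the
    host, possibly written as "host:ncpus"; blank lines and "#" comments
    are skipped.
    """
    hosts = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            hosts.append(line.split()[0].split(":")[0])
    return hosts


def unresolved_hosts(hosts, resolve=socket.gethostbyname):
    """Return (host, error) pairs for every name that fails to resolve."""
    failures = []
    for host in hosts:
        try:
            resolve(host)
        except OSError as exc:  # socket.gaierror and herror subclass OSError
            failures.append((host, str(exc)))
    return failures


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_hosts.py /path/to/hosts.list
    with open(sys.argv[1]) as fh:
        names = parse_hostfile(fh.read())
    print("local hostname:", socket.gethostname())
    for host, err in unresolved_hosts(names):
        print("cannot resolve %s: %s" % (host, err))
```

Running it on each node in turn, and comparing the printed local hostname against the entries in hosts.list, should surface the kind of /etc/hosts inconsistency described above.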

 > there is a later version of IBGD available (1.8.1)
 > along with firmware updates for your Mellanox cards.
 > For optimal performance of your cluster, it's highly
 > recommended to upgrade to the latest version of IBGD.

Our goal is to get the best possible performance out of
our cluster, so we are happy to do anything that speeds it up.
Do you have any numbers showing the speed gain from
upgrading from 1.7.0 to 1.8.1?
Has anyone successfully integrated IBGD 1.8.1 with
mvapich-0.9.6-121 without any problems?

Best regards.
Michael Li


Sayantan Sur wrote:
> Hi,
> 
> * On Feb,1 Michael Li<mli at deform.com> wrote :
> 
>>Hi,
>>I am new to mvapich.
>>I've downloaded mvapich-0.9.6-121 and built
>>mvapich-0.9.6-121 with PCI-Express and SDR settings
>>based on the spec from our hardware vendor Microway.
>>
>>I made some changes to make.mvapich.vapi and
>>used it to build mvapich-0.9.6-121; please see the
>>attached make.mvapich.vapi.
>>
>>The driver is
>>Mellanox IB Gold Distribution (IBGD) v1.7.0 for Linux
>>which comes with the box.
> 
> 
> Thanks for your interest in MVAPICH! As you pointed out, there is a
> later version of IBGD available (1.8.1) along with firmware updates for
> your Mellanox cards. For optimal performance of your cluster, it's highly
> recommended to upgrade to the latest version of IBGD.
> 
> However, from our analysis, that's not the main source of the current
> problem you are facing.
> 
> 
>>When I ran my applications, I got the following problems:
>>
>>(1)
>> $DEFORM3_DIR/mvapich/bin/mpirun -np 2 -hostfile 
>>$DEFORM3_DIR/mvapich/share/machines/hosts.list 
>>$DEFORM3_DIR/EXE/DEF_SIM_P4P_INFINIBAND.EXE
>>
>>[0] Abort: Null value returned by gethostbyname at line 1549 in file viainit.c
>>mpirun: executable version 0 does not match our version 3.
> 
> 
> Could you please tell us if the hostnames in the `hosts.list' file match
> the /etc/hosts file on each of the target nodes? In addition, does
> the output of the `hostname' command match the entry in `hosts.list'? We
> suspect that there might be a networking setup issue of that kind.
> 
> 
>>(2)
>> $DEFORM3_DIR/mvapich/bin/mpirun -np 5 -hostfile 
>>$DEFORM3_DIR/mvapich/share/machines/hosts.list 
>>$DEFORM3_DIR/EXE/DEF_SIM_P4P_INFINIBAND.EXE
>>
>>gethostbyname: No address associated with name
> 
> 
> This one may be the same problem as the one above.
> 
> 
>>(3)
>>$DEFORM3_DIR/mvapich/bin/mpirun -np 6 -hostfile 
>>$DEFORM3_DIR/mvapich/share/machines/hosts.list 
>>$DEFORM3_DIR/EXE/DEF_SIM_P4P_INFINIBAND.EXE
>>
>>Permission denied.
> 
> 
> I'm not sure why you are getting a "Permission denied" message. It is
> not caused by MVAPICH as such. Could you please check the permissions on the
> executable?
> 
> 
>>I know that IBGD-1.8.1 is available, but I am not sure
>>whether I should update to it.
> 
> 
> If you can, that would be best.
> 
> Thanks,
> Sayantan.
> 
> 
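
On the "Permission denied." message in (3): the suggestion to check the permissions on the executable can be scripted if there are many nodes to cover. This is only a rough Python sketch, and the function name describe_executable is illustrative. It reports whether a path exists and is executable by the user running the script, which may or may not turn out to be the cause in this case.

```python
import os
import stat


def describe_executable(path):
    """Report whether `path` exists and is executable by the current user."""
    if not os.path.exists(path):
        return "missing"
    if not os.path.isfile(path):
        return "not a regular file"
    if not os.access(path, os.X_OK):
        # Show the mode bits in ls -l style, e.g. -rw-r--r--
        mode = stat.filemode(os.stat(path).st_mode)
        return "not executable (mode %s)" % mode
    return "ok"
```

It would need to be run on each node, as the same user that launches mpirun, against the path of the executable passed to mpirun.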
