[mvapich-discuss] Re: Re: benchmark osu_bws run failed, on mvapich2-2.0rc1: gethostbyname: Unknown server error
Wang,Yanfei(SYS)
wangyanfei01 at baidu.com
Mon Mar 31 11:05:56 EDT 2014
Hi,
Result:
The mpiexec run also fails. Output from all three attempts follows.
1. mpiexec
[root at bb-nsi-ib04 pt2pt]# mpiexec -n 2 -f hosts_mvapich osu_bw
[proxy:0:1 at bb-nsi-ib04.*com] HYDU_create_process (./utils/launch/launch.c:75): execvp error on file osu_bw (No such file or directory)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 255
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
[proxy:0:0 at bb-nsi-ib03*.com] HYDU_create_process (./utils/launch/launch.c:75): execvp error on file osu_bw (No such file or directory)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= EXIT CODE: 255
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
2. mpirun_rsh with RoCE parameter
[root at bb-nsi-ib04 pt2pt]# mpirun_rsh -np 2 --hostfile hosts_mvapich MV2_USE_RoCE=1 osu_latency
gethostbyname: Unknown server error
[bb-nsi-ib04.*com:mpirun_rsh][child_handler] Error in init phase, aborting! (0/2 mpispawn connections)
gethostbyname: Unknown server error
[root at bb-nsi-ib04 pt2pt]#
3. mpirun_rsh
[root at bb-nsi-ib04 pt2pt]# mpirun_rsh -np 2 --hostfile hosts_mvapich osu_latency
gethostbyname: Unknown server error
[bb-nsi-ib04.*com:mpirun_rsh][child_handler] Error in init phase, aborting! (0/2 mpispawn connections)
gethostbyname: Unknown server error
[root at bb-nsi-ib04 pt2pt]#
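Since every mpirun_rsh attempt above stops at "gethostbyname: Unknown server error", one thing that can be checked independently of MPI is whether each name involved resolves at all. Note that the error prefix shows the node's full host name (bb-nsi-ib04.*com), while the /etc/hosts quoted below lists only ib03 and ib04; whether that mismatch is the actual cause is not confirmed in this thread. The following is a minimal Python sketch, not part of MVAPICH2. The helper names parse_hostfile, check_resolution and report are made up for illustration, and the hostfile is assumed to contain one "host:count" entry per line as in hosts_mvapich.

```python
import socket


def parse_hostfile(text):
    """Return the host names from hostfile text, dropping ':N' process counts."""
    hosts = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        hosts.append(line.split(":", 1)[0])
    return hosts


def check_resolution(names, resolver=socket.gethostbyname):
    """Map each name to its resolved address, or to None if the lookup fails."""
    results = {}
    for name in names:
        try:
            results[name] = resolver(name)
        except OSError:  # socket.gaierror and socket.herror are OSError subclasses
            results[name] = None
    return results


def report(hostfile_path):
    """Print the lookup result for every hostfile entry and the local host name."""
    with open(hostfile_path) as f:
        names = parse_hostfile(f.read())
    names.append(socket.gethostname())  # the node's own name, as the OS reports it
    for name, addr in check_resolution(names).items():
        print(f"{name}: {addr if addr else 'DOES NOT RESOLVE'}")
```

Calling report("hosts_mvapich") on each node, including the launch node, would show whether ib03, ib04 and the node's own host name all resolve there.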
BR
Thanks
Yanfei
-----Original Message-----
From: Jonathan Perkins [mailto:perkinjo at cse.ohio-state.edu]
Sent: March 31, 2014 22:53
To: Wang,Yanfei(SYS)
Cc: Jonathan Perkins; mvapich-discuss
Subject: Re: Re: [mvapich-discuss] benchmark osu_bws run failed, on mvapich2-2.0rc1: gethostbyname: Unknown server error
Before debugging further, I would like to know whether the following works for you...
mpiexec -n 2 -f hosts_mvapich osu_bw
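A side note on a command of this form: osu_bw is given as a bare name, and an execvp-style launch looks up a name without a slash on PATH rather than in the current directory. A quick way to see how a given name would be treated is sketched below in Python. This is only an illustration under that assumption; explain_lookup is a hypothetical helper, not part of MVAPICH2 or Hydra, and it checks the local node only.

```python
import os
import shutil


def explain_lookup(program, path=None):
    """Describe how an execvp-style lookup would treat `program`.

    `path` is an optional PATH-style string; None means the current
    environment's PATH.
    """
    if os.sep in program:
        # A name containing a slash is used as a path; PATH is not searched.
        return f"{program}: used as a path, exists={os.path.exists(program)}"
    found = shutil.which(program, path=path)
    if found:
        return f"{program}: found on PATH at {found}"
    return f"{program}: not on PATH; try ./{program} or an absolute path"
```

For example, print(explain_lookup("osu_bw")) run from the pt2pt directory would report whether the bare name can be found. The binary would still need to exist at the same location on every node listed in the hostfile.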
On Mon, Mar 31, 2014 at 10:12 AM, Wang,Yanfei(SYS) <wangyanfei01 at baidu.com> wrote:
> Hi,
>
>
>
> Each node in the cluster has the same /etc/hosts, which looks like this:
>
> [root at bb-nsi-ib04 pt2pt]# cat /etc/hosts
>
> 192.168.71.3 ib03
>
> 192.168.71.4 ib04
>
> Currently, only 2 nodes are available in the RoCE cluster: ib03 and ib04.
>
>
>
> BR
>
>
>
> Thanks
>
> Yanfei
>
>
>
>
>
> From: Jonathan Perkins [mailto:perkinjo at cse.ohio-state.edu]
> Sent: March 31, 2014 21:40
> To: Wang,Yanfei(SYS)
> Cc: mvapich-discuss
> Subject: Re: [mvapich-discuss] benchmark osu_bws run failed, on mvapich2-2.0rc1:
> gethostbyname: Unknown server error
>
>
>
> Can you share the contents of the /etc/hosts file from each machine
> including the machine that you launch from?
>
> On Mar 31, 2014 9:33 AM, "Wang,Yanfei(SYS)" <wangyanfei01 at baidu.com> wrote:
>
> Hi,
>
>
>
> I am new to MPI and am trying to verify the MVAPICH2 library over
> RoCE, using mvapich2-2.0rc1 on
> MLNX_OFED_LINUX-2.1-1.0.6-rhel6.3-x86_64.
>
>
>
> Could you give me some tips for fixing the following issue?
>
>
>
> Configuration:
>
> [root at bb-nsi-ib04 pt2pt]# cat hosts_mvapich
>
> ib03:1
>
> ib04:1
>
> [root at bb-nsi-ib04 pt2pt]# cat /etc/hosts
>
> 192.168.71.3 ib03
>
> 192.168.71.4 ib04
>
>
>
> ERROR:
>
> [root at bb-nsi-ib04 pt2pt]# mpirun_rsh -np 2 --hostfile hosts_mvapich
> osu_bw
>
> gethostbyname: Unknown server error
>
> [bb-nsi-ib04.*.com:mpirun_rsh][child_handler] Error in init phase, aborting!
> (0/2 mpispawn connections)
>
> gethostbyname: Unknown server error
>
> [root at bb-nsi-ib04 pt2pt]#
>
>
>
> This could be caused by a wrong configuration. I have previously run
> the same verification with OpenMPI on this platform, using the same
> RoCE configuration and a similar host configuration.
>
>
>
> Thanks.
>
> -Yanfei
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
--
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo