[mvapich-discuss] MPI INIT error

Hoot Thompson hoot at ptpnow.com
Sat Apr 6 11:06:54 EDT 2013


Here's the mpirun_rsh...

[jhthomps at rh64-1-ib ~]$ /usr/local/other/utilities/mvapich2/bin/mpirun_rsh -n 2 rh64-1-ib rh64-3-ib /usr/local/other/utilities/mvapich2/libexec/osu-micro-benchmarks/osu_bw
[cli_0]: aborting job:
Fatal error in MPI_Init:
Other MPI error

[rh64-1-ib:mpispawn_0][child_handler] MPI process (rank: 0, pid: 7781) exited with status 1
[rh64-1-ib:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 8. MPI process died?
[rh64-1-ib:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
[cli_1]: aborting job:
Fatal error in MPI_Init:
Other MPI error

[rh64-3-ib:mpispawn_1][readline] Unexpected End-Of-File on file descriptor 7. MPI process died?
[rh64-3-ib:mpispawn_1][mtpmi_processops] Error while reading PMI socket. MPI process died?
[rh64-3-ib:mpispawn_1][child_handler] MPI process (rank: 0, pid: 7410) exited with status 1





On 04/06/2013 10:18 AM, Devendar Bureddy wrote:
> Hi Hoot
>
> Can you configure MVAPICH2 with the additional flags:
> "--enable-fast=none --enable-g=dbg" to see if it shows better
> error info than "Other MPI error"?
>
> Can you also give it a try with mpirun_rsh?
>
> syntax:    ./mpirun_rsh -n 2  rh64-1-ib rh64-3-ib ./osu_bw
>
> -Devendar
>
>
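
(A minimal sketch of the debug rebuild suggested above, assuming the MVAPICH2 source tree is still available and reusing the install prefix visible in the paths earlier in the thread; the configure line actually used for the original build is not shown, so treat this only as an illustration.)

    # Rebuild MVAPICH2 for more verbose errors: --enable-fast=none keeps
    # runtime error checking enabled, --enable-g=dbg adds debug symbols.
    cd /path/to/mvapich2-src            # source tree location not given in the thread
    ./configure --prefix=/usr/local/other/utilities/mvapich2 \
                --enable-g=dbg --enable-fast=none
    make && make install

    # Then re-run the benchmark against the rebuilt library:
    /usr/local/other/utilities/mvapich2/bin/mpirun_rsh -n 2 rh64-1-ib rh64-3-ib \
        /usr/local/other/utilities/mvapich2/libexec/osu-micro-benchmarks/osu_bw
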
> On Sat, Apr 6, 2013 at 10:00 AM, Hoot Thompson <hoot at ptpnow.com> wrote:
>
>     I've been down this path before and I believe I've taken care of
>     my usual oversights. Here's the background: it's a RHEL6.4 setup
>     using the distro IB modules (not an OFED download). I'm trying to
>     run the micro benchmarks and I'm getting (debug output attached) ...
>
>     =====================================================================================
>     =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>     =   EXIT CODE: 256
>     =   CLEANING UP REMAINING PROCESSES
>     =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>     =====================================================================================
>     [proxy:0:1 at rh64-3-ib] got pmi command (from 4): init pmi_version=1 pmi_subversion=1
>     [proxy:0:1 at rh64-3-ib] PMI response: cmd=response_to_init pmi_version=1 pmi_subversion=1 rc=0
>     [proxy:0:1 at rh64-3-ib] got pmi command (from 4): get_maxes
>
>     [proxy:0:1 at rh64-3-ib] PMI response: cmd=maxes kvsname_max=256 keylen_max=64 vallen_max=1024
>     [proxy:0:1 at rh64-3-ib] got pmi command (from 4): get_appnum
>
>     [proxy:0:1 at rh64-3-ib] PMI response: cmd=appnum appnum=0
>     [proxy:0:1 at rh64-3-ib] got pmi command (from 4): get_my_kvsname
>
>     [proxy:0:1 at rh64-3-ib] PMI response: cmd=my_kvsname kvsname=kvs_4129_0
>     [proxy:0:1 at rh64-3-ib] got pmi command (from 4): get_my_kvsname
>
>     [proxy:0:1 at rh64-3-ib] PMI response: cmd=my_kvsname kvsname=kvs_4129_0
>     [proxy:0:1 at rh64-3-ib] got pmi command (from 4): get kvsname=kvs_4129_0 key=PMI_process_mapping
>     [proxy:0:1 at rh64-3-ib] PMI response: cmd=get_result rc=0 msg=success value=(vector,(0,2,1))
>     [cli_1]: aborting job:
>     Fatal error in MPI_Init:
>     Other MPI error
>
>
>
>
>     =====================================================================================
>     =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>     =   EXIT CODE: 256
>     =   CLEANING UP REMAINING PROCESSES
>     =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>     =====================================================================================
>
>
>     Here's the output of ulimit on both ends (configured in limits.conf)
>     [jhthomps at rh64-1-ib ~]$  ulimit -l
>     unlimited
>     [root at rh64-3-ib jhthomps]# ulimit -l
>     unlimited
>
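
(The limits.conf entries themselves are not quoted in the thread; entries along the following lines would give every user the "unlimited" memlock values shown above. This is only an illustrative sketch, not the actual file from either node.)

    # /etc/security/limits.conf -- illustrative entries only
    # <domain>  <type>  <item>    <value>
    *           soft    memlock   unlimited
    *           hard    memlock   unlimited
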
>     Firewalls are down and I think the /etc/hosts files are right.
>
>     Suggestions?
>
>     Thanks,
>
>     Hoot
>
>
>
>
>
>     _______________________________________________
>     mvapich-discuss mailing list
>     mvapich-discuss at cse.ohio-state.edu
>     http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
>
>
> -- 
> Devendar


