[mvapich-discuss] problem with running mvapich

biswajit at crlindia.com biswajit at crlindia.com
Wed Jun 18 01:22:11 EDT 2008


Hi Lei,

This is working.
MVAPICH2 is supposed to detect the active port automatically. Why is it not doing so?





LEI CHAI <chai.15 at osu.edu> 
06/18/2008 03:03 AM

To: biswajit at crlindia.com
cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] problem with running mvapich






Hi,
 
MVAPICH2 is supposed to detect the active port automatically for you. 
Could you try the following options:
 
$ mpiexec -n 2 -env MV2_IBA_HCA mthca1 -env MV2_DEFAULT_PORT 1 ./a.out
 
and see if it works for you?
 
Lei
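
For readers who hit the same error, the manual selection above can be scripted. The following is a minimal sketch, not part of MVAPICH2: it scans `ibstat`-style text for ports whose state is Active and assembles the same mpiexec options shown in the reply. The function names `find_active_ports` and `build_mpiexec_command` are hypothetical helpers, and the sample text is an abbreviated copy of the `ibstat` output quoted in this thread. It assumes the output layout matches that quote.

```python
import re

# Abbreviated copy of the ibstat output quoted in this thread.
SAMPLE = """\
CA 'mthca0'
        Number of ports: 1
        Port 1:
                State: Down
                Physical state: Polling
CA 'mthca1'
        Number of ports: 1
        Port 1:
                State: Active
                Physical state: LinkUp
"""


def find_active_ports(ibstat_text):
    """Return a list of (ca_name, port_number) pairs whose State is Active."""
    active = []
    ca = None
    port = None
    for raw in ibstat_text.splitlines():
        # Tolerate leading whitespace and email quote markers ("> ").
        line = raw.lstrip("> ").strip()
        m = re.match(r"CA '([^']+)'", line)
        if m:
            ca, port = m.group(1), None
            continue
        m = re.match(r"Port (\d+):", line)
        if m:
            port = int(m.group(1))
            continue
        # Exact match, so "Physical state: ..." lines are not counted.
        if ca is not None and port is not None and line == "State: Active":
            active.append((ca, port))
    return active


def build_mpiexec_command(ca, port, nprocs=2, program="./a.out"):
    """Build the argument list for the mpiexec invocation from the reply."""
    return [
        "mpiexec", "-n", str(nprocs),
        "-env", "MV2_IBA_HCA", ca,
        "-env", "MV2_DEFAULT_PORT", str(port),
        program,
    ]


if __name__ == "__main__":
    ca, port = find_active_ports(SAMPLE)[0]
    print(" ".join(build_mpiexec_command(ca, port)))
```

Run against the sample, this prints the command from the reply: `mpiexec -n 2 -env MV2_IBA_HCA mthca1 -env MV2_DEFAULT_PORT 1 ./a.out`. On a real node, the text could come from running `ibstat` (for example through `subprocess.run`) rather than from the embedded sample.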


----- Original Message -----
From: biswajit at crlindia.com
Date: Tuesday, June 17, 2008 6:55 am
Subject: [mvapich-discuss] problem with running mvapich
To: mvapich-discuss at cse.ohio-state.edu


> When I ran a simple MPI application with mvapich2-1.0.2, I got the following error messages:

  
> Unknown Mellanox PCI-Express HCA best guess as Mellanox PCI-Express SDR 
> [3] Abort: Not enough ports are in active stateneeded active ports 1 
>  at line 424 in file rdma_iba_priv.c 
> rank 3 in job 1  n23_32790   caused collective abort of all ranks 
>   exit status of rank 3: return code 252 

> But there is an active port on each node. See the 'ibstat' output below.


> CA 'mthca0' 
>         CA type: MT25204 
>         Number of ports: 1 
>         Firmware version: 1.1.0 
>         Hardware version: a0 
>         Node GUID: 0x0019bbfffff70cb8 
>         System image GUID: 0x0019bbfffff70cbb 
>         Port 1: 
>                 State: Down 
>                 Physical state: Polling 
>                 Rate: 10 
>                 Base lid: 0 
>                 LMC: 0 
>                 SM lid: 0 
>                 Capability mask: 0x02510a68 
>                 Port GUID: 0x0019bbfffff70cb9 
> CA 'mthca1' 
>         CA type: MT25204 
>         Number of ports: 1 
>         Firmware version: 1.1.0 
>         Hardware version: a0 
>         Node GUID: 0x0019bbfffff7fbe8 
>         System image GUID: 0x0019bbfffff7fbeb 
>         Port 1: 
>                 State: Active 
>                 Physical state: LinkUp 
>                 Rate: 20 
>                 Base lid: 226 
>                 LMC: 0 
>                 SM lid: 117 
>                 Capability mask: 0x02510a68 
>                 Port GUID: 0x0019bbfffff7fbe9 

> And whenever I run the same job on nodes with IB port 1 active, it works properly.
> Is there any option in MVAPICH to select which IB port should be used?
