[mvapich-discuss] OSC's mpiexec local rank

Doug Johnson djohnson at osc.edu
Sun Mar 24 18:12:12 EDT 2013


Hi Brody,

This looks normal to me; it just appears that the scheduler allocates
nodes in reverse order (the default order for the Moab scheduler).  If
you want explicit MPI rank assignment to nodes, you can use a config
file with mpiexec.  To reverse the order you're seeing on the NICS
system:

   sort $PBS_NODEFILE | sed 's/.*/& : .\/myexe/' > config
   mpiexec -config config
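
On the allocation from your output below (three ranks per node), that
should produce a config file along these lines.  I'm guessing at the
exact hostname form here; use whatever appears in your $PBS_NODEFILE:

   kid076 : ./myexe
   kid076 : ./myexe
   kid076 : ./myexe
   kid077 : ./myexe
   ...
   kid080 : ./myexe

Ranks 0-2 would then land on kid076, ranks 3-5 on kid077, and so on,
which reverses the placement you're seeing now.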

The order of entries in the mpiexec config file will correspond to MPI
rank order.  See the mpiexec man page for more details on config file
syntax.  If there are other issues, we should probably move this to
the mpiexec mailing list, mpiexec at osc.edu.

Best,
Doug


At Sun, 24 Mar 2013 11:17:55 -0700,
Brody Huval wrote:
> 
> Hi Doug,
> 
> Thanks for the reply. I believe I have used PMI_ID before and it worked. However, I just switched to
> a different cluster and am now seeing this behavior:
> 
> Hello World from process 0 with PMI_ID=0 running on kid080.nics.utk.edu
> Hello World from process 1 with PMI_ID=1 running on kid080.nics.utk.edu
> Hello World from process 9 with PMI_ID=9 running on kid076.nics.utk.edu
> Hello World from process 2 with PMI_ID=2 running on kid080.nics.utk.edu
> Hello World from process 10 with PMI_ID=10 running on kid076.nics.utk.edu
> Hello World from process 11 with PMI_ID=11 running on kid076.nics.utk.edu
> Hello World from process 3 with PMI_ID=3 running on kid078.nics.utk.edu
> Hello World from process 6 with PMI_ID=6 running on kid077.nics.utk.edu
> Hello World from process 4 with PMI_ID=4 running on kid078.nics.utk.edu
> Hello World from process 7 with PMI_ID=7 running on kid077.nics.utk.edu
> Hello World from process 5 with PMI_ID=5 running on kid078.nics.utk.edu
> Hello World from process 8 with PMI_ID=8 running on kid077.nics.utk.edu
> 
> From this simple program:
> 
> #include <stdio.h>
> #include <stdlib.h>
> #include <mpi.h>
> 
> int main(int argc, char *argv[]) {
>         int rank, ntasks, length, local_rank;
>         char hostname[MPI_MAX_PROCESSOR_NAME];      /* Processor name buffer */
>         char *pmi_id = getenv("PMI_ID");            /* Set by OSC mpiexec */
>         local_rank = pmi_id ? atoi(pmi_id) : -1;    /* -1 if PMI_ID is unset */
> 
>         MPI_Init(&argc, &argv);                     /* Initialize MPI */
>         MPI_Comm_size(MPI_COMM_WORLD, &ntasks);     /* Get number of tasks */
>         MPI_Comm_rank(MPI_COMM_WORLD, &rank);       /* Get rank of this process */
>         MPI_Get_processor_name(hostname, &length);  /* Get name of this processor */
>         printf("Hello World from process %d with PMI_ID=%d running on %s\n",
>                rank, local_rank, hostname);
>         MPI_Finalize();                             /* Terminate MPI */
>         return 0;
> }
> 
> I am just using "mpiexec <program>" within a submit script. Any idea why this would happen? Thanks for
> your time.
> 
> Best,
> Brody
> 
> On Mar 24, 2013, at 4:58 AM, Doug Johnson <djohnson at osc.edu> wrote:
> 
>     At Sat, 23 Mar 2013 23:26:36 -0700,
>     Brody Huval wrote:
> 
>         Hi,
>        
>         Is there an equivalent to MV2_COMM_WORLD_LOCAL_RANK when using OSC's mpiexec with
>         PBS/Torque? If not, does anyone have suggestions or code to get one?
> 
>     Hi Brody,
>    
>     The OSC mpiexec PMI_ID environment variable is equivalent.
>    
>     Best,
>     Doug
> 
> 

