[mvapich-discuss] How to keep gid status
Satoshi Isono
isono at cray.com
Wed Jun 24 03:42:50 EDT 2009
Hello Bill,
Thank you for your help. I understand that I can quote the command together with its input file name. The command line below ran successfully:
mpirun_rsh -np 4 -hostfile hostfile '"./a.out ./data"'
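To confirm that a run like this really produced files with the intended group, the ownership can be checked from a script instead of reading ls -l output by eye. This is only a minimal sketch, assuming GNU coreutils stat (the -c %G format) is available; check_group is a hypothetical helper name, not part of MVAPICH.

```shell
#!/bin/sh
# Hypothetical helper: succeed when FILE's group owner name equals
# EXPECTED_GROUP. Assumes GNU coreutils `stat -c %G` (typical on Linux).
check_group() {
    actual=$(stat -c %G "$1") || return 2   # stat failed (e.g. no such file)
    if [ "$actual" = "$2" ]; then
        echo "OK: $1 is owned by group $actual"
    else
        echo "MISMATCH: $1 is owned by group $actual, expected $2"
        return 1
    fi
}

# Possible use after a job, comparing against the current group:
#   check_group testfile_com-0644_0 "$(id -gn)"
```

The function returns 0 on a match, 1 on a mismatch, and 2 when stat itself fails, so it can be used directly in an if statement in a run script.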
Regards,
Satoshi Isono
-----Original Message-----
From: Bill Barth [mailto:bbarth at tacc.utexas.edu]
Sent: Tuesday, June 23, 2009 11:43 PM
To: Satoshi Isono; Dhabaleswar Panda
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: RE: [mvapich-discuss] How to keep gid status
Satoshi,
You can always quote the command. Compare
rusk(2)$ /usr/bin/sg bbarth echo foo
which is broken in the manner you suggest, to
rusk(3)$ /usr/bin/sg bbarth "echo foo"
foo
which does the right thing.
Of course, once you bring quoting into the mix, you'll have to be more careful b/c the user might have put quotes on his command line.
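One way to handle a command line that already contains spaces or quotes is to re-quote every argument before joining them into the single string that sg expects. The following is only a sketch: quote_arg and quote_command are hypothetical helper names, not part of MVAPICH or sg, and they cover just one level of shell parsing. A launcher that re-parses the string again (for example through a remote shell) may need a further level of quoting.

```shell
#!/bin/sh
# Hypothetical helpers (not part of MVAPICH or sg): re-quote arguments so
# they survive one more round of shell parsing.

# Wrap one argument in single quotes; an embedded ' becomes '\''.
# Limitation: trailing newlines in the argument are lost.
quote_arg() {
    printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Join all arguments into one string, each argument quoted separately.
# The body runs in a subshell so the loop variables stay local.
quote_command() (
    out=""
    for arg in "$@"; do
        out="${out}${out:+ }$(quote_arg "$arg")"
    done
    printf '%s\n' "$out"
)

# Example: the second argument contains a space but stays one word.
quote_command ./a.out "my data"    # prints: './a.out' 'my data'
```

Under these assumptions, a wrapper could pass the result as the single command argument, along the lines of /usr/bin/sg "$group" "$(quote_command "$@")".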
Bill.
--
Bill Barth, Ph.D., Assistant Director, HPC (interim)
bbarth at tacc.utexas.edu | Phone: (512) 232-7069
Office: ROC 1.405 | Fax: (512) 475-9445
> -----Original Message-----
> From: Satoshi Isono [mailto:isono at cray.com]
> Sent: Tuesday, June 23, 2009 3:00 AM
> To: Dhabaleswar Panda; Bill Barth
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: RE: [mvapich-discuss] How to keep gid status
>
> Dear Prof. Panda, Dr. Barth,
>
> Thanks for your advice. I have edited mpirun_rsh.c directly in order to
> keep the order of options on the mpirun_rsh command line. The changed
> lines are shown below. My MVAPICH2 version is mvapich2-1.2p1.
>
> $ diff mpirun_rsh.c mpirun_rsh.c.org
> 67d66
> < #include <grp.h>
> 260d258
> < struct group *grpptr;
> 275,276d272
> < int sg_index;
> <
> 279d274
> < grpptr = getgrgid(getgid());
> 417,456d411
> < //isono
> < for (i = aout_index; i < argc; i++) {
> < if (strchr(argv[i], '=') == NULL) {
> < sg_index = i;
> < break;
> < }
> < }
> < fprintf(stdout, "\n# INPUT PARAMETERS\n");
> < fprintf(stdout, "%15s = %d\n", "argc", argc);
> < fprintf(stdout, "%15s = %d\n", "option_index", option_index);
> < fprintf(stdout, "%15s = %d\n", "aout_index", aout_index);
> < fprintf(stdout, "%15s = %d\n", "sg_index", sg_index);
> < for (i = 0; i < argc; i++) {
> < fprintf(stdout, "%7sargv[%2d] = %s\n", " ", i, argv[i]);
> < }
> < char add_argv[argc+2][31];
> < for (i = 0; i < argc; i++) {
> < strcpy(add_argv[i], argv[i]);
> < }
> < for (i = 0; i < argc+2; i++) {
> < if (i < sg_index) {
> < argv[i]=add_argv[i];
> < } else if (i == sg_index) {
> < strcpy(argv[i], "/usr/bin/sg");
> < i++;
> < argv[i] = grpptr->gr_name;
> < } else {
> < argv[i]=add_argv[i-2];
> < }
> < }
> < argc = argc + 2;
> < fprintf(stdout, "\n# RUNNING PARAMETERS\n");
> < fprintf(stdout, "%15s = %d\n", "argc", argc);
> < fprintf(stdout, "%15s = %d\n", "option_index", option_index);
> < fprintf(stdout, "%15s = %d\n", "aout_index", aout_index);
> < fprintf(stdout, "%15s = %d\n", "sg_index", sg_index);
> < for (i = 0; i < argc; i++) {
> < fprintf(stdout, "%7sargv[%2d] = %s\n", " ", i, argv[i]);
> < }
> <
>
> This is the test result using the new mpirun_rsh.
>
> $ mpirun_rsh -np 4 -hostfile hostfile MV2_NUM_HCAS=2 MV2_SM_SCHEDULING=ROUND_ROBIN ./gid-mv2-itl
>
> # INPUT PARAMETERS
> argc = 8
> option_index = 3
> aout_index = 5
> sg_index = 7
> argv[ 0] = mpirun_rsh
> argv[ 1] = -np
> argv[ 2] = 4
> argv[ 3] = -hostfile
> argv[ 4] = hostfile
> argv[ 5] = MV2_NUM_HCAS=2
> argv[ 6] = MV2_SM_SCHEDULING=ROUND_ROBIN
> argv[ 7] = ./gid-mv2-itl
>
> # RUNNING PARAMETERS
> argc = 10
> option_index = 3
> aout_index = 5
> sg_index = 7
> argv[ 0] = mpirun_rsh
> argv[ 1] = -np
> argv[ 2] = 4
> argv[ 3] = -hostfile
> argv[ 4] = hostfile
> argv[ 5] = MV2_NUM_HCAS=2
> argv[ 6] = MV2_SM_SCHEDULING=ROUND_ROBIN
> argv[ 7] = /usr/bin/sg
> argv[ 8] = GAUSSIAN
> argv[ 9] = ./gid-mv2-itl
>
> However, there is a problem. When we embed the /usr/bin/sg command in the
> mpirun_rsh command line, how can we deal with an input file? The same
> problem occurs with your wrapper script.
>
> Here is a simple example with test code.
>
> 1) mpirun_rsh -np 4 -hostfile hostfile ./gid3 ./data
>
> The following is the gid3 code.
>
> $ cat gid3.c
> #include <stdio.h>
> #include <mpi.h>
> #include <string.h>
> #define MAX_DATA_SIZE 1000000
> double a[MAX_DATA_SIZE];
>
> int main(int argc,char *argv[])
> {
> int rank,size,namelen;
> char name[MPI_MAX_PROCESSOR_NAME],comm[512];
> int i,ret,dsize;
> char str[80];
> FILE *fp;
>
> MPI_Init(&argc,&argv);
>
> MPI_Comm_rank(MPI_COMM_WORLD,&rank);
> MPI_Comm_size(MPI_COMM_WORLD,&size);
> MPI_Get_processor_name(name,&namelen);
>
> //printf("%4d/%-d: %s\n",rank,size,name);
> fp=fopen(argv[1],"r");
> for(i=0;i<MAX_DATA_SIZE;i++){
> if(fgets(str,80,fp)==NULL) break;
> ret=sscanf(str,"%lf",&a[i]);
> fprintf(stdout,"%s_%d: data = %lf\n",name,rank,a[i]);
> }
> fclose(fp);
> dsize=i;
> fprintf(stdout,"n = %d\n",i);
>
> sprintf(comm,"touch testfile_%s_%d",name,rank);
> system(comm);
>
> MPI_Finalize();
> return 0;
> }
>
> When I try to run this code using the new mpirun_rsh, it fails with the
> errors below.
>
> $ mpirun_rsh -np 4 -hostfile hostfile ./gid3 ./data
>
> # INPUT PARAMETERS
> argc = 7
> option_index = 3
> aout_index = 5
> sg_index = 5
> argv[ 0] = mpirun_rsh
> argv[ 1] = -np
> argv[ 2] = 4
> argv[ 3] = -hostfile
> argv[ 4] = hostfile
> argv[ 5] = ./gid3
> argv[ 6] = ./data
>
> # RUNNING PARAMETERS
> argc = 9
> option_index = 3
> aout_index = 5
> sg_index = 5
> argv[ 0] = mpirun_rsh
> argv[ 1] = -np
> argv[ 2] = 4
> argv[ 3] = -hostfile
> argv[ 4] = hostfile
> argv[ 5] = /usr/bin/sg
> argv[ 6] = GAUSSIAN
> argv[ 7] = ./gid3
> argv[ 8] = ./data
> MPI process terminated unexpectedly
> Exit code -5 signaled from com-0643
> cleanupKilling remote processes...DONE
> Signal 15 received.
>
> As additional information, I was able to run it in the other ways shown
> below. Both (2) and (3) require editing the user's source code.
>
> 2) mpirun_rsh -np 4 -hostfile hostfile ./gid4 < ./data
> 3) mpirun_rsh -np 4 -hostfile hostfile INPUT_FILENAME=data ./gid5
>
> In (2), I passed the input file on stdin. In (3), the source gid5.c
> calls getenv("INPUT_FILENAME"), and I set this environment variable
> on the mpirun_rsh option line.
>
> Sorry for the long explanation. My question is how to handle case (1).
> Do you have any ideas for solving it? I think it is NOT good to modify
> each user's code.
>
> Please let me know if you have any advice.
>
> Best regards,
> Satoshi Isono
>
> -----Original Message-----
> From: Dhabaleswar Panda [mailto:panda at cse.ohio-state.edu]
> Sent: Wednesday, June 17, 2009 4:35 AM
> To: Satoshi Isono
> Cc: mvapich-discuss at cse.ohio-state.edu; Bill Barth; Dhabaleswar Panda
> Subject: Re: [mvapich-discuss] How to keep gid status
>
> Hi,
>
> Thanks for your note. I shared your question with Dr. Bill Barth from
> TACC. Folks from TACC have been using MVAPICH with mpirun_rsh in their
> production environment on Ranger for quite some time. I am including his
> reply below. I hope his suggested approach will work for you. Let us know.
>
> I am cc'ing Dr. Barth on this e-mail also. If there are any additional
> questions, the two of you might exchange additional information on this
> issue.
>
> Thanks,
>
> DK
>
> ====================================================================
>
> As you may recall, we have wrapper scripts that we use on Ranger and
> Lonestar to hide the details of the mpirun_rsh command line from the
> users. We call it 'ibrun'. It interacts with the scheduler (through the
> environment) to generate the host list and establish the number of tasks
> to start. I don't see why it would be hard to add a call to /usr/bin/sg
> in there.
>
> If the user would have invoked
>
> mpirun_rsh -np 5 -hostfile hosts ./foo
>
> he simply runs
>
> ibrun ./foo
>
> on Ranger or Lonestar. 'ibrun' is basically structured as:
>
> #!/bin/bash
> ....find NP from the environment....
> ....find host list....
> $MPICH_HOME/bin/mpirun_rsh -np $NP -hostfile $HOSTFILE "$@"
>
> So it just takes the command line args of ibrun and passes them directly
> to mpirun_rsh.
>
> There's no reason it couldn't do
>
> #!/bin/bash
> ....find NP....
> ....find host list....
> GROUP_ID=`id -gn`
> $MPICH_HOME/bin/mpirun_rsh -np $NP -hostfile $HOSTFILE /usr/bin/sg $GROUP_ID "$@"
>
> It should be this straightforward.
>
> Bill.
>
> ======
>
>
> On Sun, 14 Jun 2009, Satoshi Isono wrote:
>
> > Dear all,
> >
> > I would like to know how to preserve the GID when launching MPI
> > processes. We know that adding the sg command to the mpirun_rsh command
> > line works in this case. Could you please advise me? An example is shown
> > below.
> >
> > Most users belong to multiple groups, and the accounting system is
> > managed based on group ID (GID). So all files created by each user must
> > be owned by the appropriate group.
> >
> > The problem is that the GID state is not preserved. Here is an example;
> > please read it in numbered order.
> >
> > 1) User logs in to a login node.
> >
> > $ id
> > uid=1002(craysp) gid=1002(cray)
> > groups=10(wheel),1002(cray),8001(GAUSSIAN)
> >
> > This shows that the default gid is 1002(cray). "cray" is the primary
> > group.
> >
> > 2) User changes to an arbitrary group with the newgrp command.
> >
> > $ newgrp GAUSSIAN
> > $ id
> > uid=1002(craysp) gid=8001(GAUSSIAN)
> > groups=10(wheel),1002(cray),8001(GAUSSIAN)
> >
> > In this case the user wants to change to another group, "GAUSSIAN".
> > I confirmed that the gid changed from cray to GAUSSIAN.
> >
> > 3) User runs an MPI job with mpirun_rsh
> >
> > This is a simple MPI code that generates an output file.
> >
> > $ cat gid.c
> > #include <stdio.h>
> > #include <mpi.h>
> > #include <string.h>
> >
> > int main(int argc,char *argv[])
> > {
> > int rank,size,namelen;
> > char name[MPI_MAX_PROCESSOR_NAME],comm[512];
> >
> > MPI_Init(&argc,&argv);
> >
> > MPI_Comm_rank(MPI_COMM_WORLD,&rank);
> > MPI_Comm_size(MPI_COMM_WORLD,&size);
> > MPI_Get_processor_name(name,&namelen);
> >
> > sprintf(comm,"touch testfile_%s_%d",name,rank);
> > system(comm);
> >
> > MPI_Finalize();
> > return 0;
> > }
> >
> > After running this code, I want the output file to be owned by the
> > "GAUSSIAN" group, but that is not what happened. Below is the run
> > script that calls mpirun_rsh.
> >
> > $ cat run_i.sh
> > #!/bin/bash
> > . /opt/Modules/init/bash
> > module load pgi mvapich2/pgi
> > mpirun_rsh -np 1 com-0644 ./gid-mv2
> >
> > 4) User confirms that the created file is not owned by the appropriate group.
> >
> > $ ls -l testfile_com-0644_0
> > -rw-r--r-- 1 craysp cray 0 Jun 8 17:50 testfile_com-0644_0
> >
> > You can see that this file is owned by "cray", not "GAUSSIAN". I think
> > this problem is caused by the mpirun_rsh command or the SSH server
> > configuration.
> >
> > 5) A way to solve it.
> >
> > I think a better way is to insert the "sg" command just before a.out
> > on the mpirun_rsh command line. Here is an example.
> >
> > $ grep mpirun_rsh run_i.sh
> > mpirun_rsh -np 1 com-0644 /usr/bin/sg `id -gn` ./gid-mv2
> >
> > With the sg command specified just before a.out, it works well.
> >
> > $ ls -l testfile_com-0644_0
> > -rw-r--r-- 1 craysp GAUSSIAN 0 Jun 8 18:33 testfile_com-0644_0
> >
> > 6) Request to you
> >
> > At first I thought of creating a wrapper script for mpirun_rsh, but it
> > is difficult to locate the executable on the command line. Users write
> > the mpirun_rsh line in various patterns. For example:
> >
> > mpirun_rsh -np 2048 -hostfile hosts.txt ./a.out Inputfile | tee -a Outputfile
> > mpirun_rsh -np 256 -hostfile hostlist ./a.out input >> log
> > mpirun_rsh -np 8 -hostfile hostfile MV2_ENABLE_AFFINITY=0 MV2_NUM_HCAS=4 ./numarun_mv2.sh ./a.out
> > ...
> >
> > Now take a look at line 1607.
> >
> > 1607 /* add the arguments */
> > 1608 for (i = aout_index + 1; i < argc; i++) {
> > 1609 strcat(command_name, " ");
> > 1610 strcat(command_name, argv[i]);
> > 1611 }
> >
> > An example of edit:
> >
> > 1607 /* add the arguments */
> > 1608 strcat(command_name, " /usr/bin/sg $(id -gn)");
> > 1609 for (i = aout_index + 1; i < argc; i++) {
> > 1610 strcat(command_name, " ");
> > 1611 strcat(command_name, argv[i]);
> > 1612 }
> >
> > I made the edit shown above and recompiled, but it did not take effect.
> > If you know another way to solve this problem, could you please tell
> > me?
> >
> > Best regards,
> > Satoshi Isono
> >
> >
> > _______________________________________________
> > mvapich-discuss mailing list
> > mvapich-discuss at cse.ohio-state.edu
> > http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >