[mvapich-discuss] mpirun_rsh: Unable to get host entry

Jonathan Perkins perkinjo at cse.ohio-state.edu
Thu Jan 5 14:51:31 EST 2012


Mark, thank you for your report and debugging effort.  Can you try
applying the following patch (attached as well) and let us know if it
resolves the problem?  Thanks in advance.

Index: src/pm/mpirun/mpispawn.c
===================================================================
--- src/pm/mpirun/mpispawn.c	(revision 5128)
+++ src/pm/mpirun/mpispawn.c	(working copy)
@@ -181,6 +181,7 @@
 int setup_global_environment()
 {
     char my_host_name[MAX_HOST_LEN + MAX_PORT_LEN];
+    char tmp[MAX_HOST_LEN + 1];

     int i = env2int("MPISPAWN_GENERIC_ENV_COUNT");

@@ -190,13 +191,15 @@
     setenv("MV2_NUM_NODES_IN_JOB", getenv("MPISPAWN_NNODES"), 1);

     /* Ranks now connect to mpispawn */
-    int rv = gethostname(my_host_name, MAX_HOST_LEN);
+    int rv = gethostname(tmp, MAX_HOST_LEN);
+    tmp[MAX_HOST_LEN] = '\0';
+
     if ( rv == -1 ) {
         PRINT_ERROR_ERRNO("gethostname() failed", errno);
         return -1;
     }

-    sprintf(my_host_name, "%s:%d", my_host_name, c_port);
+    sprintf(my_host_name, "%s:%d", tmp, c_port);

     setenv("PMI_PORT", my_host_name, 2);
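
For illustration only, here is a standalone sketch of the pattern the patch
applies: read the hostname into its own buffer, terminate it explicitly, then
format "host:port" into a different buffer so that source and destination never
overlap. The names format_host_port, local_host_port and HOST_LEN are
hypothetical and are not part of MVAPICH2. The sketch uses snprintf() rather
than the sprintf() in the patch so that truncation can be detected.

```c
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define HOST_LEN 256   /* hypothetical size, not MVAPICH2's MAX_HOST_LEN */

/* Format "host:port" into out. The caller must pass a host buffer that does
 * not overlap out. Returns 0 on success, -1 on error or truncation. */
static int format_host_port(char *out, size_t out_len,
                            const char *host, int port)
{
    int n = snprintf(out, out_len, "%s:%d", host, port);
    return (n < 0 || (size_t)n >= out_len) ? -1 : 0;
}

/* Look up the local hostname in a separate temporary buffer, then format it
 * together with the port into out. Returns 0 on success, -1 on failure. */
static int local_host_port(char *out, size_t out_len, int port)
{
    char host[HOST_LEN + 1];

    if (gethostname(host, HOST_LEN) == -1) {
        return -1;
    }
    /* gethostname() may leave the result unterminated if it was truncated. */
    host[HOST_LEN] = '\0';

    return format_host_port(out, out_len, host, port);
}
```

A caller would pass a destination of at least HOST_LEN plus room for the colon,
the port digits and the terminator, and treat a -1 return as a failure rather
than exporting a possibly empty or truncated value.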



On Thu, Jan 5, 2012 at 2:16 PM, Mark Debbage <mark.debbage at qlogic.com> wrote:
> I hit the same problem as described here:
>
>  http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2011-July/003452.html
>
> This appears to be due to the hostname being set to the empty string
> in the PMI_PORT environment variable. I tracked this down using strace,
> and I think this is an MVAPICH2 bug. In this code in ./src/pm/mpirun/mpispawn.c :
>
> void setup_global_environment()
> {
>    char my_host_name[MAX_HOST_LEN + MAX_PORT_LEN];
>
>    int i = env2int("MPISPAWN_GENERIC_ENV_COUNT");
>
>    setenv("MPIRUN_MPD", "0", 1);
>    setenv("MPIRUN_NPROCS", getenv("MPISPAWN_GLOBAL_NPROCS"), 1);
>    setenv("MPIRUN_ID", getenv("MPISPAWN_MPIRUN_ID"), 1);
>    setenv("MV2_NUM_NODES_IN_JOB", getenv("MPISPAWN_NNODES"), 1);
>
>    /* Ranks now connect to mpispawn */
>    gethostname(my_host_name, MAX_HOST_LEN);
>
>    sprintf(my_host_name, "%s:%d", my_host_name, c_port);
>
> The sprintf() writes its result into my_host_name, and also reads its %s argument
> from my_host_name. A sprintf() implementation may well write a NUL character into
> its destination before processing its arguments, leading to an empty hostname. This
> practice is specifically outlawed in the man page for the glibc sprintf():
>
> DESCRIPTION
>       C99  and  POSIX.1-2001  specify  that  the  results are undefined if a call to sprintf(), snprintf(), vsprintf(), or vsnprintf() would cause copying to take place between
>       objects that overlap (e.g., if the target string array and one of the supplied input arguments refer to the same buffer).  See NOTES.
>
> NOTES
>       Some programs imprudently rely on code such as the following
>
>           sprintf(buf, "%s some further text", buf);
>
>       to append text to buf.  However, the standards explicitly note that the results are undefined if source and destination buffers overlap when calling  sprintf(),  snprintf(),
>       vsprintf(), and vsnprintf().  Depending on the version of gcc(1) used, and the compiler options employed, calls such as the above will not produce the expected results.
>
>       The glibc implementation of the functions snprintf() and vsnprintf() conforms to the C99 standard, that is, behaves as described above, since glibc version 2.1.  Until glibc
>       2.0.6 they would return -1 when the output was truncated.
>
> Mark.
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo
-------------- next part --------------
A non-text attachment was scrubbed...
Name: sprintf.patch
Type: text/x-patch
Size: 917 bytes
Desc: not available
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20120105/49cafb6e/sprintf-0001.bin