[mvapich-discuss] Patch for bug in psminit.c in mvapich-1.2.0

Mike Heinz michael.heinz at qlogic.com
Thu Oct 20 09:19:28 EDT 2011


Some of our testers have reported problems when running jobs on fabrics with 8-core processors. One of our other developers diagnosed the issue as a timeout during connection setup and found that the timeout value passed to psm_ep_connect is set to only 20 nanoseconds. He has provided me with a patch that increases it to 50 seconds. I hope you find it useful:

*** mpid/ch_psm/psminit.c       Tue Oct 18 15:24:12 2011
--- mpid/ch_psm/psminit.c       Tue Oct 18 15:24:14 2011
***************
*** 141,147 ****
      psm_uuid_t uuid; /* an array of 16 bytes */
      psm_epid_t my_epid;
      psm_epid_t *epid_list;
!     uint64_t timeout = 20;
      psm_error_t *errors;
      char temp_str[100];
      char temp_str1[100];
--- 141,147 ----
      psm_uuid_t uuid; /* an array of 16 bytes */
      psm_epid_t my_epid;
      psm_epid_t *epid_list;
!     uint64_t timeout = 50 * SEC_IN_NS;
      psm_error_t *errors;
      char temp_str[100];
      char temp_str1[100];
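
For readers who want to check the arithmetic, here is a minimal sketch of the unit conversion the patch relies on. It assumes SEC_IN_NS means "nanoseconds per second"; the patch uses that name but does not show its definition, so the value below is an assumption, and seconds_to_ns is a hypothetical helper that is not part of mvapich or PSM. The ULL suffix is there so the multiplication is done in 64 bits, since 50 * 1000000000 does not fit in a 32-bit int.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: SEC_IN_NS is the number of nanoseconds in one second.
 * The patch references this name without showing its definition. */
#ifndef SEC_IN_NS
#define SEC_IN_NS 1000000000ULL
#endif

/* Hypothetical helper (not part of mvapich or PSM): convert whole
 * seconds into a nanosecond count suitable for a 64-bit timeout. */
static inline uint64_t seconds_to_ns(uint64_t seconds)
{
    /* Guard against wrap-around in the 64-bit multiplication. */
    assert(seconds <= UINT64_MAX / SEC_IN_NS);
    return seconds * SEC_IN_NS;
}
```

With this definition, seconds_to_ns(50) yields 50,000,000,000 ns, whereas the original initializer of 20 asks for a connection timeout of just 20 ns.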




