[mvapich-discuss] MV2_USE_BLOCKING

Alex Breslow abreslow at cs.ucsd.edu
Mon Dec 16 17:44:44 EST 2013

Hi there,

I find that setting MV2_USE_BLOCKING=1 causes an MPI program that I am
writing to behave nondeterministically. When run without this flag, the
program runs to completion every time. However, when the flag is set, the
program finishes correctly only about 10% of the time.
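
For reference, the two configurations can be compared with something like
the fragment below. This is only a sketch: it assumes the mpirun_rsh
launcher, and the process count, the hostfile path (./hosts), and the binary
name (./controller_worker) are placeholders rather than my actual values.

```shell
# Placeholders (not from my real setup): NP, ./hosts, ./controller_worker.
NP=4

# Default mode (flag not set): this run completes every time.
mpirun_rsh -np "$NP" -hostfile ./hosts ./controller_worker

# Blocking mode: this run finishes correctly only about 10% of the time.
mpirun_rsh -np "$NP" -hostfile ./hosts MV2_USE_BLOCKING=1 ./controller_worker
```

With mpirun_rsh, environment variables are given as NAME=value pairs just
before the executable, so the only difference between the two runs is the
MV2_USE_BLOCKING setting.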

Specifically, I am designing a distributed system that is built on top of
MPI.  The program has a single controller process (rank = 0) and a number
of worker processes.  The controller process cyclically broadcasts
instructions to the worker processes at regular intervals.

I am using MVAPICH2 version 1.9 and PGI compiler version 13.2 on the Gordon

Currently, I use the code below.


int broadcastMessage(int action, int jobID, char* executableName,
  char* jobName){
  MSG_Payload msg;
  msg.action = action; // We always need an action.
  /* Some more stuff that I have omitted for clarity */

  int root = 0;
  int msgCount = 1;

  // Controller side of the collective; return the MPI error code to the caller.
  return MPI_Bcast(&msg, msgCount, PayloadType, root, MPI_COMM_WORLD);
}


int postReceiveBroadcast(){
  // TODO: Encapsulate code up to variable `code' in MPI specific
  //       implementation
  int root = 0;
  int msgCount = 1;
  MSG_Payload msg;
  cout << "Posting broadcast receive\n";
  MPI_Bcast(&msg, msgCount, PayloadType, root, MPI_COMM_WORLD);

  /* Some more stuff that I have omitted for clarity */

  return ret_code;
}

The controller manages to reach MPI_Finalize without incident, but the
workers don't wake up after their final MPI_Bcast, so they never terminate.
The workers call MPI_Bcast about 2 minutes before the controller does.

Let me know if I am missing anything or you need more information.

Many thanks,

Alex Breslow
PhD student in computer science at UC San Diego
Email: abreslow at cs.ucsd.edu
Website: cseweb.ucsd.edu/~abreslow