[mvapich-discuss] MV2_USE_BLOCKING
Alex Breslow
abreslow at cs.ucsd.edu
Tue Jun 10 15:36:28 EDT 2014
Hi Khaled,
I finally figured out what the problem was: I was not
setting MV2_ON_DEMAND_THRESHOLD high enough. An earlier post to this list (
http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/2009-September.txt)
reported the same deadlock when this environment variable was not set
appropriately with MV2_USE_BLOCKING=1. My problems went away once I set
MV2_ON_DEMAND_THRESHOLD to a value greater than or equal to the total number
of MPI tasks on my portion of the cluster. Since I was running multiple
MPI programs over a shared set of nodes, this meant the sum of the tasks
across all MPI programs using the MVAPICH2 library.
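
For reference, the fix on our end amounts to a launch line along these
lines. The task count, threshold, and hostfile below are placeholders, and
we pass the variables through mpirun_rsh, so adjust for your launcher:

  # e.g., two 64-task programs sharing the same nodes, so threshold >= 128
  mpirun_rsh -np 64 -hostfile hosts MV2_USE_BLOCKING=1 \
      MV2_ON_DEMAND_THRESHOLD=128 ./my_app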
I am sorry for not responding sooner; it took me until now to figure out
what was going on. By the way, does MV2_ON_DEMAND_THRESHOLD still default
to 64?
-Alex
On Tue, Dec 17, 2013 at 9:49 AM, khaled hamidouche
<khaledhamidouche at gmail.com> wrote:
> Hi Alex,
>
> I tried a simple program that broadcasts an integer on our local cluster
> with MV2-1.9 and PGI 13.2, and I'm not able to reproduce your error. I
> enabled blocking mode and also oversubscribed the nodes with twice as many
> processes as cores.
>
> Can you please give us more information:
> 1) How many nodes, cores, and processes did you use for your run?
> 2) If possible, can you share a reproducer?
>
> Thanks.
>
>
> On Mon, Dec 16, 2013 at 5:44 PM, Alex Breslow <abreslow at cs.ucsd.edu>
> wrote:
>
>> Hi there,
>>
>> I find that setting MV2_USE_BLOCKING=1 causes an MPI program I am
>> writing to behave nondeterministically. When run without this flag, the
>> program runs to completion every time. However, when the flag is set, the
>> program finishes correctly only about 10% of the time.
>>
>> Specifically, I am designing a distributed system built on top of MPI.
>> The program has a single controller process (rank 0) and a number of
>> worker processes. The controller broadcasts instructions to the workers
>> at regular intervals.
>>
>> I am using MVAPICH2 version 1.9 and PGI compiler version 13.2 on the
>> Gordon Supercomputer.
>>
>> Currently, I use the code below.
>>
>> Controller:
>>
>> int broadcastMessage(int action, int jobID, char* executableName,
>>                      char* jobName){
>>   MSG_Payload msg;
>>   msg.action = action; // We always need an action.
>>   switch(action){
>>     /* Some more stuff that I have omitted for clarity */
>>   }
>>   int root = 0;
>>   int msgCount = 1;
>>
>>   // Collective call: every rank in MPI_COMM_WORLD must reach this.
>>   MPI_Bcast(&msg, msgCount, PayloadType, root, MPI_COMM_WORLD);
>>   return 0; // success
>> }
>>
>> Worker:
>>
>> int postReceiveBroadcast(){
>>   // TODO: Encapsulate code up to variable `ret_code' in an MPI-specific
>>   // implementation
>>   int root = 0;
>>   int msgCount = 1;
>>   MSG_Payload msg;
>>   cout << "Posting broadcast receive\n";
>>   // Blocks until the root's matching MPI_Bcast delivers the payload.
>>   MPI_Bcast(&msg, msgCount, PayloadType, root, MPI_COMM_WORLD);
>>
>>   /* Some more stuff that I have omitted for clarity */
>>
>>   return ret_code;
>> }
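>>
>> For context, PayloadType is a user-defined MPI datatype committed for
>> MSG_Payload during setup. Below is a minimal sketch of one way such a
>> type can be built; the two fields shown are placeholders, since the real
>> struct has more members:
>>
>> #include <mpi.h>
>> #include <cstddef> // offsetof
>>
>> typedef struct {
>>   int action; // placeholder fields; the real MSG_Payload has more
>>   int jobID;
>> } MSG_Payload_sketch;
>>
>> MPI_Datatype PayloadType;
>>
>> void buildPayloadType(){
>>   int blockLengths[2] = {1, 1};
>>   MPI_Aint displacements[2] = {
>>     offsetof(MSG_Payload_sketch, action),
>>     offsetof(MSG_Payload_sketch, jobID)
>>   };
>>   MPI_Datatype types[2] = {MPI_INT, MPI_INT};
>>   MPI_Type_create_struct(2, blockLengths, displacements, types,
>>                          &PayloadType);
>>   MPI_Type_commit(&PayloadType);
>> }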
>>
>> The controller manages to get to MPI_Finalize without incident, but the
>> workers don't wake up after their final MPI_Bcast, so they never terminate.
>> The workers call MPI_Bcast about two minutes before the controller does.
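>>
>> In case a reproducer helps, the pattern boils down to the sketch below.
>> It substitutes MPI_INT for PayloadType and a fixed round count for the
>> real control loop, so it is a simplification rather than the actual
>> program:
>>
>> #include <mpi.h>
>> #include <unistd.h> // sleep
>> #include <cstdio>
>>
>> int main(int argc, char** argv){
>>   MPI_Init(&argc, &argv);
>>   int rank;
>>   MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>>   for(int round = 0; round < 5; ++round){
>>     int action = round;        // stand-in for the MSG_Payload broadcast
>>     if(rank == 0) sleep(120);  // controller broadcasts at ~2 min intervals
>>     MPI_Bcast(&action, 1, MPI_INT, 0, MPI_COMM_WORLD);
>>     if(rank != 0) printf("rank %d received action %d\n", rank, action);
>>   }
>>   MPI_Finalize();
>>   return 0;
>> }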
>>
>> Let me know if I am missing anything or if you need more information.
>>
>> Many thanks,
>> Alex
>>
>> --
>> Alex Breslow
>> PhD student in computer science at UC San Diego
>> Email: abreslow at cs.ucsd.edu
>> Website: cseweb.ucsd.edu/~abreslow
>>
>
>
> --
> K.H
>
--
Alex Breslow
PhD student in computer science at UC San Diego
Email: abreslow at cs.ucsd.edu
Website: cseweb.ucsd.edu/~abreslow