[mvapich-discuss] MVAPICH error question

Jonathan Perkins perkinjo at cse.ohio-state.edu
Mon Dec 6 15:25:41 EST 2010


Great to hear!
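
For anyone who finds this thread in the archive later: the check on the
provided level discussed below could be sketched roughly like this.  It is
only a sketch, not code from Martin's project.  thread_level_ok and
report_thread_level are made-up helper names, and the arguments are plain
ints standing in for the required value passed to MPI_Init_thread and the
provided value it writes back.  It leans on the MPI-2.2 statement that the
four level constants are monotonic (MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED
< MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE).

```c
#include <stdio.h>

/* Hypothetical helper (not part of MPI): returns 1 when the thread level
 * granted by the library is at least the level the application asked for.
 * Assumes the MPI-2.2 ordering of the thread level constants, so a plain
 * integer comparison is enough. */
int thread_level_ok(int required, int provided)
{
    return provided >= required;
}

/* Hypothetical helper: prints a message on a shortfall and returns nonzero
 * so the caller can decide to abort. */
int report_thread_level(int required, int provided)
{
    if (!thread_level_ok(required, provided)) {
        fprintf(stderr,
                "requested thread level %d, but the library provided %d\n",
                required, provided);
        return 1;
    }
    return 0;
}
```

In a real program this would be called right after
MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided), for example
as report_thread_level(MPI_THREAD_MULTIPLE, provided), followed by MPI_Abort
if it returns nonzero.  It is also worth rebuilding every object file after
changing the requested level, since a stale build seems to have been the
culprit here.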

On Mon, Dec 6, 2010 at 2:38 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
> Thanks for your help.  I think my Makefiles didn't pick up my change to MPI_THREAD_MULTIPLE.  Things seem to be running smoothly now.  Thanks for all your help.
>
> Martin
>
> On Dec 6, 2010, at 1:09 PM, Jonathan Perkins wrote:
>
>> First, I suggest trying mvapich2-1.6rc1 to see if your error still
>> occurs.  If so, it would be good to reduce the problem to a small snippet
>> that reproduces it and send that to the list.  It's possible
>> that there is some bug unrelated to threading as well.
>>
>> On Mon, Dec 6, 2010 at 1:54 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>>> Hi Jonathan,
>>>        Thanks for the documentation clarification; I was a bit confused about that.  My program does require the MPI_THREAD_MULTIPLE level, since calls can occur simultaneously.  I am checking the level reported by MPI_Init_thread and making sure it is MPI_THREAD_MULTIPLE, so I think I am getting the right level.  Originally I had the level set to MPI_THREAD_SERIALIZED, but I then ran my program with MPI_THREAD_MULTIPLE and got the same error as in my previous email.  I will rebuild and rerun to double-check everything, but assuming the error still occurs, are there other plausible reasons for this behavior?  I am using mvapich2 version 1.2.  I know this is a bit outdated, but hopefully not too outdated.  Thanks for all the help.
>>>
>>> Martin
>>>
>>> On Dec 6, 2010, at 12:23 PM, Jonathan Perkins wrote:
>>>
>>>> Hi Martin.
>>>>
>>>> Below is a snippet of documentation related to MPI_Init_thread.
>>>> http://www.mpi-forum.org/docs/mpi22-report/node260.htm#Node260
>>>>
>>>> int MPI_Init_thread(int *argc, char *((*argv)[]), int required, int *provided)
>>>>
>>>> MPI_THREAD_SINGLE
>>>> Only one thread will execute.
>>>>
>>>> MPI_THREAD_FUNNELED
>>>> The process may be multi-threaded, but the application must
>>>> ensure that only the main thread makes MPI calls (for the definition of
>>>> main thread, see MPI_IS_THREAD_MAIN on page 386).
>>>>
>>>> MPI_THREAD_SERIALIZED
>>>> The process may be multi-threaded, and multiple threads may
>>>> make MPI calls, but only one at a time: MPI calls are not made concurrently from
>>>> two distinct threads (all MPI calls are "serialized").
>>>>
>>>> MPI_THREAD_MULTIPLE
>>>> Multiple threads may call MPI, with no restrictions.
>>>>
>>>> When passing in one of the different thread levels, you're telling the
>>>> MPI implementation what level of support is required based on the way
>>>> your code is designed.  If you have multiple threads making MPI calls
>>>> simultaneously you must have MPI_THREAD_MULTIPLE support.  If your
>>>> threads have locks that ensure that only one thread is making a call
>>>> at a time, then you can use MPI_THREAD_SERIALIZED support.
>>>>
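
[Inline note for archive readers: the "threads have locks" case above could
look roughly like the following.  This is a sketch under assumptions, not
anything from MVAPICH2 or from Martin's code.  with_mpi_lock, fake_mpi_call,
worker and run_demo are hypothetical names, and fake_mpi_call is a stand-in
that only bumps a counter so the sketch runs without an MPI installation.
In a real program the callback would wrap an actual MPI call.]

```c
#include <pthread.h>

#define CALLS_PER_THREAD 1000
#define MAX_THREADS 16

/* One process-wide lock; assumes every thread routes its MPI calls
 * through with_mpi_lock(). */
static pthread_mutex_t mpi_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hypothetical wrapper: runs fn(arg) while holding the lock, so at most
 * one thread is inside a call at any time. */
int with_mpi_lock(int (*fn)(void *), void *arg)
{
    pthread_mutex_lock(&mpi_lock);
    int rc = fn(arg);            /* a real fn would make the MPI call here */
    pthread_mutex_unlock(&mpi_lock);
    return rc;
}

/* Stand-in for an MPI call: increments a shared counter. */
static int fake_mpi_call(void *arg)
{
    int *counter = arg;
    *counter += 1;
    return 0;
}

/* Each worker makes CALLS_PER_THREAD serialized calls. */
static void *worker(void *arg)
{
    for (int i = 0; i < CALLS_PER_THREAD; i++)
        with_mpi_lock(fake_mpi_call, arg);
    return NULL;
}

/* Starts nthreads workers, waits for them, and returns the counter.
 * Returns -1 if nthreads is out of range. */
int run_demo(int nthreads)
{
    pthread_t tids[MAX_THREADS];
    int counter = 0;
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;
    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tids[i], NULL, worker, &counter) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return counter;
}
```

[One caveat with this pattern: a blocking call made while holding the lock
keeps every other thread out of MPI until it returns, which can stall or
deadlock the application, so nonblocking calls are often easier to combine
with a lock like this.]
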
>>>> It sounds like you might need MPI_THREAD_MULTIPLE support since you
>>>> mentioned that there is no locking going on.  Do you have multiple
>>>> threads making MPI calls or only one?  Are you checking the return
>>>> value from MPI_Init_thread (the provided parameter)?  I suspect that
>>>> you may not be getting the thread level that your app requires.
>>>> Can you let us know what version of mvapich2 you are using?
>>>>
>>>> Let me know if this information helps or if you have any further questions.
>>>>
>>>> On Mon, Dec 6, 2010 at 12:12 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>>>>> Hello,
>>>>> My name is Martin Schatz and I am currently working on a C++ project that
>>>>> uses the mvapich2 library.  I have been running into some problems while
>>>>> running my code and cannot for the life of me figure out what is going on.
>>>>> I was hoping that maybe someone on this list could help out.  If this is
>>>>> the wrong mailing list, please refer me to the correct one.
>>>>> For background, the project I am working on requires a thread-safe
>>>>> implementation of MPI as multiple threads (pthreads are used) are making MPI
>>>>> calls at the same time.  Currently, I have the level of thread safety set to
>>>>> MPI_THREAD_SERIALIZED, though I have tried the same code with the higher
>>>>> MPI_THREAD_MULTIPLE safety level (but I do not think the MULTIPLE level is
>>>>> necessary as I am ok with the calls occurring in some serial fashion).  When
>>>>> I run my program using mpirun with debugging turned on, I see no
>>>>> indication that anything is wrong.  However, when I run the program in a
>>>>> large parallel environment (TACC machines), the following error occurs:
>>>>> Exit code -5 signaled from i182-111.ranger.tacc.utexas.edu
>>>>> Killing remote processes...current bytes -56, total bytes 0, remote id 11
>>>>> Assertion failed in file ch3_smp_progress.c at line 2328:
>>>>> s_current_bytes[vc->smp.local_nodes] == 0
>>>>> internal ABORT - process 12MPI process terminated unexpectedly
>>>>> DONE
>>>>> This error does not occur in a specific place during the execution
>>>>> (everything seems to be working fine when it occurs).  I was wondering if
>>>>> anyone on this list could explain what this error means and perhaps some
>>>>> tips on tracking down the bug.  Also, I am not using explicit locks in the
>>>>> program around the MPI calls.  I thought that by specifying the level of
>>>>> thread safety in MPI_Init_thread, it would be taken care of for me.  Is that
>>>>> assumption wrong?  Thanks for any guidance you can provide.
>>>>>
>>>>> Martin Schatz
>>>>
>>>>
>>>>
>>>> --
>>>> Jonathan Perkins
>>>> http://www.cse.ohio-state.edu/~perkinjo
>>>
>>>
>>>
>>
>>
>>
>> --
>> Jonathan Perkins
>> http://www.cse.ohio-state.edu/~perkinjo
>
>
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


