[mvapich-discuss] MVAPICH error question

Jonathan Perkins perkinjo at cse.ohio-state.edu
Mon Dec 6 14:09:01 EST 2010


Firstly, I suggest trying mvapich2-1.6rc1 to see if your error still
occurs.  If so, it would help to isolate the issue to a small snippet
that reproduces it and send that to the list.  It's possible
that there is some bug unrelated to threading as well.
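
For reference, here is a rough sketch of the kind of locking that the
MPI_THREAD_SERIALIZED level expects from the application (see the quoted
explanation below).  It is not MVAPICH2 code: fake_mpi_call() is a
hypothetical stand-in for a real MPI call, and run_serialized_demo() and
mpi_lock are names I made up for illustration.  It only shows one global
pthread mutex keeping callers from overlapping.

```c
#include <assert.h>
#include <pthread.h>

/* One global lock that every thread takes before "calling MPI". */
static pthread_mutex_t mpi_lock = PTHREAD_MUTEX_INITIALIZER;

/* Bookkeeping, only touched while mpi_lock is held. */
static int  in_call     = 0;  /* threads currently inside the call */
static int  max_in_call = 0;  /* highest value in_call ever reached */
static long total_calls = 0;  /* number of calls made so far */

/* Hypothetical stand-in for a real MPI call (e.g. a send or receive). */
static void fake_mpi_call(void)
{
    in_call++;
    assert(in_call == 1);     /* a second concurrent caller would trip this */
    if (in_call > max_in_call)
        max_in_call = in_call;
    total_calls++;
    in_call--;
}

/* Threads call this wrapper instead of calling the library directly. */
static void serialized_mpi_call(void)
{
    pthread_mutex_lock(&mpi_lock);
    fake_mpi_call();
    pthread_mutex_unlock(&mpi_lock);
}

static void *worker(void *arg)
{
    int calls = *(int *)arg;
    for (int i = 0; i < calls; i++)
        serialized_mpi_call();
    return NULL;
}

/* Starts nthreads workers (1..64), each making calls_per_thread calls.
 * Returns the peak number of threads seen inside the call and stores the
 * total call count in *total; returns -1 on error. */
int run_serialized_demo(int nthreads, int calls_per_thread, long *total)
{
    pthread_t tids[64];
    int started = 0;

    if (nthreads < 1 || nthreads > 64 || total == NULL)
        return -1;
    in_call = 0;
    max_in_call = 0;
    total_calls = 0;

    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &calls_per_thread) != 0)
            break;
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    if (started != nthreads)
        return -1;

    *total = total_calls;
    return max_in_call;
}
```

In a real program the wrapper would surround each actual MPI call.  If the
threads really do need to call MPI at the same time, a lock like this is not
the answer and MPI_THREAD_MULTIPLE is the level to request.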

On Mon, Dec 6, 2010 at 1:54 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
> Hi Jonathan,
>        Thanks for the documentation clarification; I was a bit confused on that.  Indeed, my program requires the MPI_THREAD_MULTIPLE safety level, as MPI calls can occur simultaneously.  I am checking the output of MPI_Init_thread and making sure that the result is MPI_THREAD_MULTIPLE, so I think I am getting the right level.  Originally, I did have the level set to MPI_THREAD_SERIALIZED, but then I tried running my program with the safety level of MPI_THREAD_MULTIPLE and obtained the same error referenced in the previous email.  I will try rebuilding and rerunning to double-check everything, but assuming the error still occurs, are there other plausible reasons for this behavior?  I am using mvapich2 version 1.2.  I know that this is a bit outdated, but hopefully not too outdated.  Thanks for all the help.
>
> Martin
>
> On Dec 6, 2010, at 12:23 PM, Jonathan Perkins wrote:
>
>> Hi Martin.
>>
>> Below is a snippet of documentation related to MPI_Init_thread.
>> http://www.mpi-forum.org/docs/mpi22-report/node260.htm#Node260
>>
>> int MPI_Init_thread(int *argc, char *((*argv)[]), int required, int *provided)
>>
>> MPI_THREAD_SINGLE
>> Only one thread will execute.
>>
>> MPI_THREAD_FUNNELED
>> The process may be multi-threaded, but the application must
>> ensure that only the main thread makes MPI calls (for the definition of
>> main thread, see MPI_IS_THREAD_MAIN on page 386).
>>
>> MPI_THREAD_SERIALIZED
>> The process may be multi-threaded, and multiple threads may
>> make MPI calls, but only one at a time: MPI calls are not made concurrently from
>> two distinct threads (all MPI calls are "serialized").
>>
>> MPI_THREAD_MULTIPLE
>> Multiple threads may call MPI, with no restrictions.
>>
>> When passing in one of the different thread levels, you're telling the
>> MPI implementation what level of support is required based on the way
>> your code is designed.  If you have multiple threads making MPI calls
>> simultaneously you must have MPI_THREAD_MULTIPLE support.  If your
>> threads have locks that ensure that only one thread is making a call
>> at a time, then you can use MPI_THREAD_SERIALIZED support.
>>
>> It sounds like you might need MPI_THREAD_MULTIPLE support since you
>> mentioned that there is no locking going on.  Do you have multiple
>> threads making MPI calls or only one?  Are you checking the level that
>> MPI_Init_thread returns in the provided parameter?  I believe that
>> you may not be getting the threaded level that your app requires.
>> Can you let us know what version of mvapich2 you are using?
>>
>> Let me know if this information helps or if you have any further questions.
>>
>> On Mon, Dec 6, 2010 at 12:12 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>>> Hello,
>>> My name is Martin Schatz and I am currently working on a C++ project that
>>> uses the mvapich2 library.  I have been running into some problems while
>>> running my code and cannot for the life of me figure out what is going on.
>>>  I was hoping that maybe someone on this list could help out.  If this is
>>> the wrong mailing list, please refer me to the correct one.
>>> For background, the project I am working on requires a thread-safe
>>> implementation of MPI as multiple threads (pthreads are used) are making MPI
>>> calls at the same time.  Currently, I have the level of thread safety set to
>>> MPI_THREAD_SERIALIZED, though I have tried the same code with the higher
>>> MPI_THREAD_MULTIPLE safety level (but I do not think the MULTIPLE level is
>>> necessary as I am ok with the calls occurring in some serial fashion).  When
>>> running my program using mpirun with debugging turned on, I see no
>>> indication that something is wrong.  However, when I run the program in a
>>> large parallel environment (TACC machines), the following error occurs:
>>> Exit code -5 signaled from i182-111.ranger.tacc.utexas.edu
>>> Killing remote processes...current bytes -56, total bytes 0, remote id 11
>>> Assertion failed in file ch3_smp_progress.c at line 2328:
>>> s_current_bytes[vc->smp.local_nodes] == 0
>>> internal ABORT - process 12MPI process terminated unexpectedly
>>> DONE
>>> This error does not occur in a specific place during the execution
>>> (everything seems to be working fine when it occurs).  I was wondering if
>>> anyone on this list could explain what this error means and perhaps some
>>> tips on tracking down the bug.  Also, I am not using explicit locks in the
>>> program around the MPI calls.  I thought that by specifying the level of
>>> thread safety in MPI_Init_thread, it would be taken care of for me.  Is this
>>> in error?  Thanks for any guidance you can provide.
>>>
>>> Martin Schatz
>>>
>>>
>>
>>
>>
>> --
>> Jonathan Perkins
>> http://www.cse.ohio-state.edu/~perkinjo
>
>
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


