[mvapich-discuss] MVAPICH error question

Martin Schatz mschatz at cs.utexas.edu
Mon Dec 6 14:38:03 EST 2010


Thanks for your help. I think my Makefiles didn't pick up my change to MPI_THREAD_MULTIPLE.  Things seem to be running smoothly now; thanks for all your help.

Martin
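
A stale build of this kind is easier to spot if the program reports, at startup, the thread level it was actually granted. The following is a minimal sketch in C of the comparison involved. It relies on the MPI standard's guarantee that the four levels are ordered (MPI_THREAD_SINGLE < MPI_THREAD_FUNNELED < MPI_THREAD_SERIALIZED < MPI_THREAD_MULTIPLE). The LEVEL_* constants and the helpers level_name and level_sufficient are hypothetical stand-ins so the snippet compiles without mpi.h; a real program would use the MPI_THREAD_* constants and the provided value filled in by MPI_Init_thread.

```c
#include <assert.h>

/* Hypothetical stand-ins for the MPI_THREAD_* constants. Only the
 * ordering is guaranteed by the MPI standard; the numeric values
 * here are illustrative. */
enum thread_level {
    LEVEL_SINGLE = 0,
    LEVEL_FUNNELED,
    LEVEL_SERIALIZED,
    LEVEL_MULTIPLE
};

/* Human-readable name for a level, for a startup log line. */
const char *level_name(int level)
{
    switch (level) {
    case LEVEL_SINGLE:     return "MPI_THREAD_SINGLE";
    case LEVEL_FUNNELED:   return "MPI_THREAD_FUNNELED";
    case LEVEL_SERIALIZED: return "MPI_THREAD_SERIALIZED";
    case LEVEL_MULTIPLE:   return "MPI_THREAD_MULTIPLE";
    default:               return "unknown";
    }
}

/* Returns 1 if the granted level is at least the requested one. */
int level_sufficient(int required, int provided)
{
    assert(required >= LEVEL_SINGLE && required <= LEVEL_MULTIPLE);
    return provided >= required;
}

/* In an MPI program (not compiled here) the check might read:
 *
 *   int provided;
 *   MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
 *   if (provided < MPI_THREAD_MULTIPLE)
 *       MPI_Abort(MPI_COMM_WORLD, 1);
 */
```

Printing level_name(provided) once per run makes it obvious when a binary was built before the requested level was changed.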

On Dec 6, 2010, at 1:09 PM, Jonathan Perkins wrote:

> Firstly, I suggest trying mvapich2-1.6rc1 to see if your error still
> occurs.  If so, it may be good to isolate the issue to a small snippet
> that can reproduce the issue and send that to the list.  It's possible
> that there could be some bug unrelated to threading as well.
> 
> On Mon, Dec 6, 2010 at 1:54 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>> Hi Jonathan,
>>        Thanks for the documentation clarification, I was a bit confused on that.  Indeed, my program requires the MPI_THREAD_MULTIPLE safety level as calls can occur simultaneously.  I am checking the output of the MPI_Init_thread and making sure that the result is MPI_THREAD_MULTIPLE so I think I am getting the right level.  Originally, I did have the setting to MPI_THREAD_SERIALIZED, but then I tried running my program with the safety level of MPI_THREAD_MULTIPLE and obtained the same error referenced in the previous email.  I will try rebuilding and rerunning again to double-check everything, but assuming the error still occurs, are there other plausible reasons for this behavior?  The version of mvapich2 I am using is version 1.2.  I know that this is a bit outdated, but hopefully it shouldn't be too outdated.  Thanks for all the help.
>> 
>> Martin
>> 
>> On Dec 6, 2010, at 12:23 PM, Jonathan Perkins wrote:
>> 
>>> Hi Martin.
>>> 
>>> Below is a snippet of documentation related to MPI_Init_thread.
>>> http://www.mpi-forum.org/docs/mpi22-report/node260.htm#Node260
>>> 
>>> int MPI_Init_thread(int *argc, char *((*argv)[]), int required, int *provided)
>>> 
>>> MPI_THREAD_SINGLE
>>> Only one thread will execute.
>>> 
>>> MPI_THREAD_FUNNELED
>>> The process may be multi-threaded, but the application must
>>> ensure that only the main thread makes MPI calls (for the definition of
>>> main thread, see MPI_IS_THREAD_MAIN on page 386).
>>> 
>>> MPI_THREAD_SERIALIZED
>>> The process may be multi-threaded, and multiple threads may
>>> make MPI calls, but only one at a time: MPI calls are not made concurrently from
>>> two distinct threads (all MPI calls are "serialized").
>>> 
>>> MPI_THREAD_MULTIPLE
>>> Multiple threads may call MPI, with no restrictions.
>>> 
>>> When passing in one of the different thread levels, you're telling the
>>> MPI implementation what level of support is required based on the way
>>> your code is designed.  If you have multiple threads making MPI calls
>>> simultaneously you must have MPI_THREAD_MULTIPLE support.  If your
>>> threads have locks that ensure that only one thread is making a call
>>> at a time, then you can use MPI_THREAD_SERIALIZED support.
>>> 
>>> It sounds like you might need MPI_THREAD_MULTIPLE support since you
>>> mentioned that there is no locking going on.  Do you have multiple
>>> threads making MPI calls or only one?  Are you checking the return
>>> value from MPI_Init_thread (the provided parameter)?  I believe that
>>> you may not be getting the threaded level that your app requires.
>>> Can you let us know what version of mvapich2 you are using?
>>> 
>>> Let me know if this information helps or if you have any further questions.
>>> 
>>> On Mon, Dec 6, 2010 at 12:12 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>>>> Hello,
>>>> My name is Martin Schatz and I am currently working on a C++ project that
>>>> uses the mvapich2 library.  I have been running into some problems while
>>>> running my code and cannot for the life of me figure out what is going on.
>>>>  I was hoping that maybe someone on this list could help out.  If this is
>>>> the wrong mailing list, please refer me to the correct one.
>>>> For background, the project I am working on requires a thread-safe
>>>> implementation of MPI as multiple threads (pthreads are used) are making MPI
>>>> calls at the same time.  Currently, I have the level of thread safety set to
>>>> MPI_THREAD_SERIALIZED, though I have tried the same code with the higher
>>>> MPI_THREAD_MULTIPLE safety level (but I do not think the MULTIPLE level is
>>>> necessary as I am ok with the calls occurring in some serial fashion).  When
>>>> running my program using mpirun with debugging turned on, I see no
>>>> indication that something is wrong; however, when I run the program in a
>>>> large parallel environment (TACC machines), the following error occurs:
>>>> Exit code -5 signaled from i182-111.ranger.tacc.utexas.edu
>>>> Killing remote processes...current bytes -56, total bytes 0, remote id 11
>>>> Assertion failed in file ch3_smp_progress.c at line 2328:
>>>> s_current_bytes[vc->smp.local_nodes] == 0
>>>> internal ABORT - process 12MPI process terminated unexpectedly
>>>> DONE
>>>> This error does not occur in a specific place during the execution
>>>> (everything seems to be working fine when it occurs).  I was wondering if
>>>> anyone on this list could explain what this error means and perhaps some
>>>> tips on tracking down the bug.  Also, I am not using explicit locks in the
>>>> program around the MPI calls.  I thought that by specifying the level of
>>>> thread safety in MPI_Init_thread, it would be taken care of for me.  Is this
>>>> in error?  Thanks for any guidance you can provide.
>>>> 
>>>> Martin Schatz
>>>> _______________________________________________
>>>> mvapich-discuss mailing list
>>>> mvapich-discuss at cse.ohio-state.edu
>>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Jonathan Perkins
>>> http://www.cse.ohio-state.edu/~perkinjo
>> 
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
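
The locking requirement described in the thread for MPI_THREAD_SERIALIZED (the application itself must ensure that only one thread is inside an MPI call at a time) can be sketched with a single pthread mutex. This is only an illustration under stated assumptions: fake_mpi_call is a hypothetical stand-in for a real MPI call such as MPI_Send, and run_serialized and calls_made are hypothetical helpers that exist only to show that the calls never overlap.

```c
#include <assert.h>
#include <pthread.h>

#define MAX_THREADS 16

/* One lock guarding every (stand-in) MPI call in the process. */
static pthread_mutex_t mpi_lock = PTHREAD_MUTEX_INITIALIZER;

static int in_call = 0;      /* threads currently inside the call */
static int max_in_call = 0;  /* largest overlap ever observed     */
static int total_calls = 0;  /* number of calls completed         */

/* Hypothetical stand-in for an MPI call; it only records overlap. */
static void fake_mpi_call(void)
{
    in_call++;
    if (in_call > max_in_call)
        max_in_call = in_call;
    total_calls++;
    in_call--;
}

/* Each worker takes the lock around every call, which is what
 * MPI_THREAD_SERIALIZED expects the application to do. */
static void *worker(void *arg)
{
    int calls = *(int *)arg;
    for (int i = 0; i < calls; i++) {
        pthread_mutex_lock(&mpi_lock);
        fake_mpi_call();
        pthread_mutex_unlock(&mpi_lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the largest overlap observed. */
int run_serialized(int nthreads, int calls_per_thread)
{
    pthread_t tids[MAX_THREADS];

    assert(nthreads > 0 && nthreads <= MAX_THREADS);
    in_call = max_in_call = total_calls = 0;

    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tids[i], NULL, worker, &calls_per_thread);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tids[i], NULL);

    return max_in_call;
}

/* Total number of stand-in calls made by the last run. */
int calls_made(void)
{
    return total_calls;
}
```

Calling run_serialized(4, 1000) should report an overlap of 1, since the mutex admits one thread at a time. Without the lock, concurrent calls are only permitted when MPI_THREAD_MULTIPLE has been requested and granted.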



