[mvapich-discuss] MVAPICH error question
Martin Schatz
mschatz at cs.utexas.edu
Mon Dec 6 14:38:03 EST 2010
Thanks for your help. I think my Makefiles didn't rebuild after I changed the thread level to MPI_THREAD_MULTIPLE. Things seem to be running smoothly now.
Martin
On Dec 6, 2010, at 1:09 PM, Jonathan Perkins wrote:
> Firstly, I suggest trying mvapich2-1.6rc1 to see if your error still
> occurs. If so, it may be good to isolate the issue to a small snippet
> that reproduces it and send that to the list. It's also possible
> that there is some bug unrelated to threading.
>
> On Mon, Dec 6, 2010 at 1:54 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>> Hi Jonathan,
>> Thanks for the documentation clarification; I was a bit confused about that. Indeed, my program requires the MPI_THREAD_MULTIPLE level, as MPI calls can occur simultaneously. I am checking the output of MPI_Init_thread and making sure that the result is MPI_THREAD_MULTIPLE, so I think I am getting the right level. Originally I did have the level set to MPI_THREAD_SERIALIZED, but then I ran my program with MPI_THREAD_MULTIPLE and obtained the same error referenced in the previous email. I will try rebuilding and rerunning to double-check everything, but assuming the error still occurs, are there other plausible reasons for this behavior? I am using mvapich2 version 1.2. I know that this is a bit outdated, but hopefully not too outdated. Thanks for all the help.
>>
>> Martin
>>
>> On Dec 6, 2010, at 12:23 PM, Jonathan Perkins wrote:
>>
>>> Hi Martin.
>>>
>>> Below is a snippet of documentation related to MPI_Init_thread.
>>> http://www.mpi-forum.org/docs/mpi22-report/node260.htm#Node260
>>>
>>> int MPI_Init_thread(int *argc, char *((*argv)[]), int required, int *provided)
>>>
>>> MPI_THREAD_SINGLE
>>> Only one thread will execute.
>>>
>>> MPI_THREAD_FUNNELED
>>> The process may be multi-threaded, but the application must
>>> ensure that only the main thread makes MPI calls (for the definition of
>>> main thread, see MPI_IS_THREAD_MAIN on page 386).
>>>
>>> MPI_THREAD_SERIALIZED
>>> The process may be multi-threaded, and multiple threads may
>>> make MPI calls, but only one at a time: MPI calls are not made concurrently from
>>> two distinct threads (all MPI calls are "serialized").
>>>
>>> MPI_THREAD_MULTIPLE
>>> Multiple threads may call MPI, with no restrictions.
>>>
>>> When passing in one of the different thread levels, you're telling the
>>> MPI implementation what level of support is required based on the way
>>> your code is designed. If you have multiple threads making MPI calls
>>> simultaneously you must have MPI_THREAD_MULTIPLE support. If your
>>> threads have locks that ensure that only one thread is making a call
>>> at a time, then you can use MPI_THREAD_SERIALIZED support.
>>>
>>> It sounds like you might need MPI_THREAD_MULTIPLE support since you
>>> mentioned that there is no locking going on. Do you have multiple
>>> threads making MPI calls or only one? Are you checking the thread
>>> level reported by MPI_Init_thread (the provided parameter)? I believe
>>> that you may not be getting the thread level that your app requires.
>>> Can you let us know what version of mvapich2 you are using?
>>>
>>> Let me know if this information helps or if you have any further questions.
>>>
>>> On Mon, Dec 6, 2010 at 12:12 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>>>> Hello,
>>>> My name is Martin Schatz and I am currently working on a C++ project that
>>>> uses the mvapich2 library. I have been running into some problems while
>>>> running my code and cannot for the life of me figure out what is going on.
>>>> I was hoping that maybe someone on this list could help out. If this is
>>>> the wrong mailing list, please refer me to the correct one.
>>>> For background, the project I am working on requires a thread-safe
>>>> implementation of MPI as multiple threads (pthreads are used) are making MPI
>>>> calls at the same time. Currently, I have the level of thread safety set to
>>>> MPI_THREAD_SERIALIZED, though I have tried the same code with the higher
>>>> MPI_THREAD_MULTIPLE safety level (but I do not think the MULTIPLE level is
>>>> necessary as I am ok with the calls occurring in some serial fashion). When
>>>> running my program using mpirun with debugging turned on, I see no
>>>> indication that something is wrong. However, when I run the program in a
>>>> large parallel environment (TACC machines), the following error occurs:
>>>> Exit code -5 signaled from i182-111.ranger.tacc.utexas.edu
>>>> Killing remote processes...current bytes -56, total bytes 0, remote id 11
>>>> Assertion failed in file ch3_smp_progress.c at line 2328:
>>>> s_current_bytes[vc->smp.local_nodes] == 0
>>>> internal ABORT - process 12MPI process terminated unexpectedly
>>>> DONE
>>>> This error does not occur in a specific place during the execution
>>>> (everything seems to be working fine when it occurs). I was wondering if
>>>> anyone on this list could explain what this error means and perhaps some
>>>> tips on tracking down the bug. Also, I am not using explicit locks in the
>>>> program around the MPI calls. I thought that by specifying the level of
>>>> thread safety in MPI_Init_thread, it would be taken care of for me. Is this
>>>> assumption wrong? Thanks for any guidance you can provide.
>>>>
>>>> Martin Schatz
>>>> _______________________________________________
>>>> mvapich-discuss mailing list
>>>> mvapich-discuss at cse.ohio-state.edu
>>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Jonathan Perkins
>>> http://www.cse.ohio-state.edu/~perkinjo
>>
>>
>>
>
>
>
> --
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
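As a footnote to the point made earlier in the thread about MPI_THREAD_SERIALIZED: at that level the application, not the MPI library, has to make sure only one thread is inside an MPI call at a time. Below is a minimal sketch of one way that could look, assuming a single process-wide mutex. It deliberately does not include mpi.h; fake_mpi_call, serialized_call and run_workers are hypothetical names invented for illustration, and fake_mpi_call only stands in for a real MPI call so that the locking pattern can be checked on its own.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for an MPI call (NOT part of MPI or MVAPICH2).
// It tracks how many threads are inside it at the same moment.
std::atomic<int> in_call{0};      // threads currently inside the call
std::atomic<int> max_in_call{0};  // highest concurrency ever observed
std::atomic<int> calls_made{0};   // total number of completed calls

void fake_mpi_call() {
    int now = ++in_call;
    int prev = max_in_call.load();
    // Record a new maximum if this entry raised the concurrency level.
    while (now > prev && !max_in_call.compare_exchange_weak(prev, now)) {
    }
    ++calls_made;
    --in_call;
}

// One mutex shared by every thread that wants to "call MPI".
std::mutex mpi_mutex;

// Every call goes through this wrapper, so calls never overlap.
void serialized_call() {
    std::lock_guard<std::mutex> guard(mpi_mutex);
    fake_mpi_call();
}

// Start num_threads workers that each make calls_per_thread calls,
// then return the highest number of threads seen inside the call at once.
int run_workers(int num_threads, int calls_per_thread) {
    assert(num_threads > 0 && calls_per_thread > 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([calls_per_thread] {
            for (int i = 0; i < calls_per_thread; ++i) {
                serialized_call();
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return max_in_call.load();
}
```

Calling run_workers(4, 1000) from main should report a peak of 1, because the mutex lets only one thread into the call at a time. In a real program the body of serialized_call would hold the actual MPI call. If threads call MPI with no such lock, the application needs MPI_THREAD_MULTIPLE, as discussed above.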