[mvapich-discuss] MVAPICH error question

Martin Schatz mschatz at cs.utexas.edu
Mon Dec 6 13:54:19 EST 2010


Hi Jonathan,
	Thanks for the documentation clarification; I was a bit confused about that.  My program does require the MPI_THREAD_MULTIPLE level, since MPI calls can occur simultaneously from different threads.  I am checking the provided value returned by MPI_Init_thread and making sure that it is MPI_THREAD_MULTIPLE, so I think I am getting the right level.  Originally I had the level set to MPI_THREAD_SERIALIZED, but I then ran my program with MPI_THREAD_MULTIPLE and got the same error described in my previous email.  I will rebuild and rerun to double-check everything, but assuming the error still occurs, are there other plausible causes for this behavior?  I am using mvapich2 version 1.2.  I know this is a bit old, but hopefully not too old to be usable.  Thanks for all the help.

Martin

On Dec 6, 2010, at 12:23 PM, Jonathan Perkins wrote:

> Hi Martin.
> 
> Below is a snippet of documentation related to MPI_Init_thread.
> http://www.mpi-forum.org/docs/mpi22-report/node260.htm#Node260
> 
> int MPI_Init_thread(int *argc, char *((*argv)[]), int required, int *provided)
> 
> MPI_THREAD_SINGLE
> Only one thread will execute.
> 
> MPI_THREAD_FUNNELED
> The process may be multi-threaded, but the application must
> ensure that only the main thread makes MPI calls (for the definition of
> main thread, see MPI_IS_THREAD_MAIN on page 386).
> 
> MPI_THREAD_SERIALIZED
> The process may be multi-threaded, and multiple threads may
> make MPI calls, but only one at a time: MPI calls are not made concurrently from
> two distinct threads (all MPI calls are "serialized").
> 
> MPI_THREAD_MULTIPLE
> Multiple threads may call MPI, with no restrictions.
> 
> When passing in one of the different thread levels, you're telling the
> MPI implementation what level of support is required based on the way
> your code is designed.  If you have multiple threads making MPI calls
> simultaneously you must have MPI_THREAD_MULTIPLE support.  If your
> threads have locks that ensure that only one thread is making a call
> at a time, then you can use MPI_THREAD_SERIALIZED support.
> 
> It sounds like you might need MPI_THREAD_MULTIPLE support since you
> mentioned that there is no locking going on.  Do you have multiple
> threads making MPI calls or only one?  Are you checking the return
> value from MPI_Init_thread (the provided parameter)?  I believe that
> you may not be getting the thread level that your app requires.
> Can you let us know what version of mvapich2 you are using?
> 
> Let me know if this information helps or if you have any further questions.
> 
> On Mon, Dec 6, 2010 at 12:12 PM, Martin Schatz <mschatz at cs.utexas.edu> wrote:
>> Hello,
>> My name is Martin Schatz and I am currently working on a C++ project that
>> uses the mvapich2 library.  I have been running into some problems while
>> running my code and cannot for the life of me figure out what is going on.
>>  I was hoping that maybe someone on this list could help out.  If this is
>> the wrong mailing list, please refer me to the correct one.
>> For background, the project I am working on requires a thread-safe
>> implementation of MPI as multiple threads (pthreads are used) are making MPI
>> calls at the same time.  Currently, I have the level of thread safety set to
>> MPI_THREAD_SERIALIZED, though I have tried the same code with the higher
>> MPI_THREAD_MULTIPLE safety level (but I do not think the MULTIPLE level is
>> necessary as I am ok with the calls occurring in some serial fashion).  When
>> running my program using mpirun with debugging turned on, I see no
>> indication that something is wrong; however, when I run the program in a
>> large parallel environment (TACC machines), the following error occurs:
>> Exit code -5 signaled from i182-111.ranger.tacc.utexas.edu
>> Killing remote processes...current bytes -56, total bytes 0, remote id 11
>> Assertion failed in file ch3_smp_progress.c at line 2328:
>> s_current_bytes[vc->smp.local_nodes] == 0
>> internal ABORT - process 12MPI process terminated unexpectedly
>> DONE
>> This error does not occur in a specific place during the execution
>> (everything seems to be working fine when it occurs).  I was wondering if
>> anyone on this list could explain what this error means and perhaps some
>> tips on tracking down the bug.  Also, I am not using explicit locks in the
>> program around the MPI calls.  I thought that by specifying the level of
>> thread safety in MPI_Init_thread, it would be taken care of for me.  Is this
>> incorrect?  Thanks for any guidance you can provide.
>> 
>> Martin Schatz
>> _______________________________________________
>> mvapich-discuss mailing list
>> mvapich-discuss at cse.ohio-state.edu
>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>> 
>> 
> 
> 
> 
> -- 
> Jonathan Perkins
> http://www.cse.ohio-state.edu/~perkinjo
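
The quoted reply above notes that under MPI_THREAD_SERIALIZED the threads themselves need locks so that only one of them is inside an MPI call at a time.  What follows is only a minimal sketch of that idea using a single pthread mutex; it is not code from the original program.  The names mpi_lock, fake_mpi_call, serialized_mpi_call and run_serialized_demo are hypothetical, and fake_mpi_call is a stand-in for a real call such as MPI_Send so that the sketch builds without an MPI installation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { NUM_THREADS = 4, CALLS_PER_THREAD = 1000 };

/* One process-wide lock guarding every (stand-in) MPI call. */
static pthread_mutex_t mpi_lock = PTHREAD_MUTEX_INITIALIZER;

/* Bookkeeping used only to observe whether calls ever overlap. */
static int calls_in_flight = 0;
static int max_in_flight = 0;
static long total_calls = 0;

/* Hypothetical stand-in for a real MPI call such as MPI_Send.
   It must only be entered while mpi_lock is held. */
static void fake_mpi_call(void)
{
    calls_in_flight++;
    if (calls_in_flight > max_in_flight)
        max_in_flight = calls_in_flight;
    total_calls++;
    calls_in_flight--;
}

/* Wrapper every thread uses instead of calling MPI directly. */
static void serialized_mpi_call(void)
{
    pthread_mutex_lock(&mpi_lock);
    fake_mpi_call();
    pthread_mutex_unlock(&mpi_lock);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < CALLS_PER_THREAD; i++)
        serialized_mpi_call();
    return NULL;
}

/* Starts the workers, waits for them, and returns the largest number
   of overlapping calls observed, or -1 if a thread could not be created. */
int run_serialized_demo(void)
{
    pthread_t threads[NUM_THREADS];
    int started = 0;
    int failed = 0;

    calls_in_flight = 0;
    max_in_flight = 0;
    total_calls = 0;

    for (int i = 0; i < NUM_THREADS; i++) {
        if (pthread_create(&threads[i], NULL, worker, NULL) != 0) {
            failed = 1;
            break;
        }
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(threads[i], NULL);

    /* Every call has returned by now, so none can still be in flight. */
    assert(calls_in_flight == 0);
    return failed ? -1 : max_in_flight;
}
```

Calling run_serialized_demo() from a small main and building with -pthread is enough to try it.  In a real program the wrapper would surround each actual MPI call.  When MPI_THREAD_MULTIPLE is requested and actually provided, this application-side lock should not be needed for correctness.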
