[mvapich-discuss] Got FATAL event 0

Dhabaleswar Panda panda at cse.ohio-state.edu
Fri Sep 19 20:17:33 EDT 2008


Hi Adam,

Thanks for reporting this. As you know, MVAPICH 0.9.7 and 0.9.9 are older
versions (roughly 1.5 to 2 years old), and we are getting close to the 1.1
release. You need to upgrade your MVAPICH stack :-) Many enhancements
(new features) and bug fixes have gone into the MVAPICH library (including
MPI_Bcast) in recent years. Can you verify whether this error happens
with the latest MVAPICH 1.0.1 release? If it does, it will be much
quicker to analyze and debug.

Thanks,

DK

On Fri, 19 Sep 2008, Adam Moody wrote:

> Hello MVAPICH team,
> I have a user hitting some errors, and I'm hoping you may have some
> insight.  When running with MVAPICH1-0.9.7, the user sees the following
> non-fatal error message on occasion:
>
>     Error getting event!
>     [0] Got unknown event 1075841344 (Unknown) ... continuing ...
>
> With 0.9.9 (and PTMALLOC disabled), the user sees the following fatal
> error with the same frequency as the above message:
>
>     [0] Got FATAL event 0 (CQ Error)
>
> This error is detected by the async_thread function in viachek.c.  The
> series of MPI calls the user app has made at this point looks like the
> following:
>
>  MPI_Init(&argc, &argv);
>  MPI_Comm_rank(MPI_COMM_WORLD, &myRank);
>  MPI_Initialized(&initialized);
>  MPI_Comm_size(MPI_COMM_WORLD, &d_size);
>  MPI_Comm_rank(MPI_COMM_WORLD, &d_rank);
>  MPI_Bcast(const_cast<char*>(d_key), SECURE_KEY_SIZE, MPI_CHAR,
>                0, MPI_COMM_WORLD);
>  MPI_Bcast(&length, 1, MPI_INT, 0, MPI_COMM_WORLD);
>  MPI_Bcast(const_cast<char*>(d_parentUrl.c_str()), length, MPI_CHAR, 0,
>                  MPI_COMM_WORLD);
>  MPI_Bcast(&length, 1, MPI_INT, 0, MPI_COMM_WORLD);
>  MPI_Bcast(const_cast<char*>(d_rank0Url.c_str()), length, MPI_CHAR, 0,
>                  MPI_COMM_WORLD);
>
> Have others reported this problem before?  Any idea on how to fix it?
> Thanks again,
> -Adam
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
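
One detail in the quoted call sequence may be worth checking regardless of
the MVAPICH version: on non-root ranks, MPI_Bcast writes `length` bytes
through const_cast<char*>(d_parentUrl.c_str()), that is, into storage owned
by the string, which may hold fewer than `length` bytes on those ranks.
Whether this has anything to do with the CQ error is not known. The sketch
below shows one possible way to receive into a correctly sized mutable
buffer instead. The names bcast_string and BcastFn are hypothetical and not
part of MPI or MVAPICH. The broadcast is passed in as a callable so the
sketch compiles without an MPI installation, and unlike the original
sequence the callable counts bytes rather than MPI_INT elements.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical callable standing in for a broadcast on MPI_COMM_WORLD:
// (buffer, byte count, root rank). In an MPI program it could be, e.g.:
//   [](void* buf, int n, int root) {
//       MPI_Bcast(buf, n, MPI_CHAR, root, MPI_COMM_WORLD);
//   }
using BcastFn = std::function<void(void* buf, int count, int root)>;

// Broadcast a std::string from `root` to every rank without writing
// through c_str(). Each rank passes its own `rank`; after the call,
// `value` holds the root's string on every rank.
void bcast_string(std::string& value, int rank, int root,
                  const BcastFn& bcast) {
    // Step 1: agree on the length, as the original sequence does.
    int length = (rank == root) ? static_cast<int>(value.size()) : 0;
    bcast(&length, static_cast<int>(sizeof length), root);
    assert(length >= 0);

    // Step 2: use a mutable buffer that is exactly `length` bytes long.
    std::vector<char> buffer(static_cast<std::size_t>(length));
    if (rank == root) buffer.assign(value.begin(), value.end());
    if (length > 0) bcast(buffer.data(), length, root);

    // Step 3: copy the received bytes into the string on non-root ranks.
    if (rank != root) value.assign(buffer.begin(), buffer.end());
}
```

With this helper, each length/string pair of MPI_Bcast calls in the quoted
sequence would become a single call such as
bcast_string(d_parentUrl, d_rank, 0, bcast), assuming d_parentUrl is a
non-const std::string on every rank.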


