[mvapich-discuss] mvapich2-1.4 bug for profiling tool.

Dhabaleswar Panda panda at cse.ohio-state.edu
Tue Mar 2 21:35:57 EST 2010


Hi Anthony,

Thanks for sending us this patch. We had already taken care of several
MPE-related fixes for the upcoming MVAPICH2 1.4.1 release, and we have now
applied your patch as well. You can download the latest version (which
includes your patch) from the mvapich2 trunk and double-check that everything
works with MPE and other profiling tools. If you have any additional
suggestions, please let us know. The 1.4.1 release is only a few days away,
and all of these changes will be reflected in it.

Thanks,

DK

On Tue, 2 Mar 2010 chan at mcs.anl.gov wrote:

>
> My subscription to mvapich-discuss may have a problem, so
> I am cc'ing Panda just in case.
>
> Attached is a patch against mvapich2-1.4 that replaces
> internal MPI calls with their PMPI counterparts.  Without
> the patch, any MPE logging application (i.e. one built
> with "mpicc -mpe=mpilog") that calls MPI_Comm_split(),
> MPI_Comm_dup(), or MPI_Comm_create() will fail with the
> following backtrace.  The program used here is a simple
> MPI program distributed with MPICH2/MPE:
>
> .../examples_logging> mpiexec -n 4 comm1_isr | & ~/bin/bt2line comm1_isr
> ......
> 	At [1]: comm1_isr(CLOG_CommSet_get_IDs+0x5c)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpe2/src/logging/src/clog_commset.c:466]
> 	At [2]: comm1_isr(MPI_Comm_test_inter+0xc9)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpe2/src/wrappers/src/log_mpi_core.c:2698]
> 	At [3]: comm1_isr(PMPI_Comm_split+0x917)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpi/comm/comm_split.c:423]
> 	At [4]: comm1_isr(MPI_Comm_split+0x10e)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpe2/src/wrappers/src/log_mpi_core.c:2668]
> 	At [5]: comm1_isr(main+0x182)[/scratch/chan/mpe_work/examples_logging/comm1_isr.c:43]
> 	At [6]: /lib64/libc.so.6(__libc_start_main+0xf4)[??:0]
> 	At [7]: comm1_isr[??:0]
> ......
>
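> The backtrace shows the library's own PMPI_Comm_split() calling back
> into the MPE wrapper for MPI_Comm_test_inter().  The fix is mechanical:
> inside the library, call the PMPI_ entry points so the profiling layer
> is not re-entered.  A minimal sketch of the kind of change the patch
> makes (the helper name here is hypothetical; the exact call sites are
> in the attached patch):
>
>     #include <mpi.h>
>
>     /* Hypothetical internal helper inside the MPI library.  Calling
>      * MPI_Comm_test_inter() here would re-enter the MPE wrapper;
>      * PMPI_Comm_test_inter() goes straight to the implementation. */
>     static int comm_is_inter(MPI_Comm comm)
>     {
>         int is_inter = 0;
>         PMPI_Comm_test_inter(comm, &is_inter);  /* was: MPI_Comm_test_inter() */
>         return is_inter;
>     }
>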
> This affects not only MPE but also other profiling tools.
> Without the patch, internal MPI calls lead to circular profiling:
> the library's own calls re-enter the profiling wrappers, which some
> tools cannot easily handle.  Workarounds exist in certain situations,
> but they carry a performance penalty for the profiling tool, so the
> MPE distributed with MPICH2 assumes that the MPI implementation cares
> about performance and does not make MPI calls within the
> implementation; hence the error we see here.
>
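> For context on why internal MPI calls break this scheme: a PMPI-based
> tool defines its own MPI_X() that records an event and then calls
> PMPI_X() for the real work.  A minimal sketch in the spirit of MPE's
> generated wrappers (not MPE's actual code):
>
>     #include <stdio.h>
>     #include <mpi.h>
>
>     /* Minimal logging wrapper.  The application's MPI_Comm_dup() lands
>      * here; PMPI_Comm_dup() does the real work.  If the MPI library
>      * itself called MPI_Comm_dup() internally, this wrapper would also
>      * intercept that call and log events for operations the application
>      * never issued. */
>     int MPI_Comm_dup(MPI_Comm comm, MPI_Comm *newcomm)
>     {
>         int rc;
>         fprintf(stderr, "profiler: entering MPI_Comm_dup\n");
>         rc = PMPI_Comm_dup(comm, newcomm);
>         fprintf(stderr, "profiler: leaving MPI_Comm_dup\n");
>         return rc;
>     }
>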
> I hope the MVAPICH team will include the patch in a future release,
> i.e. use PMPI calls instead of MPI calls in future updates.
>
> Thanks,
> A.Chan
