[mvapich-discuss] mvapich2-1.4 bug for profiling tool.

Anthony Chan chan at mcs.anl.gov
Wed Mar 3 13:28:04 EST 2010


Hi DK,

Thanks for the fast response.  Will let you know if I see
any more problems.

Thanks,
A.Chan
----- "Dhabaleswar Panda" <panda at cse.ohio-state.edu> wrote:

> Hi Anthony,
> 
> Thanks for sending us this patch. We had taken care of several fixes
> related to MPE earlier to go into the MVAPICH2 1.4.1 release, and we
> have now applied your patch as well. You can download the latest
> version (which includes your patch) from the mvapich2 trunk and
> double-check that everything works with MPE and other profiling
> tools. If you have any additional suggestions, please let us know.
> We are coming close to the 1.4.1 release, to be done in a few days.
> All these changes will be reflected in that release.
> 
> Thanks,
> 
> DK
> 
> On Tue, 2 Mar 2010 chan at mcs.anl.gov wrote:
> 
> >
> > My subscription to mvapich-discuss may have a problem, so
> > cc'ing panda just in case.
> >
> > Attached is a patch against mvapich2-1.4 that replaces the
> > internal MPI calls with their PMPI counterparts.  Without
> > the patch, any MPE logging application (i.e. one built with
> > "mpicc -mpe=mpilog") that calls MPI_Comm_split(),
> > MPI_Comm_dup(), or MPI_Comm_create() fails with the
> > following backtrace.  The program is a simple MPI example
> > distributed with MPICH2/MPE:
> >
> > .../examples_logging> mpiexec -n 4 comm1_isr |& ~/bin/bt2line comm1_isr
> > ......
> >     At [1]: comm1_isr(CLOG_CommSet_get_IDs+0x5c)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpe2/src/logging/src/clog_commset.c:466]
> >     At [2]: comm1_isr(MPI_Comm_test_inter+0xc9)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpe2/src/wrappers/src/log_mpi_core.c:2698]
> >     At [3]: comm1_isr(PMPI_Comm_split+0x917)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpi/comm/comm_split.c:423]
> >     At [4]: comm1_isr(MPI_Comm_split+0x10e)[/scratch/jhedden/build/mvapich2/mvapich2-1.4/src/mpe2/src/wrappers/src/log_mpi_core.c:2668]
> >     At [5]: comm1_isr(main+0x182)[/scratch/chan/mpe_work/examples_logging/comm1_isr.c:43]
> >     At [6]: /lib64/libc.so.6(__libc_start_main+0xf4)[??:0]
> >     At [7]: comm1_isr[??:0]
> > ......
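> >
> > The backtrace shows the problem: frame [3] is the library's
> > own PMPI_Comm_split (comm_split.c), which calls
> > MPI_Comm_test_inter and so lands back in MPE's wrapper at
> > frame [2].  The patch applies the usual PMPI discipline:
> > internal queries go through the name-shifted PMPI entry
> > points so they bypass any profiling wrapper linked into the
> > application.  Roughly like this (the exact MVAPICH2 code may
> > differ):
> >
> >     /* inside the library's PMPI_Comm_split implementation */
> >     int inter;
> >     /* before: re-enters the application's profiling wrapper,
> >        e.g. MPE's MPI_Comm_test_inter in log_mpi_core.c,
> >        frame [2] above */
> >     /* MPI_Comm_test_inter(comm, &inter); */
> >     /* after: bypasses the profiling layer entirely */
> >     PMPI_Comm_test_inter(comm, &inter);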
> >
> > This affects not only MPE but also other profiling tools.
> > Without the patch, MPI calls made inside the implementation
> > re-enter the profiling layer, i.e. circular profiling, which
> > some profiling tools can't easily handle.  Workarounds exist
> > in certain situations, but they carry a performance penalty
> > for the profiling tools, so the MPE distributed with MPICH2
> > assumes the MPI implementation cares about performance and
> > does not make MPI calls within the implementation, hence the
> > error we see here.
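> >
> > To illustrate, here is a minimal wrapper in the style that
> > MPE generates (names simplified; MPE's real wrapper in
> > log_mpi_core.c also updates its communicator table, which is
> > what fails in CLOG_CommSet_get_IDs above):
> >
> >     #include <stdio.h>
> >     #include <mpi.h>
> >
> >     /* PMPI interposition: the tool defines MPI_Comm_split
> >        and forwards to PMPI_Comm_split around its own
> >        bookkeeping. */
> >     int MPI_Comm_split(MPI_Comm comm, int color, int key,
> >                        MPI_Comm *newcomm)
> >     {
> >         fprintf(stderr, "tool: entering MPI_Comm_split\n");
> >         int rc = PMPI_Comm_split(comm, color, key, newcomm);
> >         /* If PMPI_Comm_split internally calls
> >            MPI_Comm_test_inter instead of
> >            PMPI_Comm_test_inter, the tool's wrapper for
> >            MPI_Comm_test_inter runs in the middle of this
> >            call, before the new communicator is registered
> >            with the tool; that is exactly the failure in the
> >            backtrace above. */
> >         fprintf(stderr, "tool: leaving MPI_Comm_split\n");
> >         return rc;
> >     }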
> >
> > Hope the mvapich team will include the patch in a future
> > release, i.e. use PMPI calls instead of MPI calls in future
> > updates.
> >
> > Thanks,
> > A.Chan

