[Mvapich-discuss] Recommended configure options for MPICH 4.3.x with Valgrind or address-sanitizer
Eric Chamberland
Eric.Chamberland at giref.ulaval.ca
Wed Oct 1 23:06:40 EDT 2025
Hi,
I have been building MPICH with the following configure options for a
long time, mainly to keep my code “Valgrind-clean”:
===
./configure \
--enable-g=dbg,meminit \
--with-device=ch3:sock \
--enable-romio
===
This setup worked reasonably well in the past, but recently I’ve been
seeing occasional errors under AddressSanitizer or Valgrind (with MPICH 4.3.0
on a single node), such as:
===
Fatal error in internal_Allreduce_c: Unknown error class, error stack:
internal_Allreduce_c(347)...................: MPI_Allreduce_c(sendbuf=0x7ffdeb0b8e90, recvbuf=0x7ffdeb0b8e98, count=1, dtype=0x4c00083a, MPI_SUM, comm=0x84000003) failed
MPIR_Allreduce_impl(4826)...................:
MPIR_Allreduce_allcomm_auto(4732)...........:
MPIR_Allreduce_intra_recursive_doubling(115):
MPIC_Sendrecv(266)..........................:
MPIC_Wait(90)...............................:
MPIR_Wait(751)..............................:
MPIR_Wait_state(708)........................:
MPIDI_CH3i_Progress_wait(187)...............: an error occurred while handling an event returned by MPIDI_CH3I_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(385)..:
MPIDI_CH3I_Socki_handle_read(3647)..........: connection failure (set=0,sock=1,errno=104:Connection reset by peer)
===
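For context, a single-node run of this kind could be launched roughly as in the sketch below. The process count, the binary name ./my_app, and the Valgrind flags are illustrative assumptions, not the exact command that produced the trace above. The script only prints the command unless RUN=1 is set.

```shell
#!/bin/sh
# Illustrative only: process count, binary name, and Valgrind flags are
# assumptions, not the command behind the error trace in this message.
NP="${NP:-2}"
APP="${APP:-./my_app}"   # hypothetical test program
VG_FLAGS="--error-exitcode=1 --track-origins=yes"

# Each MPI rank runs under its own Valgrind instance.
CMD="mpiexec -n ${NP} valgrind ${VG_FLAGS} ${APP}"

if [ "${RUN:-0}" = "1" ]; then
    # Intentionally unquoted so the string splits into separate arguments.
    $CMD
else
    echo "$CMD"
fi
```

For an AddressSanitizer run, the application would instead be compiled with -fsanitize=address and started directly under mpiexec, without the valgrind wrapper.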
Is CH3 considered legacy?
I would also like to ask:
1. What are the recommended configure options in 2025 for building
MPICH in a way that works well with Valgrind?
2. Is it preferable now to move to CH4 (e.g. ch4:ofi or ch4:shm) when
debugging with Valgrind?
3. Are there any other options (besides --enable-g=dbg,meminit) that
you would suggest for catching memory errors while keeping Valgrind
reports as clean as possible?
4. Is
https://github.com/pmodels/mpich/blob/main/doc/wiki/design/Support_for_Debugging_Memory_Allocation.md
up-to-date?
Any guidance on the “best practice” configuration for this use case
would be greatly appreciated.
The PETSc developers have some debug-related options
(https://gitlab.com/petsc/petsc/-/blob/main/config/BuildSystem/config/packages/MPICH.py#L94)
but PETSc still uses CH3 by default.
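For concreteness, a CH4 variant of the configure line quoted at the top of this message might look like the sketch below. It only swaps the device to ch4:ofi (one of the devices named in question 2) and keeps the other options unchanged. Whether this is a good setup for Valgrind is exactly what is being asked, so it is an assumption rather than a known-good configuration. The script prints the command by default and runs it only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: CH4 variant of the configure line quoted earlier.
# Assumption: ch4:ofi is the chosen device; not a confirmed recommendation.
DEVICE="${DEVICE:-ch4:ofi}"
CONFIGURE_ARGS="--enable-g=dbg,meminit --with-device=${DEVICE} --enable-romio"

# Print the command by default; set RUN=1 to execute it from the
# top of an MPICH source tree.
if [ "${RUN:-0}" = "1" ]; then
    # Intentionally unquoted so the options split into separate arguments.
    ./configure $CONFIGURE_ARGS
else
    echo "./configure $CONFIGURE_ARGS"
fi
```

Setting DEVICE=ch3:sock reproduces the original configuration, which makes it easy to build both variants side by side and compare their Valgrind reports.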
Thanks a lot,
Eric
--
Eric Chamberland, ing., M. Ing
Professionnel de recherche
GIREF/Université Laval
(418) 656-2131 poste 41 22 42