[Mvapich-discuss] Recommended configure options for MPICH 4.3.x with Valgrind or address-sanitizer

Eric Chamberland Eric.Chamberland at giref.ulaval.ca
Wed Oct 1 23:06:40 EDT 2025


Hi,

I have been building MPICH with the following configure options for a 
long time, mainly to keep my code “Valgrind-clean”:

===
./configure \
   --enable-g=dbg,meminit \
   --with-device=ch3:sock \
   --enable-romio

===

This setup worked reasonably well in the past, but recently I’ve been 
seeing occasional errors when running under AddressSanitizer or Valgrind 
(with MPICH 4.3.0 on a single node), such as:

===

Fatal error in internal_Allreduce_c: Unknown error class, error stack:
internal_Allreduce_c(347)...................: MPI_Allreduce_c(sendbuf=0x7ffdeb0b8e90, recvbuf=0x7ffdeb0b8e98, count=1, dtype=0x4c00083a, MPI_SUM, comm=0x84000003) failed
MPIR_Allreduce_impl(4826)...................:
MPIR_Allreduce_allcomm_auto(4732)...........:
MPIR_Allreduce_intra_recursive_doubling(115):
MPIC_Sendrecv(266)..........................:
MPIC_Wait(90)...............................:
MPIR_Wait(751)..............................:
MPIR_Wait_state(708)........................:
MPIDI_CH3i_Progress_wait(187)...............: an error occurred while handling an event returned by MPIDI_CH3I_Sock_Wait()
MPIDI_CH3I_Progress_handle_sock_event(385)..:
MPIDI_CH3I_Socki_handle_read(3647)..........: connection failure (set=0,sock=1,errno=104:Connection reset by peer)

===

Is CH3 considered legacy?

I would also like to ask:

  1. What are the recommended configure options in 2025 for building 
MPICH in a way that works well with Valgrind?

  2. Is it preferable now to move to CH4 (e.g. ch4:ofi or ch4:shm) when 
debugging with Valgrind?

  3. Are there any other options (besides --enable-g=dbg,meminit) that 
you would suggest for catching memory errors while keeping Valgrind 
reports as clean as possible?

  4. Is 
https://github.com/pmodels/mpich/blob/main/doc/wiki/design/Support_for_Debugging_Memory_Allocation.md 
up-to-date?
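
To make the comparison in question 2 concrete, here is a small dry-run 
sketch. It only prints the configure commands and runs nothing. The 
helper name configure_cmd is made up for illustration, and reusing the 
same debug flags with ch4:ofi is an assumption, not something I have 
verified:

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) an MPICH configure command
# for a given device. configure_cmd is a hypothetical helper name,
# not part of MPICH.
configure_cmd() {
    device="$1"
    # Same flags as my current CH3 build; whether they are still the
    # right choice for CH4 is an open question (assumption).
    printf './configure --enable-g=dbg,meminit --with-device=%s --enable-romio\n' "$device"
}

# Candidate devices: my current one and one CH4 variant from question 2.
for dev in ch3:sock ch4:ofi; do
    configure_cmd "$dev"
done
```

Each printed line could then be run in a separate build directory, so 
that the two builds can be compared under Valgrind side by side.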

Any guidance on the “best practice” configuration for this use case 
would be greatly appreciated.

The PETSc developers have some debug-related options 
(https://gitlab.com/petsc/petsc/-/blob/main/config/BuildSystem/config/packages/MPICH.py#L94), 
but PETSc still uses CH3 by default.

Thanks a lot,

Eric

-- 
Eric Chamberland, ing., M. Ing
Professionnel de recherche
GIREF/Université Laval
(418) 656-2131 poste 41 22 42