[mvapich-discuss] default envs for mpiexec

wei huang huanwei at cse.ohio-state.edu
Fri Feb 29 13:26:09 EST 2008


Hi Christian,

So you can even see the hang with normal osu_benchmarks? Is your adapter
updated with the latest firmware?

I would suggest trying to solve the hanging problem rather than disabling
SRQ, because all our local testing here suggests that the em64t + 25208
adapter combination works fine. SRQ also benefits scalability and
performance on larger-scale clusters.

If you really need to disable SRQ for now, you can apply this patch:


--- src/mpid/osu_ch3/channels/mrail/src/gen2/ibv_param.c.orig   2008-02-29 12:59:17.000000000 -0500
+++ src/mpid/osu_ch3/channels/mrail/src/gen2/ibv_param.c        2008-02-29 12:59:31.000000000 -0500
@@ -463,7 +463,7 @@
     if ((value = getenv("MV2_USE_SRQ")) != NULL) {
        proc->has_srq = !!atoi(value);
     } else {
-       proc->has_srq = 1;
+       proc->has_srq = 0;
     }

     if ( proc->has_srq && proc->hca_type != PATH_HT
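
Note that the patch only flips the compiled-in default; an explicit
MV2_USE_SRQ setting in the environment still takes precedence, so SRQ
could be turned back on per job (for example on the Opteron/23108 nodes).
A minimal standalone sketch of that decision follows. The helper names
resolve_has_srq and srq_enabled_from_env are hypothetical and are not part
of the MVAPICH2 source; only the getenv/atoi logic is taken from the hunk
above.

```c
#include <stdlib.h>

/*
 * Hypothetical helper (not in MVAPICH2): mirrors the logic in the hunk.
 *   value            - what getenv("MV2_USE_SRQ") returned, may be NULL
 *   compiled_default - 1 in the stock code, 0 with the patch applied
 * Returns 1 if SRQ would be used, 0 otherwise.
 */
static int resolve_has_srq(const char *value, int compiled_default)
{
    if (value != NULL) {
        /* Any non-zero integer enables SRQ; "0" or non-numeric text disables it. */
        return !!atoi(value);
    }
    /* Variable not set: fall back to the compiled-in default. */
    return compiled_default;
}

/* Convenience wrapper that reads the real environment variable. */
int srq_enabled_from_env(int compiled_default)
{
    return resolve_has_srq(getenv("MV2_USE_SRQ"), compiled_default);
}
```

With the patch (compiled_default = 0), an unset variable gives 0, while
MV2_USE_SRQ=1 still gives 1.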

Thanks.

Regards,
Wei Huang

774 Dreese Lab, 2015 Neil Ave,
Dept. of Computer Science and Engineering
Ohio State University
OH 43210
Tel: (614)292-8501


On Thu, 28 Feb 2008, Christian Guggenberger wrote:

> Hi Wei,
>
> > Are you willing to change the code and recompile? I can send you a patch
> > to disable it in code.
> >
>
> this would be an option as well. I was thinking of preparing
> mvapich2-1.0.2p1 in the near future, so I'd appreciate your patch.
>
> As the problem with SRQ is only visible on em64t with pci-ex adapters
> (25208), but not on Opteron with pci-x (23108), I could keep it enabled
> for the latter.
>
> You may be interested, so I'll describe the SRQ problem we
> are seeing on em64t with 25208 adapters (OFED-1.2.5.5, SLES9 SP4):
>
> Even the simplest MPI programs occasionally hang in MPI_FINALIZE (even
> for intra-node-only communication). We have not been able to track this
> down further, but disabling SRQ helps.
>
> ->   pthread_cond_wait,  FP=7fbfffea20
>      ibv_cmd_destroy_srq, FP=7fbfffea80
>      mthca_destroy_srq,  FP=7fbfffea90
>      ibv_destroy_srq,    FP=7fbfffeaa0
>      MPIDI_CH3I_RMDA_finalize, FP=7fbfffeb10
>      MPIDI_CH3_Finalize, FP=7fbfffeb30
>      MPID_Finalize,      FP=7fbfffeb50
>      PMPI_Finalize,      FP=7fbfffeb90
>      pmpi_finalize_,     FP=7fbfffeba0
>
> cheers.
>  - Christian
>



