[mvapich-discuss] hangs when running MUMPS w/ MVAPICH2.2 built for PSM

Hari Subramoni subramoni.1 at osu.edu
Mon Oct 17 17:44:07 EDT 2016


We resolved the issue off the list. The attached patch fixes it. It will be
available with the next release of MVAPICH2.

In the meantime, you can apply it from the top-level source directory of your
MVAPICH2 build like this:

patch -p1 < psm.patch
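
After applying the patch, rebuild and reinstall MVAPICH2 so the fix takes
effect. Roughly, and reusing the same build tree and configure options as
your existing installation:

make
make install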

Regards,
Hari.

On Thu, Oct 13, 2016 at 6:40 PM, Westlund, John A <john.a.westlund at intel.com
> wrote:

> Thanks Hari. I really appreciate the quick response.
>
>
>
> John
>
>
>
> *From:* hari.subramoni at gmail.com [mailto:hari.subramoni at gmail.com] *On
> Behalf Of *Hari Subramoni
> *Sent:* Thursday, October 13, 2016 2:50 PM
>
> *To:* Westlund, John A <john.a.westlund at intel.com>
> *Cc:* mvapich-discuss at cse.ohio-state.edu
> *Subject:* Re: [mvapich-discuss] hangs when running MUMPS w/ MVAPICH2.2
> built for PSM
>
>
>
> Many thanks for the details, John. Let me try this out locally and see what
> could be going on.
>
>
>
> Thx,
>
> Hari.
>
>
>
> On Thu, Oct 13, 2016 at 4:29 PM, Westlund, John A <
> john.a.westlund at intel.com> wrote:
>
> Hi Hari,
>
>
>
> Here’s the info:
>
>
>
> 1.      Output of mpiname -a:
>
> -bash-4.2# mpiname -a
>
> MVAPICH2 2.2 Thu Sep 08 22:00:00 EST 2016 ch3:psm
>
>
>
> Compilation
>
> CC: gcc    -g -O3
>
> CXX: g++   -g -O3
>
> F77: gfortran   -g -O3
>
> FC: gfortran   -g -O3
>
>
>
> Configuration
>
> --prefix=/opt/intel/hpc-orchestrator/pub/mpi/mvapich2-psm-gnu-orch/2.2
> --enable-cxx --enable-g=dbg --with-device=ch3:psm --enable-fast=O3
>
>
>
> -bash-4.2# module swap gnu intel
>
>
>
> Due to MODULEPATH changes the following have been reloaded:
>
>   1) mvapich2/2.2
>
>
>
> -bash-4.2# mpiname -a
>
> MVAPICH2 2.2 Thu Sep 08 22:00:00 EST 2016 ch3:psm
>
>
>
> Compilation
>
> CC: icc    -g -O3
>
> CXX: icpc   -g -O3
>
> F77: ifort   -g -O3
>
> FC: ifort   -g -O3
>
>
>
> Configuration
>
> --prefix=/opt/intel/hpc-orchestrator/pub/mpi/mvapich2-psm-intel-orch/2.2
> --enable-cxx --enable-g=dbg --with-device=ch3:psm --enable-fast=O3
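>
> (For reference, either build should be reproducible with the configure
> line shown above; the compilers are selected via CC/CXX/F77/FC and the
> prefix is site-specific, so roughly:)
>
> CC=icc CXX=icpc F77=ifort FC=ifort ./configure \
>     --prefix=<install-prefix> --enable-cxx --enable-g=dbg \
>     --with-device=ch3:psm --enable-fast=O3
> make && make install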
>
>
>
> 2.      Scale (procs/nodes): 2 processes per node on 2 nodes
>
> 3.      Build details:
>
> The following is for the Intel compilers (requires ScaLAPACK):
>
> wget http://mumps.enseeiht.fr/MUMPS_5.0.2.tar.gz
>
> tar xf MUMPS_5.0.2.tar.gz
>
> cd MUMPS_5.0.2
>
> cp Make.inc/Makefile.INTEL.PAR Makefile.inc
>
> make
>
> cd examples
>
> make
>
> mpirun -np 2 ./dsimpletest < input_simpletest_real
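>
> (This assumes the Intel compiler and MVAPICH2 modules are already loaded,
> roughly as in the mpiname output above; the module names below are from
> our setup and may differ on other systems:)
>
> module swap gnu intel
> module load mvapich2/2.2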
>
>
>
> For GCC (needs ScaLAPACK and OpenBLAS):
>
> wget …
>
> tar…
>
> cd
>
> create a Makefile.inc with:
>
> #  This file is part of MUMPS 5.0.0, released
>
> #  on Fri Feb 20 08:19:56 UTC 2015
>
> #
>
> #Begin orderings
>
>
>
> # NOTE that PORD is distributed within MUMPS by default. If you would like to
>
> # use other orderings, you need to obtain the corresponding package and modify
>
> # the variables below accordingly.
>
> # For example, to have Metis available within MUMPS:
>
> #          1/ download Metis and compile it
>
> #          2/ uncomment (suppress # in first column) lines
>
> #             starting with LMETISDIR,  LMETIS
>
> #          3/ add -Dmetis in line ORDERINGSF
>
> #             ORDERINGSF  = -Dpord -Dmetis
>
> #          4/ Compile and install MUMPS
>
> #             make clean; make   (to clean up previous installation)
>
> #
>
> #          Metis/ParMetis and SCOTCH/PT-SCOTCH (ver 6.0 and later) orderings are now available for MUMPS.
>
> #
>
>
>
> #SCOTCHDIR  = ${HOME}/scotch_6.0
>
> #ISCOTCH    = -I$(SCOTCHDIR)/include  # Should be provided for pt-scotch (not needed for Scotch)
>
> #
>
> # You have to choose one among the following two lines depending on
>
> # the type of analysis you want to perform. If you want to perform only
>
> # sequential analysis choose the first (remember to add -Dscotch in the ORDERINGSF
>
> # variable below); for both parallel and sequential analysis choose the second
>
> # line (remember to add -Dptscotch in the ORDERINGSF variable below)
>
>
>
> #LSCOTCH    = -L$(SCOTCHDIR)/lib -lesmumps -lscotch -lscotcherr
>
> #LSCOTCH    = -L$(SCOTCHDIR)/lib -lptesmumps -lptscotch -lptscotcherr -lscotch
>
>
>
>
>
> LPORDDIR = $(topdir)/PORD/lib/
>
> IPORD    = -I$(topdir)/PORD/include/
>
> LPORD    = -L$(LPORDDIR) -lpord
>
>
>
> #LMETISDIR = /local/metis/
>
> #IMETIS    = # should be provided if you use parmetis, to access parmetis.h
>
>
>
> # You have to choose one among the following two lines depending on
>
> # the type of analysis you want to perform. If you want to perform only
>
> # sequential analysis choose the first (remember to add -Dmetis in the ORDERINGSF
>
> # variable below); for both parallel and sequential analysis choose the second
>
> # line (remember to add -Dparmetis in the ORDERINGSF variable below)
>
>
>
> #LMETIS    = -L$(LMETISDIR) -lmetis
>
> #LMETIS    = -L$(LMETISDIR) -lparmetis -lmetis
>
>
>
> # The following variables will be used in the compilation process.
>
> # Please note that -Dptscotch and -Dparmetis imply -Dscotch and -Dmetis respectively.
>
> #ORDERINGSF = -Dscotch -Dmetis -Dpord -Dptscotch -Dparmetis
>
> ORDERINGSF  = -Dpord
>
> ORDERINGSC  = $(ORDERINGSF)
>
>
>
> LORDERINGS = $(LMETIS) $(LPORD) $(LSCOTCH)
>
> IORDERINGSF = $(ISCOTCH)
>
> IORDERINGSC = $(IMETIS) $(IPORD) $(ISCOTCH)
>
>
>
> #End orderings
>
> ########################################################################
>
> ################################################################################
>
>
>
> PLAT    =
>
> LIBEXT  = .a
>
> OUTC    = -o
>
> OUTF    = -o
>
> RM = /bin/rm -f
>
> CC = mpicc
>
> FC = mpif77
>
> FL = mpif77
>
> AR = ar vr
>
> #RANLIB = ranlib
>
> RANLIB  = echo
>
> SCALAP  = -L$(SCALAPACK_LIB) -L$(OPENBLAS_LIB) -lscalapack -lopenblas
>
> INCPAR = -I$(MPI_DIR)/include
>
> # LIBPAR = $(SCALAP)  -L/usr/local/lib/ -llamf77mpi -lmpi -llam
>
> LIBPAR = $(SCALAP)  -L$(MPI_DIR)/lib -lmpi
>
> #LIBPAR = -lmpi++ -lmpi -ltstdio -ltrillium -largs -lt
>
> INCSEQ = -I$(topdir)/libseq
>
> LIBSEQ  =  -L$(topdir)/libseq -lmpiseq
>
> LIBBLAS = -lopenblas
>
> LIBOTHERS = -lpthread -lgomp
>
> #Preprocessor defs for calling Fortran from C (-DAdd_ or -DAdd__ or -DUPPER)
>
> CDEFS   = -DAdd_
>
>
>
> #Begin Optimized options
>
> #OPTF    = -O  -DALLOW_NON_INIT -nofor_main
>
> #OPTL    = -O -nofor_main
>
> OPTF    = -O  -DALLOW_NON_INIT
>
> OPTL    = -O
>
> OPTC    = -O
>
> #End Optimized options
>
> INCS = $(INCPAR)
>
> LIBS = $(LIBPAR)
>
> LIBSEQNEEDED =
>
> Then build and run the example as before:
>
> make
>
> cd examples
>
> make
>
> mpirun -np 2 ./dsimpletest < input_simpletest_real
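>
> (Note: the Makefile.inc above picks up SCALAPACK_LIB, OPENBLAS_LIB and
> MPI_DIR from the environment -- presumably set by the corresponding
> modules in our setup. If you need to set them by hand, it would look
> roughly like the following; the paths are placeholders, not our actual
> locations:)
>
> export MPI_DIR=/path/to/mvapich2            # MVAPICH2 install prefix
> export SCALAPACK_LIB=/path/to/scalapack/lib
> export OPENBLAS_LIB=/path/to/openblas/lib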
>
>
>
>
>
> Thanks,
>
> John
>
>
>
>
>
> *From:* hari.subramoni at gmail.com [mailto:hari.subramoni at gmail.com] *On
> Behalf Of *Hari Subramoni
> *Sent:* Thursday, October 13, 2016 10:54 AM
> *To:* Westlund, John A <john.a.westlund at intel.com>
> *Cc:* mvapich-discuss at cse.ohio-state.edu
> *Subject:* Re: [mvapich-discuss] hangs when running MUMPS w/ MVAPICH2.2
> built for PSM
>
>
>
> Hello John,
>
>
>
> Thanks for the report. Sorry to hear that MV2 2.2 is hanging. We've not
> seen this before.
>
>
>
> Can you send us the following details?
>
>
>
> 1. Output of mpiname -a
>
> 2. At what scale (number of processes / nodes) does the issue occur?
>
> 3. The source code and build instructions for "MUMPS", so that we can try
> it out locally
>
>
>
> Thx,
> Hari.
>
>
>
> On Thu, Oct 13, 2016 at 1:41 PM, Westlund, John A <
> john.a.westlund at intel.com> wrote:
>
> I also posted this to the MUMPS list, but I'm only seeing these hangs with
> v2.2 -- it works with v2.1.
>
>
>
> I've been running the MUMPS example tests csimpletest.F, dsimpletest.F,
> ssimpletest.F and zsimpletest.F. They run successfully with OpenMPI and
> with MVAPICH2 v2.2 built for verbs on Mellanox hardware, but on QLogic
> hardware with MVAPICH2 v2.2 built for PSM the same tests hang in the
> Factorization step:
>
>
>
> #  =================================================
>
> #  MUMPS compiled with option -DALLOW_NON_INIT
>
> #  =================================================
>
> # L U Solver for unsymmetric matrices
>
> # Type of parallelism: Working host
>
> #
>
> #  ****** ANALYSIS STEP ********
>
> #
>
> #  ... Structural symmetry (in percent)=   92
>
> #  ... No column permutation
>
> #  Ordering based on AMF
>
> #
>
> # Leaving analysis phase with  ...
>
> # INFOG(1)                                       =               0
>
> # INFOG(2)                                       =               0
>
> #  -- (20) Number of entries in factors (estim.) =              15
>
> #  --  (3) Storage of factors  (REAL, estimated) =              15
>
> #  --  (4) Storage of factors  (INT , estimated) =              59
>
> #  --  (5) Maximum frontal size      (estimated) =               3
>
> #  --  (6) Number of nodes in the tree           =               3
>
> #  -- (32) Type of analysis effectively used     =               1
>
> #  --  (7) Ordering option effectively used      =               2
>
> # ICNTL(6) Maximum transversal option            =               0
>
> # ICNTL(7) Pivot order option                    =               7
>
> # Percentage of memory relaxation (effective)    =              20
>
> # Number of level 2 nodes                        =               0
>
> # Number of split nodes                          =               0
>
> # RINFOG(1) Operations during elimination (estim)=   1.900D+01
>
> #  ** Rank of proc needing largest memory in IC facto        :         0
>
> #  ** Estimated corresponding MBYTES for IC facto            :         1
>
> #  ** Estimated avg. MBYTES per work. proc at facto (IC)     :         1
>
> #  ** TOTAL     space in MBYTES for IC factorization         :         4
>
> #  ** Rank of proc needing largest memory for OOC facto      :         0
>
> #  ** Estimated corresponding MBYTES for OOC facto           :         1
>
> #  ** Estimated avg. MBYTES per work. proc at facto (OOC)    :         1
>
> #  ** TOTAL     space in MBYTES for OOC factorization        :         4
>
> #  ELAPSED TIME IN ANALYSIS DRIVER=       0.0020
>
> #
>
> #  ****** FACTORIZATION STEP ********
>
> #
>
> #
>
> #  GLOBAL STATISTICS PRIOR NUMERICAL FACTORIZATION ...
>
> #  NUMBER OF WORKING PROCESSES              =             4
>
> #  OUT-OF-CORE OPTION (ICNTL(22))           =             0
>
> #  REAL SPACE FOR FACTORS                   =            15
>
> #  INTEGER SPACE FOR FACTORS                =            59
>
> #  MAXIMUM FRONTAL SIZE (ESTIMATED)         =             3
>
> #  NUMBER OF NODES IN THE TREE              =             3
>
> #  MEMORY ALLOWED (MB -- 0: N/A )           =             0
>
> #  Convergence error after scaling for ONE-NORM (option 7/8)   = 0.38D+00
>
> #  Maximum effective relaxed size of S              =           359
>
> #  Average effective relaxed size of S              =           351
>
> #  GLOBAL TIME FOR MATRIX DISTRIBUTION       =      0.0000
>
> #  ** Memory relaxation parameter ( ICNTL(14)  )            :        20
>
> #  ** Rank of processor needing largest memory in facto     :         0
>
> #  ** Space in MBYTES used by this processor for facto      :         1
>
> #  ** Avg. Space in MBYTES per working proc during facto    :         1
>
> # srun: Job step aborted: Waiting up to 32 seconds for job step to finish.
>
> # slurmstepd: error: *** JOB 104 ON c3 CANCELLED AT 2016-10-08T19:51:29 DUE TO TIME LIMIT ***
>
> # slurmstepd: error: *** STEP 104.0 ON c3 CANCELLED AT 2016-10-08T19:51:29 DUE TO TIME LIMIT ***
>
>
>
> I'm not sure yet why the factorization doesn't complete and print the next
> message:
>
> ELAPSED TIME FOR FACTORIZATION           =      0.0013
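>
> (If it would help, one way to see where the ranks are stuck would be to
> attach gdb to the hung dsimpletest processes on each node and dump all
> thread backtraces, roughly like this, where <pid> is the process id of a
> hung rank:)
>
> # on each compute node, once per hung MPI rank
> gdb -batch -p <pid> -ex 'thread apply all bt'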
>
>
>
> Thoughts?
>
> John
>
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
>
>
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: psm.patch
Type: application/octet-stream
Size: 738 bytes
Desc: not available
URL: <http://mailman.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20161017/dd3c4e44/attachment-0001.obj>

