[mvapich-discuss] perhaps odd behavior..

Steve Heistand steve.heistand at nasa.gov
Fri Jan 9 15:02:07 EST 2015



We have the latest MVAPICH2 build:

MVAPICH2 2.1rc1 Thu Dec 18 20:00:00 EDT 2014 ch3:mrail

Compilation
CC: icc -fpic -m64   -DNDEBUG -DNVALGRIND -O2
CXX: icpc -fpic -m64  -DNDEBUG -DNVALGRIND -O2
F77: ifort -L/lib -L/lib -m64 -fpic  -O2
FC: ifort -m64 -fpic  -O2

Configuration
--with-device=ch3:mrail --with-rdma=gen2 CC=icc CXX=icpc F77=ifort FC=ifort CFLAGS=-fpic
-m64 CXXFLAGS=-fpic -m64 FFLAGS=-m64 -fpic FCFLAGS=-m64 -fpic --enable-f77 --enable-fc
--enable-cxx --enable-romio --enable-threads=default --with-hwloc -disable-multi-aliases
-enable-xrc=no -enable-hybrid --prefix=XXX --with-file-system=lustre

It was compiled on, and for the most part is run on, machines that have one IB card with
dual ports. That all works fine.
However, when we run on a system that has two cards, each with a single port, the job dies
at startup.

If I tell it via environment variables that the system is dual-HCA, single-port, it runs fine.
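
For illustration only, the kind of override meant here can be sketched as below. It
assumes the MVAPICH2 runtime parameters MV2_NUM_HCAS and MV2_NUM_PORTS; the exact
variables used are not spelled out above, and the helper name mv2_env_for is
hypothetical.

```python
import os


def mv2_env_for(num_hcas, ports_per_hca, base_env=None):
    """Return a copy of the environment with an explicit HCA/port layout.

    Assumption: MV2_NUM_HCAS and MV2_NUM_PORTS are the variables of interest;
    nothing in this thread confirms which ones were actually set.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MV2_NUM_HCAS"] = str(num_hcas)        # number of adapters to use
    env["MV2_NUM_PORTS"] = str(ports_per_hca)  # ports to use on each adapter
    return env


if __name__ == "__main__":
    # Dual-HCA, single-port node, as in the failing configuration.
    env = mv2_env_for(num_hcas=2, ports_per_hca=1)
    print(env["MV2_NUM_HCAS"], env["MV2_NUM_PORTS"])
```

How these variables reach the MPI ranks depends on the launcher in use, so the
returned dictionary is only a starting point.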

At this point I'm unsure whether it actually uses both ports in either configuration.

I would have thought it would probe the hardware to figure out what setup
it has when it tries to bond to the multiple ports.
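
As a point of comparison, the layout a node actually exposes can be read from the
Linux sysfs tree. This is a minimal sketch, assuming the standard
/sys/class/infiniband/<hca>/ports/<n>/state layout; it is not how MVAPICH2 itself
probes the hardware, and the function name list_hca_ports is hypothetical.

```python
import os


def list_hca_ports(sysfs_root="/sys/class/infiniband"):
    """Return {hca_name: {port_number: state_string}} as seen in sysfs."""
    topology = {}
    if not os.path.isdir(sysfs_root):
        return topology  # no InfiniBand devices visible on this node
    for hca in sorted(os.listdir(sysfs_root)):
        ports_dir = os.path.join(sysfs_root, hca, "ports")
        ports = {}
        if os.path.isdir(ports_dir):
            # Port directories are named by number; skip anything else.
            numbers = sorted(int(p) for p in os.listdir(ports_dir) if p.isdigit())
            for num in numbers:
                state_file = os.path.join(ports_dir, str(num), "state")
                try:
                    with open(state_file) as fh:
                        state = fh.read().strip()  # e.g. "4: ACTIVE"
                except OSError:
                    state = "unknown"
                ports[num] = state
        topology[hca] = ports
    return topology


if __name__ == "__main__":
    for hca, ports in list_hca_ports().items():
        for num, state in ports.items():
            print(f"{hca} port {num}: {state}")
```

Running this on both kinds of node shows whether the one-card/two-port and
two-card/one-port layouts are reported as expected before MPI_Init is involved.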

Unless it's actually crashing in the probe section of the MPI_Init routines...

Thoughts?

thanks

s


-- 
************************************************************************
 Steve Heistand                          NASA Ames Research Center
 Email: steve.heistand at nasa.gov          Steve Heistand/Mail Stop 258-6
 Work Phone: (650) 604-4369              Bldg. 258, Rm. 232-5
 Scientific & HPC Application            P.O. Box 1
 Development/Optimization                Moffett Field, CA 94035-0001
************************************************************************
 "Any opinions expressed are those of our alien overlords, not my own."


