[mvapich-discuss] MVAPICH2 Invalid communicator errors
Dhabaleswar Panda
panda at cse.ohio-state.edu
Wed Sep 12 23:07:38 EDT 2007
Mark,
Sorry to know that you are facing this problem. We have used MVAPICH2
0.9.8 with the Intel 9.1 compiler and do not see any problem. In fact, as
part of our regular testing, we test the code with all four
compilers: gcc, Intel, PathScale, and PGI. It appears to be a
set-up issue.
May we know which version of MVAPICH2 0.9.8 you are using (from OFED
or from the MVAPICH page)?
Thanks,
DK
> The MVAPICH2 was built with Intel 9.1-038 and I attempted to build
> and run the app with the identical compiler.
>
> I use modules to properly set PATH, LD_LIBRARY_PATH, MAN_PATH,
> LIBRARY_PATH, etc. for MVAPICH2 to avoid just some of the problems
> you describe.
>
> The simple test codes are all pure C, no Fortran.
>
> My description of building and running with the gcc compilers was to
> indicate that I had most of the MVAPICH2 pieces working, and even
> the same test programs worked when using the gcc compilers.
>
> Has anyone had similar problems when building and running
> MVAPICH2-0.9.8 with the Intel compilers?
>
> regards,
>
> Tom Mitchell wrote:
> > On Sep 12 03:41, Mark Potts wrote:
> >> Hi,
> >> Taking my first steps with MVAPICH2-0.9.8 from OFED 1.2.
> >>
> >> I am able to build and run apps using gcc to build MVAPICH2 and apps.
> >>
> >> I am also (apparently) successfully building MVAPICH2 and simple C
> >> apps with an icc Intel 9.1 compiler. However, any
> >> Intel-built app that I attempt to run fails with output
> >> similar to that shown at the bottom. The same apps build and run
> >> quite nicely using gcc. The failure occurs at the very beginning
> >> of the program, after:
> >
> > Are you compiling MVAPICH2 with one compiler and the application
> > with another? In general you do not want to do this.
> >
> > When building any MPI, the helper scripts (mpicc and friends)
> > have specific knowledge of the compiler and libs. After
> > building the binary, it can be important to ensure that paths
> > (both PATH and LD_LIBRARY_PATH, etc.; the csh path variable
> > vs. PATH too) find all the correct bits first. These environment
> > variables matter both on the command line and on the compute
> > nodes, including localhost.
> >
> > When you get to Fortran there is a risk of subtle binary
> > incompatibility issues to confound you if you are mixing
> > compilers and other common libs. So always start with
> > all the bits being compiled with a single compiler suite.
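[Editor's note: the path checks described above might look like the following sketch. The install prefix is taken from the mpich2version output later in this thread, and the binary name from the mpiexec line; both are assumptions and may differ on your system.]

```shell
# Sketch only: point PATH and LD_LIBRARY_PATH at a single MVAPICH2 tree.
# The prefix below is the Intel build reported later in this thread.
MPI_HOME=/usr/mpi/mvapich2-0.9.8-12/intel
export PATH="$MPI_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$MPI_HOME/lib:$LD_LIBRARY_PATH"

# The wrapper should resolve from that tree and wrap icc, not gcc:
command -v mpicc      # should print $MPI_HOME/bin/mpicc
mpicc -show           # prints the underlying compiler command line

# The dynamic loader must pick up the MPI library from the same prefix:
ldd ./mpisanity_mv2 | grep -i mpi
```

The same checks are worth repeating on each compute node, since the launcher inherits a separate environment there.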
> >
> >
> >
> >
> >> MPI_Init (&argc, &argv);
> >>
> >> and before the completion of the next statement:
> >>
> >> MPI_Comm_rank (MPI_COMM_WORLD, &rank);
> >>
> >> Since we need both gcc and Intel versions of MVAPICH2, I would
> >> appreciate any help understanding what is going wrong during
> >> startup of these apps.
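[Editor's note: the source of mpisanity_mv2 is not shown in this thread, so the following is an assumed reconstruction of a minimal test of the shape described above, failing between MPI_Init and MPI_Comm_rank.]

```c
/* Minimal MPI sanity check (assumed reconstruction, not the actual
 * mpisanity_mv2 source). The reported failure occurs after MPI_Init
 * returns and before MPI_Comm_rank completes. */
#include <stdio.h>
#include <mpi.h>

int main(int argc, char **argv)
{
    int rank, size;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* fails here in the Intel build */
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    printf("Hello from rank %d of %d\n", rank, size);
    MPI_Finalize();
    return 0;
}
```

Built with the installation's own wrapper (e.g. `mpicc mpisanity.c -o mpisanity_mv2`) and launched via mpiexec as shown below.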
> >>
> >> The output from mpich2version is:
> >> Version: MVAPICH2-0.9.8
> >> Device: osu_ch3:mrail
> >> Configure Options: --prefix=/usr/mpi/mvapich2-0.9.8-12/intel
> >> --with-device=osu_ch3:mrail --with-rdma=gen2 --with-pm=mpd
> >> --enable-romio --enable-sharedlibs=gcc --without-mpe
> >>
> >>
> >> Typical failed mpiexec output:
> >>
> >> mpiexec -machinefile hostdews -n 3 ./mpisanity_mv2
> >> Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
> >> MPI_Comm_rank(105): MPI_Comm_rank(comm=0x1, rank=0x7fff4fb01cd8) failed
> >> MPI_Comm_rank(64).: Invalid communicator
> >> rank 2 in job 1 dewberry4_32925 caused collective abort of all ranks
> >> exit status of rank 2: killed by signal 9
> >> Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
> >> MPI_Comm_rank(105): MPI_Comm_rank(comm=0x1, rank=0x7fffddc500c8) failed
> >> MPI_Comm_rank(64).: Invalid communicator
> >> rank 1 in job 1 dewberry4_32925 caused collective abort of all ranks
> >> exit status of rank 1: killed by signal 9
> >> Fatal error in MPI_Comm_rank: Invalid communicator, error stack:
> >> MPI_Comm_rank(105): MPI_Comm_rank(comm=0x1, rank=0x7fff13581138) failed
> >> MPI_Comm_rank(64).: Invalid communicator
> >> rank 0 in job 1 dewberry4_32925 caused collective abort of all ranks
> >> exit status of rank 0: killed by signal 9
> >>
> >> regards,
> >> --
> >> ***********************************
> >> Mark J. Potts, PhD
> >>
> >> HPC Applications Inc.
> >> phone: 410-992-8360 Bus
> >>        410-313-9318 Home
> >>        443-418-4375 Cell
> >> email: potts at hpcapplications.com
> >>        potts at excray.com
> >> ***********************************
> >> _______________________________________________
> >> mvapich-discuss mailing list
> >> mvapich-discuss at cse.ohio-state.edu
> >> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >
>
>