[mvapich-discuss] MPI communication problem with mvapich2-1.8a1p1

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Jan 27 17:38:55 EST 2012


Please try the following...
./configure --prefix=/usr/local/mvapich2-1.8a1p1-gcc --enable-fast
--enable-f77 --enable-fc --enable-cxx --enable-romio --enable-mpe
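
After configure completes, a rebuild and re-run would look roughly like the
sketch below (the install prefix matches the one above; the process count,
hostfile, and the use of mpirun_rsh rather than your mpiexec are just
examples for illustration):

make && make install
/usr/local/mvapich2-1.8a1p1-gcc/bin/mpirun_rsh -np 16 -hostfile ./hosts ./IMB-MPI1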

If you would like to try and provide stack traces to us, use...
./configure --prefix=/usr/local/mvapich2-1.8a1p1-gcc --disable-fast
--enable-g=dbg --enable-f77 --enable-fc --enable-cxx --enable-romio
--enable-mpe
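
Once the debug build hangs, backtraces from the stuck ranks are the most
useful thing to send us. A rough sketch (the process name and the pid are
just placeholders for what you actually see on the worker nodes):

# on each worker node running a hung rank
pidof IMB-MPI1
gdb -batch -ex "thread apply all bt" -p <pid>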

On Fri, Jan 27, 2012 at 5:31 PM, Nirmal Seenu <nirmal at fnal.gov> wrote:
> Hi,
>
> I doubt that the options used to build MVAPICH2 are the problem here, as
> the remote MPI processes launch successfully and do a little bit of
> communication before they hang.
>
> I use the same options to build mvapich2-1.2p1, mvapich2-1.5,
> mvapich2-1.6rc2, and mvapich2-1.6-r4751, and they all work fine.
>
> What options do I need when building MVAPICH2 so that the mpiexec
> launcher can use the TM interface to launch MPI jobs?
>
> Nirmal
>
>
> On 01/27/2012 03:53 PM, Jonathan Perkins wrote:
>>
>> Hello Nirmal, sorry to hear that you're having trouble.  Let me
>> suggest that you remove some of the options that you've specified at
>> the configure step.  We no longer support MPD, so you should remove the
>> --enable-pmiport and --with-pm=mpd options.  I actually think it'll be
>> simpler for you to start with a minimal set of options and then add an
>> option back only once things are working and you find you need it.
>>
>> Please try the following configuration for MVAPICH2 and let us know
>> whether you still have trouble.
>> ./configure --prefix=/usr/local/mvapich2-1.8a1p1-gcc --enable-fast
>> --enable-f77 --enable-fc --enable-cxx --enable-romio --enable-mpe
>>
>> On Fri, Jan 27, 2012 at 3:57 PM, Nirmal Seenu <nirmal at fnal.gov> wrote:
>>>
>>> I am having trouble running the Intel MPI Benchmark (IMB_3.2.3, where I
>>> run IMB-MPI1 without any options) on the latest version, MVAPICH2-1.8a1p1.
>>>
>>> The MPI processes get launched properly on the worker nodes, but the
>>> benchmark hangs within a few seconds after the launch and doesn't make
>>> any progress. I checked the InfiniBand fabric and everything is healthy.
>>> We mount Lustre over native IB on all the worker nodes, and the Lustre
>>> mounts are healthy as well.
>>>
>>> This is reproducible with MVAPICH2 compiled with GCC as well as with
>>> the PGI compiler 11.7.
>>>
>>> Details about the installation:
>>>
>>> The worker nodes run RHEL 5.3 with the latest kernel,
>>> 2.6.18-274.17.1.el5, and we use the InfiniBand drivers that are
>>> distributed as part of the kernel.
>>>
>>> The MVAPICH2 gcc build was compiled with the following compiler:
>>> gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-50)
>>>
>>> The following options were used to compile MVAPICH2 and MPIEXEC:
>>>
>>> export CC=gcc
>>> export CXX=g++
>>> export F77=gfortran
>>> export FC=gfortran
>>>
>>> export CFLAGS=-mcmodel=medium
>>> export CXXFLAGS=-mcmodel=medium
>>> export FFLAGS=-mcmodel=medium
>>> export FCFLAGS=-mcmodel=medium
>>> export LDFLAGS=-mcmodel=medium
>>>
>>> MVAPICH2:
>>> ./configure --prefix=/usr/local/mvapich2-1.8a1p1-gcc --enable-fast
>>> --enable-f77 --enable-fc --enable-cxx --enable-romio --enable-pmiport
>>> --enable-mpe --with-pm=mpd --with-pmi=simple --with-thread-package
>>> --with-hwloc
>>>
>>> MPIEXEC:
>>> ./configure --prefix=/usr/local/mvapich2-1.8a1p1-gcc
>>> --with-pbs=/usr/local/pbs
>>> --with-mpicc=/usr/local/mvapich2-1.8a1p1-gcc/bin/mpicc
>>> --with-mpicxx=/usr/local/mvapich2-1.8a1p1-gcc/bin/mpicxx
>>> --with-mpif77=/usr/local/mvapich2-1.8a1p1-gcc/bin/mpif77
>>> --with-mpif90=/usr/local/mvapich2-1.8a1p1-gcc/bin/mpif90
>>> --disable-mpich-gm
>>> --disable-mpich-p4 --disable-mpich-rai --with-default-comm=pmi
>>>
>>> I was able to run the Intel MPI Benchmark using the following versions
>>> of MVAPICH2 that were compiled with the same version of gcc:
>>> mvapich2-1.2p1
>>> mvapich2-1.5
>>> mvapich2-1.6rc2
>>> mvapich2-1.6-r4751
>>>
>>> I will be more than happy to provide more details if needed. Thanks in
>>> advance for looking into this problem.
>>>
>>> Nirmal
>>> _______________________________________________
>>> mvapich-discuss mailing list
>>> mvapich-discuss at cse.ohio-state.edu
>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>
>>
>>
>>
>



-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo


