[mvapich-discuss] invalid communicator

Dhabaleswar Panda panda at cse.ohio-state.edu
Wed Nov 12 00:10:18 EST 2008


Dan - The reported error was caused by the application using an incorrect
datatype format. The user code was failing with MVAPICH2 1.2RC2 and also
with MPICH2. Once the datatype formatting issue was fixed, the application
ran successfully with MVAPICH2 1.2RC2. Here is a short explanation of the
issue, as analyzed by ANL researchers.

>The problem is that in Allgather, the contribution from rank 0 is stored
>at outbuf, the contribution from rank 1 is stored at outbuf + recvcount *
>extent(recvtype), the contribution from rank 2 is stored at outbuf + 2 *
>recvcount * extent(recvtype), and so on. He has neglected to take the
>extent of the recvtype into account, and hence expects the data to be
>placed elsewhere. Ask him to look at the definition of MPI_Gather, which
>explains how the received data is placed. I didn't check his test
>program, but I suspect it writes outside the allocated buffer as a
>result.
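
The placement rule quoted above can be modeled in a few lines. This is only
a sketch of the byte arithmetic, not MPI code: the function names, the
3-rank setup, and the 4-byte size / 8-byte extent are hypothetical values
chosen for illustration. In a real program the extent would come from the
MPI library (for example via MPI_Type_get_extent).

```python
def allgather_offsets(nranks, recvcount, extent):
    """Byte offset in the receive buffer where each rank's block starts."""
    return [rank * recvcount * extent for rank in range(nranks)]


def simulate_allgather(contributions, recvcount, size, extent):
    """Model the placement: each element carries `size` bytes of data,
    but consecutive elements sit `extent` bytes apart."""
    nranks = len(contributions)
    buf = bytearray(nranks * recvcount * extent)
    for rank, data in enumerate(contributions):
        base = rank * recvcount * extent
        for i in range(recvcount):
            start = base + i * extent
            buf[start:start + size] = data[i * size:(i + 1) * size]
    return buf


# Hypothetical setup: 3 ranks, 2 elements each, 4 data bytes per element,
# and a datatype whose extent (8) is larger than its size (4).
size, extent, recvcount = 4, 8, 2
contributions = [bytes([rank + 1]) * (recvcount * size) for rank in range(3)]

buf = simulate_allgather(contributions, recvcount, size, extent)
correct = allgather_offsets(3, recvcount, extent)  # spacing by extent
mistaken = allgather_offsets(3, recvcount, size)   # spacing by size only

print("correct offsets: ", correct)
print("mistaken offsets:", mistaken)
print("buffer needed:", len(buf), "bytes; sized by size only:",
      3 * recvcount * size, "bytes")
```

In this model, code that looks for rank 1's data at the mistaken offset
finds rank 0's second element instead, and a receive buffer sized from the
datatype size alone is too small for where the data actually lands.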

Please check whether this situation arises in your application. You can
also run your application with the standard MPICH or MPICH2 stack over the
TCP/IP interface (Ethernet), which will rule out any IB-related issues.

Thanks,

DK

On Tue, 11 Nov 2008, Dan Kokron wrote:

> I am seeing the following error message while running under mvapich-1.1rc1.
>
> 151 - MPI_SCATTERV : Communicator argument is not a valid communicator
> Special bit pattern 37c00000 in communicator is incorrect.  May indicate an
> out-of-order argument or a freed communicator
>
> I noticed that a similar issue was raised in this forum in July and was followed up in Sept. with
> http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/2008-September/001916.html
>
> The followup does not indicate the coding error that was fixed.  What
> coding error should I be looking for in my code?  Any other suggestions
> to try?
>
> mpif90 -show
> ln -s /home/dkokron/play/mvapich-1.1rc1/include/mpif.h mpif.h
> /usr/local/intel/comp/9.1.052/bin/ifort -L/usr/lib64
> -L/home/dkokron/play/mvapich-1.1rc1/lib -lmpichf90nc -lmpich
> -L/usr/lib64 -Wl,-rpath=/usr/lib64 -libverbs -libumad -lpthread
> -lpthread -lrt
>
>
> --
> Dan Kokron
> Global Modeling and Assimilation Office
> NASA Goddard Space Flight Center
> Greenbelt, MD 20771
> Daniel.S.Kokron at nasa.gov
> Phone: (301) 614-5192
> Fax:   (301) 614-5304
>
