[mvapich-discuss] FW: Possible bug: Segmentation fault with MPI_Reduce and MPI_IN_PLACE

Alexander Alekhin alexander.alekhin at itseez.com
Wed Nov 24 03:55:44 EST 2010


Hi Krishna,

 

I launched the job with the MV2_USE_SHMEM_REDUCE=0 flag and it finished
successfully. I assume that setting this flag to 0 causes some performance
degradation on SMP systems.
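
(For reference, one way to pass this variable through mpiexec, assuming the
same launcher as in the original report, would be something like the
following; with mpirun_rsh it would instead be given as
MV2_USE_SHMEM_REDUCE=0 on the command line, so treat this as a sketch:

mpiexec -np 2 -env MV2_USE_SHMEM_REDUCE 0 -host <host_name> <binary_file>

)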

 

--

Thanks,

Alexander Alekhin

 

From: krishna.kandalla at gmail.com [mailto:krishna.kandalla at gmail.com] On
Behalf Of Krishna Kandalla
Sent: Monday, November 22, 2010 10:28 PM
To: Alexander Alekhin
Cc: mvapich-discuss at cse.ohio-state.edu
Subject: Re: [mvapich-discuss] FW: Possible bug: Segmentation fault with
MPI_Reduce and MPI_IN_PLACE

 

Hi Alexander, 

                  Thank you for reporting this error. Could you please try
running your application with the MV2_USE_SHMEM_REDUCE flag set to 0? You
can find more information about this run-time variable at:

http://mvapich.cse.ohio-state.edu/support/user_guide_mvapich2-1.6rc1.html

 

Thanks,

Krishna

On Mon, Nov 22, 2010 at 11:06 AM, Alexander Alekhin
<alexander.alekhin at itseez.com> wrote:

Hi,

 

I am using the MVAPICH2 1.5.1p1 code from svn.
My mpiname output:
MVAPICH2 1.5.1p1 Unofficial Build ch3:mrail

 

Compilation

CC: gcc  -g

CXX: c++  -g

F77: g77  -g

F90: f95  -g

 

Configuration

--prefix=$HOME/mvapich2/install --enable-g=all --enable-error-messages=all
--enable-fast=none

 

My problem is an application failure when using MPI_IN_PLACE with the
MPI_Reduce operation.

 

As an example, here is the part of the code that generates a segmentation
fault with 2 processes launched on the same node:

 

// myid is rank in MPI_COMM_WORLD (see examples/cpi.c)
{
    MPI_Group g1, g2;
    MPI_Comm comm;
    int ranks[2] = { 1, 0 };
    MPI_Comm_group(MPI_COMM_WORLD, &g1);
    MPI_Group_incl(g1, 2, ranks, &g2);
 
    MPI_Comm_create(MPI_COMM_WORLD, g2, &comm);
 
    if (myid == 0) { // rank 1 of comm (root of Reduce)
        int result = myid;
        if (MPI_Reduce(MPI_IN_PLACE, &result, 1, MPI_INT, MPI_SUM, 1, comm) != MPI_SUCCESS) // fail is here
            exit(1);
    } else {
        if (MPI_Reduce(&myid, NULL, 1, MPI_INT, MPI_SUM, 1, comm) != MPI_SUCCESS)
            exit(1);
    }
 
    MPI_Comm_free(&comm);
    MPI_Group_free(&g2);
    MPI_Group_free(&g1);
}

 

Command to launch:

mpiexec -np 2 -host <host_name> <binary_file>

 

GDB info:

0:  Program received signal SIGSEGV, Segmentation fault.
0:  0x0000000000411a73 in MPIUI_Memcpy (dst=0x2aaaaadcfc0c, src=0xffffffffffffffff, len=4) at ../../include/mpiimpl.h:122
0:  122     memcpy(dst, src, len);
0: (gdb) bt
0:  #0  0x0000000000411a73 in MPIUI_Memcpy (dst=0x2aaaaadcfc0c, src=0xffffffffffffffff, len=4) at ../../include/mpiimpl.h:122
0:  #1  0x0000000000414eb3 in MPIR_Localcopy (sendbuf=0xffffffffffffffff, sendcount=1, sendtype=1275069445, recvbuf=0x2aaaaadcfc0c, recvcount=1, recvtype=1275069445) at helper_fns.c:335
0:  #2  0x0000000000410d48 in PMPI_Reduce (sendbuf=0xffffffffffffffff, recvbuf=0x7fffcb044bcc, count=1, datatype=1275069445, op=1476395011, root=1, comm=-1006632960) at reduce_osu.c:1017

 

If I replace MPI_IN_PLACE with a plain variable, then everything works fine.
(In the backtrace, sendbuf=0xffffffffffffffff looks like the MPI_IN_PLACE
sentinel being passed straight through to MPIR_Localcopy.)
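
For example, a minimal sketch of that workaround at the root (the name
send_val is just illustrative; the rest of the fragment stays unchanged):

    if (myid == 0) { // rank 1 of comm (root of Reduce)
        int result = 0;
        int send_val = myid; // use a real send buffer instead of MPI_IN_PLACE
        if (MPI_Reduce(&send_val, &result, 1, MPI_INT, MPI_SUM, 1, comm) != MPI_SUCCESS)
            exit(1);
    }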

 

Can somebody check this problem?

 

--

Thanks,

Alexander Alekhin

 


_______________________________________________
mvapich-discuss mailing list
mvapich-discuss at cse.ohio-state.edu
http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss

 
