[mvapich-discuss] Help problem MPI_Bcast fails on np=8 with 8MB buffer
Terrence.LIAO at total.com
Thu Aug 21 17:55:56 EDT 2008
Dear mvapich,
I get a core dump in MPI_Bcast(buffer, n, MPI_DOUBLE,...) when n is
1024*1024, i.e. an 8MB buffer, with np=8 on 8 compute nodes. I have NO
problem when using np=7. I am using mvapich-1.0 (the Feb 28 2008 download)
on an AMD cluster: dual-socket quad-core nodes with 16GB of memory and
4x DDR IB. mvapich is built with the pgi 7.1 compiler. Below is the gdb
output. Any suggestions on how to fix this problem? Thank you very much.
-- Terrence
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 182894245856 (LWP 18383)]
0x00000036d80723e3 in memcpy () from /lib64/tls/libc.so.6
(gdb) where
#0 0x00000036d80723e3 in memcpy () from /lib64/tls/libc.so.6
#1 0x0000000000449c09 in MPID_VIA_self_start (buf=0x2a96546010,
len=8388608, src_lrank=0, tag=2,
context_id=0, shandle=0x57a1e8) at viasend.c:276
#2 0x000000000044c205 in MPID_IsendContig (comm_ptr=0x5a2060,
buf=0x2a96546010, len=8388608,
src_lrank=0, tag=2, context_id=0, dest_grank=0,
msgrep=MPID_MSGREP_RECEIVER, request=0x57a1e8,
error_code=0x7fbfffe66c) at mpid_send.c:84
#3 0x0000000000435cfd in MPID_IsendDatatype (comm_ptr=0x5a2060,
buf=0x2a96546010, count=1048576,
dtype_ptr=0x56ac60, src_lrank=0, tag=2, context_id=0, dest_grank=0,
request=0x57a1e8,
error_code=0x7fbfffe66c) at mpid_hsend.c:129
#4 0x0000000000443215 in PMPI_Isend (buf=0x2a96546010, count=1048576,
datatype=11, dest=0, tag=2,
comm=91, request=0x7fbfffe710) at isend.c:97
#5 0x0000000000444710 in PMPI_Sendrecv (sendbuf=0x2a96546010,
sendcount=1048576, sendtype=11,
dest=0, sendtag=2, recvbuf=0x2a96d4bc00, recvcount=1048576,
recvtype=11, source=0, recvtag=2,
comm=91, status=0x7fbfffe820) at sendrecv.c:95
#6 0x000000000041c355 in intra_shmem_Bcast_Large (buffer=0x2a96546010,
count=1048576,
datatype=0x56ac60, nbytes=8388608, root=0, comm=0x5a2060) at
intra_fns_new.c:1704
#7 0x000000000041b6b4 in intra_Bcast_Large (buffer=0x2a96546010,
count=1048576, datatype=0x56ac60,
nbytes=8388608, root=0, comm=0x5a2060) at intra_fns_new.c:1309
#8 0x000000000041b157 in intra_newBcast (buffer=0x2a96546010,
count=1048576, datatype=0x56ac60,
root=0, comm=0x5a2060) at intra_fns_new.c:1117
#9 0x0000000000412008 in PMPI_Bcast (buffer=0x2a96546010, count=1048576,
datatype=11, root=0,
comm=91) at bcast.c:122
#10 0x00000000004042de in main (argc=2, argv=0x7fbfffee98) at
large-mpi_bcast_test.c:159
(gdb)
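As a quick sanity check on the trace above, the sketch below recomputes the payload size from the count passed to MPI_Bcast and compares it with the len=8388608 that gdb reports in frames #1 and #2. It assumes MPI_DOUBLE corresponds to an 8-byte C double, as on x86-64. The helper names bcast_payload_bytes and matches_trace_len are hypothetical and are not part of mvapich or of large-mpi_bcast_test.c.

```c
#include <stddef.h>

/* Bytes moved by a broadcast of `count` elements of `elem_size` bytes each.
 * Hypothetical helper for illustration only. */
size_t bcast_payload_bytes(size_t count, size_t elem_size)
{
    return count * elem_size;
}

/* Returns 1 when the computed payload equals the len value shown by gdb,
 * 0 otherwise. Hypothetical helper for illustration only. */
int matches_trace_len(size_t count, size_t elem_size, size_t trace_len)
{
    return bcast_payload_bytes(count, elem_size) == trace_len;
}
```

Calling matches_trace_len(1024 * 1024, sizeof(double), 8388608) should return 1 on a platform with 8-byte doubles, which would suggest the count and datatype reach the send path intact and that the length itself is not corrupted by the time memcpy is called.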