[mvapich-discuss] Segmentation fault at some MPI functions after MPI_Put
Akihiro Tabuchi
tabuchi at hpcs.cs.tsukuba.ac.jp
Wed Nov 4 01:47:43 EST 2015
Dear Khaled and Jiri,
Thank you for your replies.
I forgot to mention that I set MV2_USE_GPUDIRECT_GDRCOPY=0 because GDRCOPY
for CUDA 7.5 is not installed on the cluster.
osu_put_latency passes, but the results look unreasonable when
MV2_CUDA_IPC=1.
When MV2_CUDA_IPC=1:
("mpirun_rsh -np 2 -hostfile $PBS_NODEFILE MV2_NUM_PORTS=2
MV2_USE_CUDA=1 MV2_CUDA_IPC=1 MV2_USE_GPUDIRECT_GDRCOPY=0
./local_rank.sh osu_put_latency -d cuda -w create -s lock D D")
(local_rank.sh sets LOCAL_RANK=$MV2_COMM_WORLD_LOCAL_RANK so each process
can select its GPU)
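For reference, such a wrapper is essentially the following. This is only a sketch of what local_rank.sh typically looks like; the actual script may differ, and the GPU selection itself happens inside the application:

```shell
#!/bin/sh
# Sketch of a local_rank.sh wrapper (the real script may differ):
# export the node-local rank assigned by mpirun_rsh so the benchmark
# can pick its GPU via LOCAL_RANK, then run the actual command line.
export LOCAL_RANK=$MV2_COMM_WORLD_LOCAL_RANK
exec "$@"
```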
###################################################################################################
# OSU MPI_Put-CUDA Latency Test v5.0
# Window creation: MPI_Win_create
# Synchronization: MPI_Win_lock/unlock
# Rank 0 Memory on DEVICE (D) and Rank 1 Memory on DEVICE (D)
# Size Latency (us)
0 0.05
1 3.30
2 3.29
4 3.29
8 3.30
16 3.31
32 3.30
64 3.26
128 3.30
256 3.29
512 3.27
1024 3.34
2048 3.25
4096 3.26
8192 3.46
16384 3.25
32768 3.21
65536 3.18
131072 3.33
262144 3.22
524288 3.14
1048576 3.20
2097152 3.17
4194304 3.21
###################################################################################################
When MV2_CUDA_IPC=0:
###################################################################################################
# OSU MPI_Put-CUDA Latency Test v5.0
# Window creation: MPI_Win_create
# Synchronization: MPI_Win_lock/unlock
# Rank 0 Memory on DEVICE (D) and Rank 1 Memory on DEVICE (D)
# Size Latency (us)
0 0.05
1 4.41
2 4.40
4 4.41
8 4.41
16 4.40
32 4.41
64 4.48
128 4.80
256 5.38
512 6.47
1024 8.94
2048 13.58
4096 21.33
8192 36.63
16384 38.95
32768 55.44
65536 82.53
131072 65.37
262144 94.06
524288 143.40
1048576 252.99
2097152 493.56
4194304 976.52
###################################################################################################
nvidia-smi topo -m
###################################################################################################
GPU0 GPU1 GPU2 GPU3 mlx4_0 CPU Affinity
GPU0 X PHB SOC SOC SOC 0-9
GPU1 PHB X SOC SOC SOC 0-9
GPU2 SOC SOC X PHB PHB 10-19
GPU3 SOC SOC PHB X PHB 10-19
mlx4_0 SOC SOC PHB PHB X
Legend:
X = Self
SOC = Path traverses a socket-level link (e.g. QPI)
PHB = Path traverses a PCIe host bridge
PXB = Path traverses multiple PCIe internal switches
PIX = Path traverses a PCIe internal switch
###################################################################################################
The system configuration is as follows:
######################################
CPU: Intel Xeon E5-2680 v2 x 2 sockets
GPU: NVIDIA K20X x 4
IB: Mellanox ConnectX-3 dual-port QDR
######################################
Best regards,
Akihiro Tabuchi
On 2015-11-04 06:18, Jiri Kraus wrote:
> Hi Akihiro,
>
> can you provide the output of
>
> $ nvidia-smi topo -m
>
> on the machine where this happens?
>
> Thanks
>
> Jiri
>
> Sent from my smartphone. Please excuse autocorrect typos.
>
>
>
> ---- Akihiro Tabuchi wrote ----
>
> Dear MVAPICH developers,
>
> I use MVAPICH2-GDR 2.1 on a GPU cluster which has four GPUs on each node.
> Under the following conditions, MPI_Win_free or MPI_Barrier causes a
> segmentation fault after an MPI_Put to a GPU buffer owned by another MPI
> process on the same node:
> 1. synchronization by MPI_Win_lock and MPI_Win_unlock
> 2. (128*N)KB < (MPI_Put transfer size) <= (128*N+8)KB, (N >= 1)
> 3-a. When MV2_CUDA_IPC=1, there are three or more processes on a node.
> 3-b. When MV2_CUDA_IPC=0, there are two or more processes on a node.
>
> A test program and its backtrace are below.
>
> A test program
> ###################################################################################################
> #include <stdio.h>
> #include <stdlib.h>
> #include <mpi.h>
> #include <cuda_runtime.h>
> #define MAXSIZE (4*1024*1024)
>
> int main(int argc, char **argv){
>     MPI_Init(&argc, &argv);
>
>     if(argc != 2){
>         printf("few arguments\n");
>         return 1;
>     }
>     int size = atoi(argv[1]);
>     if(size > MAXSIZE){
>         printf("too large size\n");
>         return 1;
>     }
>     int rank, nranks;
>     MPI_Comm_rank(MPI_COMM_WORLD, &rank);
>     MPI_Comm_size(MPI_COMM_WORLD, &nranks);
>
>     if(nranks < 2){
>         printf("# of processes must be more than 1\n");
>         return 1;
>     }
>     if(rank == 0){
>         printf("put size=%d\n", size);
>     }
>
>     char *buf;
>     cudaMalloc((void**)&buf, MAXSIZE*sizeof(char));
>     MPI_Win win;
>     MPI_Win_create((void*)buf, MAXSIZE*sizeof(char), sizeof(char),
>                    MPI_INFO_NULL, MPI_COMM_WORLD, &win);
>
>     if(rank == 0){
>         int target_rank = 1;
>         MPI_Win_lock(MPI_LOCK_SHARED, target_rank, 0, win);
>         MPI_Put((void*)buf, size, MPI_BYTE, target_rank, 0, size,
>                 MPI_BYTE, win);
>         MPI_Win_unlock(target_rank, win);
>     }
>
>     //MPI_Barrier(MPI_COMM_WORLD);
>     MPI_Win_free(&win);
>     cudaFree(buf);
>     MPI_Finalize();
>     return 0;
> }
> ###################################################################################################
>
>
> A backtrace when the program was run by
> "mpirun_rsh -np 3 -hostfile $PBS_NODEFILE MV2_NUM_PORTS=2 MV2_USE_CUDA=1
> MV2_CUDA_IPC=1 ./put_test 131073"
> (three processes run on the same node)
> ###################################################################################################
> [tcag-0001:mpi_rank_1][error_sighandler] Caught error: Segmentation
> fault (signal 11)
> [tcag-0001:mpi_rank_1][print_backtrace] 0:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(print_backtrace+0x23)
> [0x2b49628c7753]
> [tcag-0001:mpi_rank_1][print_backtrace] 1:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(error_sighandler+0x5e)
> [0x2b49628c786e]
> [tcag-0001:mpi_rank_1][print_backtrace] 2: /lib64/libc.so.6(+0x326b0)
> [0x2b4962c7b6b0]
> [tcag-0001:mpi_rank_1][print_backtrace] 3:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(dreg_decr_refcount+0x27)
> [0x2b4962888447]
> [tcag-0001:mpi_rank_1][print_backtrace] 4:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(dreg_unregister+0x11)
> [0x2b4962888a61]
> [tcag-0001:mpi_rank_1][print_backtrace] 5:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPIDI_CH3I_MRAILI_self_cq_poll+0x143)
> [0x2b4962895973]
> [tcag-0001:mpi_rank_1][print_backtrace] 6:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPIDI_CH3I_Progress+0x337)
> [0x2b4962866117]
> [tcag-0001:mpi_rank_1][print_backtrace] 7:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPIC_Wait+0x47)
> [0x2b496280bad7]
> [tcag-0001:mpi_rank_1][print_backtrace] 8:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPIC_Recv+0xb7)
> [0x2b496280c737]
> [tcag-0001:mpi_rank_1][print_backtrace] 9:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPIR_Reduce_scatter_block_intra+0x1fc8)
> [0x2b49625e38d8]
> [tcag-0001:mpi_rank_1][print_backtrace] 10:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPIR_Reduce_scatter_block_impl+0x4a)
> [0x2b49625e3d3a]
> [tcag-0001:mpi_rank_1][print_backtrace] 11:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPIDI_Win_free+0x25e)
> [0x2b496283ebfe]
> [tcag-0001:mpi_rank_1][print_backtrace] 12:
> /work/XMPTCA/tabuchi/local/opt/mvapich2/gdr/2.1/cuda7.5/gnu/lib64/libmpi.so.12(MPI_Win_free+0x23a)
> [0x2b49627ec62a]
> [tcag-0001:mpi_rank_1][print_backtrace] 13: ./put_test() [0x400ac8]
> [tcag-0001:mpi_rank_1][print_backtrace] 14:
> /lib64/libc.so.6(__libc_start_main+0xfd) [0x2b4962c67d5d]
> [tcag-0001:mpi_rank_1][print_backtrace] 15: ./put_test() [0x400919]
> [tcag-0001:mpispawn_0][readline] Unexpected End-Of-File on file
> descriptor 6. MPI process died?
> [tcag-0001:mpispawn_0][mtpmi_processops] Error while reading PMI socket.
> MPI process died?
> [tcag-0001:mpispawn_0][child_handler] MPI process (rank: 1, pid: 25550)
> terminated with signal 11 -> abort job
> [tcag-0001:mpirun_rsh][process_mpispawn_connection] mpispawn_0 from node
> tcag-0001 aborted: Error while reading a PMI socket (4)
> ###################################################################################################
>
>
> Do you know the cause of this problem?
>
> Best regards,
> Akihiro Tabuchi
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
--
Akihiro Tabuchi
tabuchi at hpcs.cs.tsukuba.ac.jp