[mvapich-discuss] MVAPICH2 on aarch64
Sreenidhi Bharathkar Ramesh
sreenira at broadcom.com
Thu Dec 17 23:34:01 EST 2015
Hello,
I am trying MVAPICH2 on aarch64, specifically on a Juno evaluation board, and am observing run-time errors. Please note that this is a single node.
Also, please note that the same procedure works flawlessly on an x86 platform.
1. For compilation, I had to include the following patch:
src/mpid/ch3/channels/common/include/mv2_clock.h
+#elif defined(__aarch64__)
+typedef unsigned long cycles_t;
+static inline cycles_t get_cycles()
+{
+ cycles_t ret;
+
+ asm volatile ("isb" : : : "memory");
+ asm volatile ("mrs %0, cntvct_el0" : "=r" (ret));
+
+ return ret;
+}
2. While executing the OSU benchmarks, one of the following errors was seen every time:
a> test hangs
b> seg fault observed
3. Questions:
a> Has MVAPICH been tried on the aarch64 platform? Am I missing any code change apart from the patch in point #1?
b> What could be the root cause of these errors?
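For reference, the counter read from the patch in point #1 can be exercised on its own, outside MVAPICH2. The following is only a minimal sketch under stated assumptions, not the MVAPICH2 code: `get_cycles_hz` and `cycles_to_usec` are hypothetical helper names, and the non-aarch64 branch is an assumed fallback so that the same file also builds on x86. Note that `cntvct_el0` ticks at the fixed frequency reported by `cntfrq_el0`, not at the CPU clock rate, so any code that treats the value as CPU cycles would need that frequency to convert to time.

```c
/* Sketch: read the ARMv8 virtual counter and convert ticks to microseconds.
 * On other architectures it falls back to clock_gettime() so the file still builds. */
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t cycles_t;

/* Return the current tick count. */
static inline cycles_t get_cycles(void)
{
#if defined(__aarch64__)
    cycles_t ret;
    /* isb keeps the counter read from being executed ahead of earlier code. */
    __asm__ volatile ("isb" : : : "memory");
    __asm__ volatile ("mrs %0, cntvct_el0" : "=r" (ret));
    return ret;
#else
    /* Assumed fallback (not part of the patch): nanoseconds from a monotonic clock. */
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (cycles_t)ts.tv_sec * 1000000000ull + (cycles_t)ts.tv_nsec;
#endif
}

/* Return the tick frequency in Hz (hypothetical helper). */
static inline uint64_t get_cycles_hz(void)
{
#if defined(__aarch64__)
    uint64_t hz;
    __asm__ volatile ("mrs %0, cntfrq_el0" : "=r" (hz));
    return hz;
#else
    return 1000000000ull; /* the fallback above counts nanoseconds */
#endif
}

/* Convert a tick delta to microseconds for a given tick frequency. */
static inline double cycles_to_usec(cycles_t delta, uint64_t hz)
{
    assert(hz > 0);
    return (double)delta * 1e6 / (double)hz;
}
```

A caller would take two `get_cycles()` readings around the code of interest and pass their difference, together with `get_cycles_hz()`, to `cycles_to_usec()`.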
Please let me know.
Thanks,
- Sreenidhi.
----------
Error appendix:
sreenira@berlin:osu_benchmarks$ mpirun --version
HYDRA build details:
Version: 3.1.4
Release Date: Thu Apr 2 17:15:15 EDT 2015
CC: gcc
CXX: g++
F77: gfortran
F90: gfortran
Configure options: '--disable-option-checking' '--prefix=/home/sreenira/install-mvapich2' '--with-ib-libpath=/home/sreenira/install-libibverbs/lib' '--with-ib-include=/home/sreenira/install-libibverbs/include' '--disable-mcast' '--cache-file=/dev/null' '--srcdir=.' 'CC=gcc' 'CFLAGS= -DNDEBUG -DNVALGRIND -O2' 'LDFLAGS=-L/home/sreenira/install-libibverbs/lib -L/lib -L/lib -L/lib -Wl,-rpath,/lib -L/lib -Wl,-rpath,/lib -L/home/sreenira/install-libibverbs/lib -L/lib -L/lib' 'LIBS=-libverbs -ldl -lrt -lm -lpthread ' 'CPPFLAGS=-I/home/sreenira/install-libibverbs/include -I/home/sreenira/hpcc/mvapich2-2.1/src/mpl/include -I/home/sreenira/hpcc/mvapich2-2.1/src/mpl/include -I/home/sreenira/hpcc/mvapich2-2.1/src/openpa/src -I/home/sreenira/hpcc/mvapich2-2.1/src/openpa/src -D_REENTRANT -I/home/sreenira/hpcc/mvapich2-2.1/src/mpi/romio/include -I/include -I/include -I/home/sreenira/install-libibverbs/include -I/include -I/include'
Process Manager: pmi
Launchers available: ssh rsh fork slurm ll lsf sge manual persist
Topology libraries available: hwloc
Resource management kernels available: user slurm ll lsf sge pbs cobalt
Checkpointing libraries available:
Demux engines available: poll select
sreenira@berlin:osu_benchmarks$ mpirun -np 2 ./osu_bw
# OSU MPI Bandwidth Test
# Size Bandwidth (MB/s)
1 0.39
2 0.78
4 1.58
8 3.16
16 6.34
32 12.61
64 24.92
128 49.09
256 95.43
512 179.93
1024 323.46
2048 560.85
4096 898.34
8192 1299.09
[berlin:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 673 RUNNING AT berlin
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
sreenira@berlin:osu_benchmarks$ mpirun -np 2 ./osu_bw
# OSU MPI Bandwidth Test
# Size Bandwidth (MB/s)
1 0.39
2 0.78
4 1.58
8 3.16
16 6.35
32 12.63
64 24.91
128 49.05
256 95.30
512 179.89
1024 328.76
2048 569.84
4096 911.26
8192 1326.21
[berlin:mpi_rank_0][error_sighandler] Caught error: Segmentation fault (signal 11)
[cli_1]: aborting job:
Fatal error in PMPI_Waitall:
Other MPI error, error stack:
PMPI_Waitall(323).................: MPI_Waitall(count=64, req_array=0x822100, status_array=0xc330a0) failed
MPIR_Waitall_impl(166)............:
_MPIDI_CH3I_Progress(214).........:
MPIDI_CH3I_SMP_read_progress(1110):
MPIDI_CH3I_SMP_readv_rndv(4550)...: CMA: (MPIDI_CH3I_SMP_readv_rndv) process_vm_readv fail
===================================================================================
= BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
= PID 993 RUNNING AT berlin
= EXIT CODE: 11
= CLEANING UP REMAINING PROCESSES
= YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
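Both runs above stop after the 8192-byte line, and the error stack of the second run ends in `process_vm_readv`, the system call used for CMA. One way to narrow this down might be to check whether that call works at all on the board, independent of MVAPICH2. The following is a minimal sketch under that assumption; `cma_self_copy` is a hypothetical helper name, and it only copies within a single process, so it cannot reproduce a failure that depends on two ranks.

```c
/* Sketch: check whether process_vm_readv(), the syscall behind CMA, works here. */
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Copy len bytes from src to dst within this process using process_vm_readv().
 * Returns 0 on success, otherwise the errno value of the failed call
 * (EIO is returned for a short read). */
static int cma_self_copy(void *dst, const void *src, size_t len)
{
    struct iovec local  = { .iov_base = dst,         .iov_len = len };
    struct iovec remote = { .iov_base = (void *)src, .iov_len = len };

    assert(dst != NULL && src != NULL);

    /* The "remote" process is this process itself, so no extra privileges
     * beyond access to our own address space are involved. */
    ssize_t n = process_vm_readv(getpid(), &local, 1, &remote, 1, 0);
    if (n < 0)
        return errno;
    return ((size_t)n == len) ? 0 : EIO;
}
```

Calling `cma_self_copy()` on two 16 KiB buffers and printing `strerror()` of a non-zero result would show whether the kernel rejects the call outright (for example with `ENOSYS`) or whether the problem only appears in the two-process case.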