[mvapich-discuss] MVAPICH2 on aarch64
Jonathan Perkins
perkinjo at cse.ohio-state.edu
Fri Dec 18 13:36:55 EST 2015
Thanks for the report. Can you try disabling CMA by
setting MV2_SMP_USE_CMA=0 when you run OMB?
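For example (a sketch; the invocation mirrors the runs in the log below):

```shell
# Disable CMA-based intra-node transfers for MVAPICH2 runs in this shell.
# The benchmark invocation itself (commented out here) mirrors the log below:
#   mpirun -np 2 ./osu_bw
export MV2_SMP_USE_CMA=0
```

The variable can also be set inline for a single run, e.g. `MV2_SMP_USE_CMA=0 mpirun -np 2 ./osu_bw`.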
We'll be happy to take in the provided patch as well. Thanks again.
On Thu, Dec 17, 2015 at 11:34 PM Sreenidhi Bharathkar Ramesh <
sreenira at broadcom.com> wrote:
> Hello,
>
> I am trying MVAPICH2 on aarch64, specifically a Juno evaluation board,
> and am observing run-time errors. Please note that this is a single node.
>
> Also, please note that the same procedure works flawlessly on an x86 platform.
>
> 1. To compile, I had to apply the following patch:
>
> src/mpid/ch3/channels/common/include/mv2_clock.h
> +#elif defined(__aarch64__)
> +typedef unsigned long cycles_t;
> +static inline cycles_t get_cycles()
> +{
> + cycles_t ret;
> +
> + asm volatile ("isb" : : : "memory");
> + asm volatile ("mrs %0, cntvct_el0" : "=r" (ret));
> +
> + return ret;
> +}
>
> 2. While executing the OSU benchmarks, one of the following errors was seen
> every time:
>
> a> test hangs
> b> seg fault observed
>
> 3. Questions:
>
> a> Has MVAPICH2 been tried on the aarch64 platform? Am I missing any code
> delta, apart from the patch in point #1?
> b> What could be the root cause of the error?
>
> Please let me know.
>
> Thanks,
> - Sreenidhi.
>
>
> ----------
>
> Error appendix:
>
> sreenira at berlin:osu_benchmarks$ mpirun --version
> HYDRA build details:
> Version: 3.1.4
> Release Date: Thu Apr 2 17:15:15 EDT 2015
> CC: gcc
> CXX: g++
> F77: gfortran
> F90: gfortran
> Configure options: '--disable-option-checking'
> '--prefix=/home/sreenira/install-mvapich2'
> '--with-ib-libpath=/home/sreenira/install-libibverbs/lib'
> '--with-ib-include=/home/sreenira/install-libibverbs/include'
> '--disable-mcast' '--cache-file=/dev/null' '--srcdir=.' 'CC=gcc' 'CFLAGS=
> -DNDEBUG -DNVALGRIND -O2' 'LDFLAGS=-L/home/sreenira/install-libibverbs/lib
> -L/lib -L/lib -L/lib -Wl,-rpath,/lib -L/lib -Wl,-rpath,/lib
> -L/home/sreenira/install-libibverbs/lib -L/lib -L/lib' 'LIBS=-libverbs -ldl
> -lrt -lm -lpthread ' 'CPPFLAGS=-I/home/sreenira/install-libibverbs/include
> -I/home/sreenira/hpcc/mvapich2-2.1/src/mpl/include
> -I/home/sreenira/hpcc/mvapich2-2.1/src/mpl/include
> -I/home/sreenira/hpcc/mvapich2-2.1/src/openpa/src
> -I/home/sreenira/hpcc/mvapich2-2.1/src/openpa/src -D_REENTRANT
> -I/home/sreenira/hpcc/mvapich2-2.1/src/mpi/romio/include -I/include
> -I/include -I/home/sreenira/install-libibverbs/include -I/include
> -I/include'
> Process Manager: pmi
> Launchers available: ssh rsh fork slurm ll lsf sge
> manual persist
> Topology libraries available: hwloc
> Resource management kernels available: user slurm ll lsf sge pbs
> cobalt
> Checkpointing libraries available:
> Demux engines available: poll select
>
>
> sreenira at berlin:osu_benchmarks$ mpirun -np 2 ./osu_bw
> # OSU MPI Bandwidth Test
> # Size Bandwidth (MB/s)
> 1 0.39
> 2 0.78
> 4 1.58
> 8 3.16
> 16 6.34
> 32 12.61
> 64 24.92
> 128 49.09
> 256 95.43
> 512 179.93
> 1024 323.46
> 2048 560.85
> 4096 898.34
> 8192 1299.09
> [berlin:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
> (signal 11)
>
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = PID 673 RUNNING AT berlin
> = EXIT CODE: 11
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
> (signal 11)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
>
> sreenira at berlin:osu_benchmarks$ mpirun -np 2 ./osu_bw
> # OSU MPI Bandwidth Test
> # Size Bandwidth (MB/s)
> 1 0.39
> 2 0.78
> 4 1.58
> 8 3.16
> 16 6.35
> 32 12.63
> 64 24.91
> 128 49.05
> 256 95.30
> 512 179.89
> 1024 328.76
> 2048 569.84
> 4096 911.26
> 8192 1326.21
> [berlin:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
> (signal 11)
> [cli_1]: aborting job:
> Fatal error in PMPI_Waitall:
> Other MPI error, error stack:
> PMPI_Waitall(323).................: MPI_Waitall(count=64,
> req_array=0x822100, status_array=0xc330a0) failed
> MPIR_Waitall_impl(166)............:
> _MPIDI_CH3I_Progress(214).........:
> MPIDI_CH3I_SMP_read_progress(1110):
> MPIDI_CH3I_SMP_readv_rndv(4550)...: CMA: (MPIDI_CH3I_SMP_readv_rndv)
> process_vm_readv fail
>
>
>
> ===================================================================================
> = BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> = PID 993 RUNNING AT berlin
> = EXIT CODE: 11
> = CLEANING UP REMAINING PROCESSES
> = YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
> (signal 11)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>