[mvapich-discuss] MVAPICH2 on aarch64

Jonathan Perkins perkinjo at cse.ohio-state.edu
Fri Dec 18 13:36:55 EST 2015


Thanks for the report.  Can you try disabling CMA by
setting MV2_SMP_USE_CMA=0 when you run OMB?
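For example, mirroring your osu_bw run below (Hydra propagates the caller's
environment to the MPI processes by default, so exporting the variable first
also works):

MV2_SMP_USE_CMA=0 mpirun -np 2 ./osu_bw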

We'll be happy to take in the provided patch as well.  Thanks again.
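For context, CMA (Cross Memory Attach) is built on the Linux
process_vm_readv/process_vm_writev syscalls, which is what the
"process_vm_readv fail" line in your error stack below refers to. A minimal
standalone check that the syscall works on your Juno kernel might look like
the sketch below (illustrative only, not MVAPICH2 code; requires Linux >= 3.2
and glibc >= 2.15, and the file name cma_check.c is made up):

/* cma_check.c - sanity check for Cross Memory Attach.
 * Reads a buffer out of a child process with process_vm_readv().
 * Build: gcc -O2 -o cma_check cma_check.c
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>

int main(void)
{
    static char src[64] = "hello from the child";
    pid_t pid = fork();
    if (pid < 0) { perror("fork"); return 1; }

    if (pid == 0) {            /* child: keep its copy of src alive briefly */
        sleep(2);
        _exit(0);
    }

    char dst[64] = { 0 };
    struct iovec local  = { .iov_base = dst, .iov_len = sizeof dst };
    struct iovec remote = { .iov_base = src, .iov_len = sizeof src };

    /* fork() duplicates the address space, so &src is a valid address
     * in the child as well. */
    ssize_t n = process_vm_readv(pid, &local, 1, &remote, 1, 0);
    if (n < 0)
        perror("process_vm_readv");   /* ENOSYS: no kernel CMA support */
    else
        printf("read %zd bytes: \"%s\"\n", n, dst);

    waitpid(pid, NULL, 0);
    return n < 0;
}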

On Thu, Dec 17, 2015 at 11:34 PM Sreenidhi Bharathkar Ramesh <
sreenira at broadcom.com> wrote:

> Hello,
>
> I am trying MVAPICH2 on aarch64, specifically on a Juno evaluation board,
> and am observing run-time errors.  Please note that this is a single node.
>
> Also, please note that the same procedure works flawlessly on x86.
>
> 1. For compilation, I had to apply the following patch:
>
> src/mpid/ch3/channels/common/include/mv2_clock.h
> +#elif defined(__aarch64__)
> +typedef unsigned long cycles_t;
> +static inline cycles_t get_cycles()
> +{
> +    cycles_t ret;
> +
> +    asm volatile ("isb" : : : "memory");
> +    asm volatile ("mrs %0, cntvct_el0" : "=r" (ret));
> +
> +    return ret;
> +}
>
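For reference, cntvct_el0 read in the patch above is the ARMv8 generic-timer
virtual counter, and the leading isb keeps the read in program order; the
counter's frequency in Hz is available in cntfrq_el0. A small standalone
sketch (illustrative only, aarch64 only, and assuming the kernel grants EL0
access to these registers, as Linux does by default) showing how the returned
cycles_t values convert to wall-clock time:

/* timer_demo.c - convert cntvct_el0 ticks to elapsed time.
 * Build on aarch64: gcc -O2 -o timer_demo timer_demo.c
 */
#include <stdio.h>

typedef unsigned long cycles_t;

static inline cycles_t get_cycles(void)
{
    cycles_t ret;
    asm volatile ("isb" : : : "memory");   /* order the counter read */
    asm volatile ("mrs %0, cntvct_el0" : "=r" (ret));
    return ret;
}

static inline unsigned long get_timer_freq(void)
{
    unsigned long freq;
    asm volatile ("mrs %0, cntfrq_el0" : "=r" (freq));
    return freq;
}

int main(void)
{
    cycles_t t0 = get_cycles();
    /* ... code under test ... */
    cycles_t t1 = get_cycles();
    printf("elapsed: %.3f us\n",
           (double)(t1 - t0) * 1e6 / (double)get_timer_freq());
    return 0;
}
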
> 2. While executing the OSU micro-benchmarks (OMB), one of the following
> errors was seen every time:
>
> a> the test hangs
> b> a segmentation fault is observed
>
> 3. Questions:
>
> a> Has MVAPICH2 been tried on the aarch64 platform?  Am I missing any
> code change, apart from the patch in point #1 above?
> b> What could be the root cause of the error?
>
> Please let me know.
>
> Thanks,
> - Sreenidhi.
>
>
> ----------
>
> Error appendix:
>
> sreenira at berlin:osu_benchmarks$ mpirun --version
> HYDRA build details:
>     Version:                                 3.1.4
>     Release Date:                            Thu Apr  2 17:15:15 EDT 2015
>     CC:                              gcc
>     CXX:                             g++
>     F77:                             gfortran
>     F90:                             gfortran
>     Configure options:                       '--disable-option-checking'
> '--prefix=/home/sreenira/install-mvapich2'
> '--with-ib-libpath=/home/sreenira/install-libibverbs/lib'
> '--with-ib-include=/home/sreenira/install-libibverbs/include'
> '--disable-mcast' '--cache-file=/dev/null' '--srcdir=.' 'CC=gcc' 'CFLAGS=
> -DNDEBUG -DNVALGRIND -O2' 'LDFLAGS=-L/home/sreenira/install-libibverbs/lib
> -L/lib -L/lib -L/lib -Wl,-rpath,/lib -L/lib -Wl,-rpath,/lib
> -L/home/sreenira/install-libibverbs/lib -L/lib -L/lib' 'LIBS=-libverbs -ldl
> -lrt -lm -lpthread ' 'CPPFLAGS=-I/home/sreenira/install-libibverbs/include
> -I/home/sreenira/hpcc/mvapich2-2.1/src/mpl/include
> -I/home/sreenira/hpcc/mvapich2-2.1/src/mpl/include
> -I/home/sreenira/hpcc/mvapich2-2.1/src/openpa/src
> -I/home/sreenira/hpcc/mvapich2-2.1/src/openpa/src -D_REENTRANT
> -I/home/sreenira/hpcc/mvapich2-2.1/src/mpi/romio/include -I/include
> -I/include -I/home/sreenira/install-libibverbs/include -I/include
> -I/include'
>     Process Manager:                         pmi
>     Launchers available:                     ssh rsh fork slurm ll lsf sge
> manual persist
>     Topology libraries available:            hwloc
>     Resource management kernels available:   user slurm ll lsf sge pbs
> cobalt
>     Checkpointing libraries available:
>     Demux engines available:                 poll select
>
>
> sreenira at berlin:osu_benchmarks$ mpirun -np 2 ./osu_bw
> # OSU MPI Bandwidth Test
> # Size        Bandwidth (MB/s)
> 1                         0.39
> 2                         0.78
> 4                         1.58
> 8                         3.16
> 16                        6.34
> 32                       12.61
> 64                       24.92
> 128                      49.09
> 256                      95.43
> 512                     179.93
> 1024                    323.46
> 2048                    560.85
> 4096                    898.34
> 8192                   1299.09
> [berlin:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
> (signal 11)
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 673 RUNNING AT berlin
> =   EXIT CODE: 11
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
> (signal 11)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
>
> sreenira at berlin:osu_benchmarks$ mpirun -np 2 ./osu_bw
> # OSU MPI Bandwidth Test
> # Size        Bandwidth (MB/s)
> 1                         0.39
> 2                         0.78
> 4                         1.58
> 8                         3.16
> 16                        6.35
> 32                       12.63
> 64                       24.91
> 128                      49.05
> 256                      95.30
> 512                     179.89
> 1024                    328.76
> 2048                    569.84
> 4096                    911.26
> 8192                   1326.21
> [berlin:mpi_rank_0][error_sighandler] Caught error: Segmentation fault
> (signal 11)
> [cli_1]: aborting job:
> Fatal error in PMPI_Waitall:
> Other MPI error, error stack:
> PMPI_Waitall(323).................: MPI_Waitall(count=64,
> req_array=0x822100, status_array=0xc330a0) failed
> MPIR_Waitall_impl(166)............:
> _MPIDI_CH3I_Progress(214).........:
> MPIDI_CH3I_SMP_read_progress(1110):
> MPIDI_CH3I_SMP_readv_rndv(4550)...: CMA: (MPIDI_CH3I_SMP_readv_rndv)
> process_vm_readv fail
>
>
>
> ===================================================================================
> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
> =   PID 993 RUNNING AT berlin
> =   EXIT CODE: 11
> =   CLEANING UP REMAINING PROCESSES
> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>
> ===================================================================================
> YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault
> (signal 11)
> This typically refers to a problem with your application.
> Please see the FAQ page for debugging suggestions
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>