[HiBD] Announcing the release of RDMA-Apache-Spark 0.9.5

Panda, Dhabaleswar panda at cse.ohio-state.edu
Tue Feb 20 01:17:22 EST 2018


The High-Performance Big Data (HiBD) team is pleased to announce the
release of RDMA-Apache-Spark 0.9.5 with the following features.

* RDMA-Apache-Spark 0.9.5 (New features and enhancements compared to
  0.9.4 are marked as (NEW)):

    - Based on Apache Spark 2.1.0
    - (NEW) Built with Apache Hadoop 2.8.0
    - (NEW) Initial support for POWER architecture
    - (NEW) Performance optimization and tuning on OpenPOWER clusters
    - (NEW) Support for various communication thread binding policies
    - (NEW) Support for RDMA Device Selection
    - High-performance design with native InfiniBand and RoCE
      support at the verbs-level for Spark
        - RDMA-based data shuffle
        - SEDA-based shuffle architecture
        - Support pre-connection, on-demand connection and
          connection sharing
        - Non-blocking and chunk-based data transfer
        - Off-JVM-heap buffer management
    - Compliant with Apache Spark 2.1.0 APIs and applications
    - RDMA support for Spark SQL
    - Integration with HHH in RDMA-Hadoop
    - Easily configurable for native InfiniBand, RoCE, and
      traditional sockets-based support (Ethernet and InfiniBand
      with IPoIB)
    - Tested with
        - (NEW) Various multi-core platforms (e.g., x86, POWER)
        - (NEW) OpenJDK and IBM JDK
        - (NEW) Mellanox InfiniBand adapters (DDR, QDR, FDR, and EDR)
        - RoCE support with Mellanox adapters
        - RAM Disks, SSDs, and HDDs

* Bug Fixes (compared to RDMA-Apache-Spark 0.9.4) are:
    - Fix an issue for running spark-shell with RDMA-Hadoop 1.3.0
        - Thanks to John Garbutt at StackHPC for reporting the issue
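
As a rough illustration of two items in the feature list above
("chunk-based data transfer" and "off-JVM-heap buffer management"), the
sketch below splits a payload into fixed-size chunks held in direct
(off-heap) buffers and reassembles them. It uses only the standard Java
NIO API. The class name, method names, and chunk size are hypothetical
and are not taken from RDMA-Apache-Spark; no RDMA verbs or non-blocking
I/O are involved, so this shows the general idea only, not the
package's implementation.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: chunking a payload into off-heap buffers.
public class ChunkedOffHeapSketch {

    // Copy the payload into chunks of at most chunkSize bytes, each
    // stored in a direct buffer allocated outside the JVM heap.
    static List<ByteBuffer> toChunks(byte[] payload, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<ByteBuffer> chunks = new ArrayList<>();
        for (int off = 0; off < payload.length; off += chunkSize) {
            int len = Math.min(chunkSize, payload.length - off);
            ByteBuffer buf = ByteBuffer.allocateDirect(len);
            buf.put(payload, off, len);
            buf.flip(); // switch the buffer from writing to reading
            chunks.add(buf);
        }
        return chunks;
    }

    // Concatenate the readable bytes of every chunk, in order.
    static byte[] reassemble(List<ByteBuffer> chunks) {
        int total = 0;
        for (ByteBuffer c : chunks) {
            total += c.remaining();
        }
        byte[] out = new byte[total];
        int off = 0;
        for (ByteBuffer c : chunks) {
            int len = c.remaining();
            // duplicate() leaves the original chunk's position untouched
            c.duplicate().get(out, off, len);
            off += len;
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] payload = "shuffle-block".getBytes(StandardCharsets.US_ASCII);
        List<ByteBuffer> chunks = toChunks(payload, 4); // 4 is an arbitrary size
        System.out.println("chunks: " + chunks.size());
        System.out.println("direct: " + chunks.get(0).isDirect());
        System.out.println("intact: "
                + Arrays.equals(payload, reassemble(chunks)));
    }
}
```

Keeping transfer buffers off the JVM heap is a common way to hand
stable memory to native I/O layers without garbage-collector
interference; how RDMA-Apache-Spark actually manages its buffers is
described in its user guide, not here.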

To download the RDMA-Apache-Spark 0.9.5 package and the associated user
guide, please visit the following URL:

http://hibd.cse.ohio-state.edu

Sample benchmark performance numbers for RDMA-Apache-Spark can be
viewed under the `Performance' tab of the above website.

All questions, feedback and bug reports are welcome. Please post to
the rdma-spark-discuss mailing list (rdma-spark-discuss at
cse.ohio-state.edu).

Thanks,

The High-Performance Big Data (HiBD) Team
http://hibd.cse.ohio-state.edu

PS: The number of organizations using the HiBD stacks has crossed 275
(from 34 countries). Similarly, the number of downloads from the HiBD
site has crossed 25,000.  The HiBD team would like to thank all its
users and organizations!!


