From panda at cse.ohio-state.edu  Tue Mar 14 10:41:00 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Tue, 14 Mar 2023 14:41:00 +0000
Subject: [Hidl-discuss] Join the HiDL team for multiple presentations at
 NVIDIA GTC March 2023
In-Reply-To: <CO1PR01MB72425DBBBE55CDC2450CB23DDABE9@CO1PR01MB7242.prod.exchangelabs.com>
References: <CO1PR01MB72425DBBBE55CDC2450CB23DDABE9@CO1PR01MB7242.prod.exchangelabs.com>
Message-ID: <CO1PR01MB7242140C47B02410D7CD58B8DABE9@CO1PR01MB7242.prod.exchangelabs.com>

Members of the HiDL team will give several presentations at the virtual NVIDIA GTC March 2023 event.

More details of these presentations are provided at: http://hidl.cse.ohio-state.edu/conference/944/talks/

Join us for these presentations and interact with the project team members!!

Thanks,

The HiDL Team

From panda at cse.ohio-state.edu  Sat Mar 18 10:36:17 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Sat, 18 Mar 2023 14:36:17 +0000
Subject: [Hidl-discuss] Announcing the release of High-Performance Deep
 Learning (HiDL) 1.0 stack
Message-ID: <CO1PR01MB7242FCC3755F22A8697A7CA2DA829@CO1PR01MB7242.prod.exchangelabs.com>

The High-Performance Deep Learning (HiDL) team is pleased to announce
the 1.0 release of HiDL, a high-performance deep learning stack based
on the MVAPICH2 high-performance CUDA-aware communication backend.
HiDL uses Horovod over the MVAPICH2 and MVAPICH2-GDR backends to
support large-scale distributed deep learning workloads and targets
modern HPC clusters built with CPUs, dense GPUs and high-performance
interconnects.

The 1.0 release of the HiDL stack introduces the following features:

* HiDL 1.0:

  - Based on Horovod
  - Full support for TensorFlow, PyTorch, Keras and Apache MXNet
  - Optimized support for MPI controller in deep learning workloads
  - Efficient large-message collectives (e.g. Allreduce) on various
    CPUs and GPUs
  - GPU-Direct algorithms for collective operations (including those commonly
    used for data- and model-parallelism, e.g. Allgather and Alltoall)
  - Support for fork safety
  - Exploits efficient large-message collectives in MVAPICH2 and MVAPICH2-GDR
  - Compatible with
    - MVAPICH2 2.3.7, MVAPICH2-GDR 2.3.7
    - Mellanox InfiniBand adapters (EDR, FDR, HDR)
    - Various x86-based multi-core CPUs (AMD and Intel)
    - NVIDIA A100, V100, P100, Quadro RTX 5000 GPUs
    - CUDA [9.x, 10.x, 11.x] and cuDNN [7.5.x, 7.6.x, 8.0.x, 8.2.x, 8.4.x]
    - AMD MI100 GPUs
    - ROCm [5.1.x]
    - TensorFlow [1.x, 2.x], PyTorch 1.x, Apache MXNet 1.x
    - Horovod [0.24.0, 0.25.0, 0.26.0, 0.27.0]
    - Python [3.x]
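
As a rough illustration of how to read the compatibility list above, the
following Python sketch checks a locally installed version against the
listed patterns, treating a trailing "x" as a wildcard. It is not part of
HiDL: the SUPPORTED table, its lowercase keys, and the helper names
matches() and is_supported() are hypothetical and only transcribe the
list in this announcement.

```python
# Supported version patterns, transcribed from the compatibility list in
# this announcement. The dictionary keys are illustrative names, not HiDL
# identifiers. A trailing "x" stands for any value in that position.
SUPPORTED = {
    "mvapich2": ["2.3.7"],
    "mvapich2-gdr": ["2.3.7"],
    "cuda": ["9.x", "10.x", "11.x"],
    "cudnn": ["7.5.x", "7.6.x", "8.0.x", "8.2.x", "8.4.x"],
    "rocm": ["5.1.x"],
    "tensorflow": ["1.x", "2.x"],
    "pytorch": ["1.x"],
    "mxnet": ["1.x"],
    "horovod": ["0.24.0", "0.25.0", "0.26.0", "0.27.0"],
    "python": ["3.x"],
}


def matches(version, pattern):
    """Return True if `version` fits `pattern` (trailing 'x' is a wildcard)."""
    v_parts = version.split(".")
    p_parts = pattern.split(".")
    if p_parts[-1] == "x":
        # Only the components before the wildcard have to agree.
        prefix = p_parts[:-1]
        return v_parts[:len(prefix)] == prefix
    # No wildcard: every component has to agree exactly.
    return v_parts == p_parts


def is_supported(component, version):
    """Check one component/version pair against the SUPPORTED table."""
    patterns = SUPPORTED.get(component.lower())
    if patterns is None:
        raise KeyError(f"unknown component: {component}")
    return any(matches(version, p) for p in patterns)


if __name__ == "__main__":
    for component, version in [("cuda", "11.8"), ("cudnn", "8.1.0"),
                               ("horovod", "0.27.0")]:
        status = "listed" if is_supported(component, version) else "not listed"
        print(f"{component} {version}: {status}")
```

A version reported as "not listed" here only means it does not appear in
this announcement; the user guide remains the authoritative source for
what is supported.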

For setting up the HiDL stack and the associated user guide, please
visit the following URL:

http://hidl.cse.ohio-state.edu

Sample performance numbers for HiDL using Horovod synthetic benchmarks
can be viewed by visiting the `Performance' tab of the above website.

All questions, feedback, and bug reports are welcome. Please post to
hidl-discuss at lists.osu.edu.

Thanks,

The High-Performance Deep Learning (HiDL) Team
http://hidl.cse.ohio-state.edu