From panda at cse.ohio-state.edu  Tue Mar 14 10:41:00 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Tue, 14 Mar 2023 14:41:00 +0000
Subject: [Hidl-discuss] Join the HiDL team for multiple presentations at
 NVIDIA GTC March 2023
In-Reply-To:
References:
Message-ID:

The HiDL team members will be giving several presentations during the
NVIDIA GTC March 2023 event (virtual). More details of these
presentations are provided at:

http://hidl.cse.ohio-state.edu/conference/944/talks/

Join us for these presentations and interact with the project team
members!!

Thanks,

The HiDL Team

From panda at cse.ohio-state.edu  Sat Mar 18 10:36:17 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Sat, 18 Mar 2023 14:36:17 +0000
Subject: [Hidl-discuss] Announcing the release of High-Performance Deep
 Learning (HiDL) 1.0 stack
Message-ID:

The High-Performance Deep Learning (HiDL) team is pleased to announce
the 1.0 release of HiDL, a high-performance deep learning stack based on
the MVAPICH2 high-performance CUDA-aware communication backend. HiDL
uses Horovod over the MVAPICH2 and MVAPICH2-GDR backends to support
large-scale distributed deep learning workloads and targets modern HPC
clusters built with CPUs, dense GPUs and high-performance interconnects.

The 1.0 release of the HiDL stack introduces the following features:

* HiDL 1.0:
    - Based on Horovod
    - Full support for TensorFlow, PyTorch, Keras and Apache MXNet
    - Optimized support for MPI controller in deep learning workloads
    - Efficient large-message collectives (e.g. Allreduce) on various
      CPUs and GPUs
    - GPU-Direct algorithms for collective operations (including those
      commonly used for data- and model-parallelism, e.g. Allgather and
      Alltoall)
    - Support for fork safety
    - Exploits efficient large-message collectives in MVAPICH2 and
      MVAPICH2-GDR
    - Compatible with
        - MVAPICH2 2.3.7, MVAPICH2-GDR 2.3.7
        - Mellanox InfiniBand adapters (EDR, FDR, HDR)
        - Various x86-based multi-core CPUs (AMD and Intel)
        - NVIDIA A100, V100, P100, Quadro RTX 5000 GPUs
            - CUDA [9.x, 10.x, 11.x] and cuDNN [7.5.x, 7.6.x, 8.0.x,
              8.2.x, 8.4.x]
        - AMD MI100 GPUs
            - ROCm [5.1.x]
        - TensorFlow [1.x, 2.x], PyTorch 1.x, Apache MXNet 1.x
        - Horovod [0.24.0, 0.25.0, 0.26.0, 0.27.0]
        - Python [3.x]

For setting up the HiDL stack and the associated user guide, please
visit the following URL:

http://hidl.cse.ohio-state.edu

Sample performance numbers for HiDL using Horovod synthetic benchmarks
can be viewed by visiting the `Performance' tab of the above website.

All questions, feedback, and bug reports are welcome. Please post to
hidl-discuss at lists.osu.edu.

Thanks,

The High-Performance Deep Learning (HiDL) Team
http://hidl.cse.ohio-state.edu
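
For readers unfamiliar with the collectives named in the feature list
(Allreduce, Allgather, Alltoall), the following is a minimal sketch of
what each operation computes. It is plain Python that models each rank's
buffer as a list; it does not use MPI, Horovod, or HiDL, and the
function names are hypothetical, chosen only for illustration.

```python
# Illustrative model of three MPI-style collectives.
# Assumption: each inner list stands for the buffer held by one rank.
# Nothing here calls MPI, Horovod, or HiDL.

def allreduce_sum(buffers):
    """Every rank ends up with the element-wise sum of all ranks' buffers."""
    total = [sum(values) for values in zip(*buffers)]
    return [list(total) for _ in buffers]

def allgather(buffers):
    """Every rank ends up with all ranks' buffers concatenated in rank order."""
    gathered = [x for buf in buffers for x in buf]
    return [list(gathered) for _ in buffers]

def alltoall(buffers):
    """Rank i sends its j-th element to rank j.

    Assumes each buffer holds exactly one element per rank, so the
    result is the transpose of the rank-by-slot grid.
    """
    return [list(column) for column in zip(*buffers)]

if __name__ == "__main__":
    # Two ranks, each holding a local "gradient" of two values.
    grads = [[1.0, 2.0], [3.0, 4.0]]
    summed = allreduce_sum(grads)
    # Data-parallel training typically divides the sum by the rank count.
    averaged = [[g / len(grads) for g in buf] for buf in summed]
    print(averaged)
```

In data-parallel training the Allreduce step is what keeps every worker's
gradients identical after each iteration; Allgather and Alltoall play the
analogous role when tensors or activations are partitioned across ranks.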