From panda at cse.ohio-state.edu  Wed Nov  8 12:01:19 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Wed, 8 Nov 2023 17:01:19 +0000
Subject: [Hidl-discuss] Join the HiDL team for multiple events at SC '23
In-Reply-To:
References:
Message-ID:

The HiDL team members will be participating in multiple events during the Supercomputing '23 (SC '23) conference.

The Ohio State University (OSU) booth (#1680) will also feature leading speakers from academia (Case Western Reserve University, KAUST-Saudi Arabia, and Univ. of Oregon), national laboratories/centers (ETRI-South Korea, Idaho National Lab, Ohio Supercomputer Center, and San Diego Supercomputer Center), and industry (Broadcom, C-DAC-India, Dell, ParaTools, and X-ScaleSolutions)!!

Join us for these events and talk in person with the project team members and the invited speakers!!

More details of the events are provided at:

http://mvapich.cse.ohio-state.edu/conference/964/talks/

Alternatively, you can use the attached QR code (sc23-qr-code.png) to view the event details.

Pick up a free T-shirt at the OSU booth after attending the events!

Thanks,

The HiDL Team

From panda at cse.ohio-state.edu  Thu Nov  9 14:10:10 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Thu, 9 Nov 2023 19:10:10 +0000
Subject: [Hidl-discuss] Announcing the release of MPI4DL 0.6
Message-ID:

The High-Performance Deep Learning (HiDL) team is pleased to announce the release of MPI4DL 0.6, a distributed and accelerated training framework for very high-resolution images that integrates Spatial Parallelism, Layer Parallelism, Bidirectional Parallelism, and Pipeline Parallelism, with support for the MVAPICH2-GDR high-performance CUDA-aware communication backend.

This library allows MPI-driven converged software infrastructure to extract maximum performance and scalability for AI, Big Data, and Data Science applications and workflows on modern heterogeneous clusters consisting of diverse CPUs, GPUs, and interconnects (InfiniBand, RoCE, Omni-Path, iWARP, and Slingshot).

The new features available with this release of the MPI4DL package are as follows:

* MPI4DL 0.6:
    * Based on PyTorch
    * (NEW) Support for training very high-resolution images
    * Distributed training support for:
        * Layer Parallelism (LP)
        * Pipeline Parallelism (PP)
        * Spatial Parallelism (SP)
        * Spatial and Layer Parallelism (SP+LP)
        * Spatial and Pipeline Parallelism (SP+PP)
        * (NEW) Bidirectional and Layer Parallelism (GEMS+LP)
        * (NEW) Bidirectional and Pipeline Parallelism (GEMS+PP)
        * (NEW) Spatial, Bidirectional, and Layer Parallelism (SP+GEMS+LP)
        * (NEW) Spatial, Bidirectional, and Pipeline Parallelism (SP+GEMS+PP)
    * (NEW) Support for AmoebaNet and ResNet models
    * (NEW) Support for different image sizes and custom datasets
    * Exploits collective features of MVAPICH2-GDR
    * Compatible with:
        * NVIDIA A100 and V100 GPUs
        * CUDA [11.6, 11.7]
        * Python >= 3.8
        * PyTorch [1.12.1, 1.13.1]
        * MVAPICH2-GDR = 2.3.7

The MPI4DL package is open-source and hosted at the following URL:

https://github.com/OSU-Nowlab/MPI4DL

For associated release information, please visit the following URL:

http://hidl.cse.ohio-state.edu

Sample performance numbers for MPI4DL using deep learning application benchmarks can be viewed by visiting the `Performance' tab of the above website.
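
To give a flavor of what Spatial Parallelism (SP) means for very high-resolution images, the short Python sketch below splits one image into a 2x2 grid of tiles, applies a convolution to each tile, and reassembles the partial feature maps. This is a simplified, single-process illustration of the idea only and is not the MPI4DL API; in MPI4DL each tile is placed on a separate GPU/MPI rank and boundary (halo) regions are exchanged through MVAPICH2-GDR so tile-edge outputs match a full-image convolution.

# Toy sketch of the spatial-parallelism idea (not the MPI4DL API).
# A high-resolution image is split into tiles so that no single GPU
# has to hold the full activation maps of the early layers.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)

# One "very high-resolution" input: (batch, channels, height, width).
image = torch.randn(1, 3, 2048, 2048)

# Partition the image into a 2x2 grid of spatial tiles.
rows = torch.chunk(image, 2, dim=2)                          # split height
tiles = [t for r in rows for t in torch.chunk(r, 2, dim=3)]  # split width

# Each tile would live on its own GPU/MPI rank; here we loop serially.
partial = [conv(t) for t in tiles]

# Reassemble the 2x2 grid of partial feature maps.
top = torch.cat(partial[0:2], dim=3)
bottom = torch.cat(partial[2:4], dim=3)
features = torch.cat([top, bottom], dim=2)

print(features.shape)  # torch.Size([1, 16, 2048, 2048])
# Note: because each tile is convolved independently, values near tile
# borders differ from a full-image convolution; real spatial parallelism
# exchanges halo rows/columns between neighboring ranks to fix this.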

All questions, feedback, and bug reports are welcome. Please post them to hidl-discuss at lists.osu.edu.

Thanks,

The High-Performance Deep Learning (HiDL) Team
http://hidl.cse.ohio-state.edu

PS: The number of organizations using the HiDL stacks has crossed 88 (from 21 countries). The HiDL team would like to thank all its users and organizations!!

From panda at cse.ohio-state.edu  Thu Nov  9 17:42:39 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Thu, 9 Nov 2023 22:42:39 +0000
Subject: [Hidl-discuss] Announcing the release of ParaInfer-X v1.0 for High-Performance Parallel Inference
Message-ID:

The High-Performance Deep Learning (HiDL) team is pleased to announce the release of ParaInfer-X v1.0, a collection of parallel inference techniques that facilitate the deployment of emerging AI models on edge devices and HPC clusters.

This package leverages highly performant GPU kernels that maximize computational throughput, intelligent scheduling strategies that keep the load balanced across resources, and distributed communication libraries that enable seamless data exchange and coordination among distributed systems for large-scale inference.

ParaInfer-X v1.0 introduces a temporal fusion framework, named Flover, that batches multiple requests on the fly during LLM generation (a technique also known as temporal fusion or in-flight batching).

The new features available with this release of the ParaInfer-X package are as follows:

* Based on FasterTransformer
* (NEW) Support for inference of various large language models:
    * (NEW) GPT-J 6B
    * (NEW) LLaMA 7B
    * (NEW) LLaMA 13B
    * (NEW) LLaMA 33B
    * (NEW) LLaMA 65B
* (NEW) Support for persistent model inference streams
* (NEW) Support for temporal fusion/in-flight batching of multiple requests
* (NEW) Support for multi-GPU tensor parallelism
* (NEW) Support for asynchronous memory reordering for evicting finished requests
* (NEW) Support for float32, float16, and bfloat16 model inference
* Compatible with:
    * (NEW) NVIDIA A100 and V100 GPUs
    * (NEW) CUDA [11.2, 11.3, 11.4, 11.6]
    * (NEW) GCC >= 8.5.0
    * (NEW) CMake >= 3.18
    * (NEW) Intel oneTBB >= v2020.0
* (NEW) Customized CUDA kernels
* (NEW) Support for visualization output of inference progress

The ParaInfer-X package is open-source and hosted at the following URL:

https://github.com/OSU-Nowlab/Flover

For associated release information, please visit the following URL:

http://hidl.cse.ohio-state.edu

Sample performance numbers for ParaInfer-X using inference benchmarks can be viewed by visiting the `Performance' tab of the above website.

All questions, feedback, and bug reports are welcome. Please post them to hidl-discuss at lists.osu.edu.

Thanks,

The High-Performance Deep Learning (HiDL) Team
http://hidl.cse.ohio-state.edu

PS: The number of organizations using the HiDL stacks has crossed 88 (from 21 countries). The HiDL team would like to thank all its users and organizations!!
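
For readers curious how the temporal fusion/in-flight batching idea behind Flover works, the toy Python sketch below runs one persistent decode loop that serves many requests at once, admits newly arrived requests between decode steps, and evicts finished ones immediately. It is a conceptual illustration only, not the Flover/ParaInfer-X implementation; the decode_step function is a stand-in for a batched LLM decoder forward pass, and all names are hypothetical.

# Toy sketch of temporal fusion / in-flight batching (not the Flover code).
from collections import deque
from dataclasses import dataclass, field
import random

@dataclass
class Request:
    rid: int                 # request id
    max_new_tokens: int      # this request is done after this many tokens
    tokens: list = field(default_factory=list)

def decode_step(batch):
    # Stand-in for ONE batched forward pass of an LLM decoder:
    # it returns one new token id per in-flight request.
    return [random.randint(0, 31999) for _ in batch]

pending = deque(Request(rid=i, max_new_tokens=random.randint(2, 5)) for i in range(6))
in_flight, max_batch = [], 4

while pending or in_flight:
    # Admit new requests into the running batch without waiting for the
    # current ones to finish (this is the "in-flight" part).
    while pending and len(in_flight) < max_batch:
        in_flight.append(pending.popleft())

    # One fused decode step over every active request.
    for req, tok in zip(in_flight, decode_step(in_flight)):
        req.tokens.append(tok)

    # Evict finished requests immediately so their batch slots are reused
    # (ParaInfer-X additionally reorders GPU memory asynchronously here).
    for req in [r for r in in_flight if len(r.tokens) >= r.max_new_tokens]:
        in_flight.remove(req)
        print(f"request {req.rid} finished after {len(req.tokens)} tokens")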