From panda at cse.ohio-state.edu  Wed Nov  8 12:01:19 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Wed, 8 Nov 2023 17:01:19 +0000
Subject: [Hidl-discuss] Join the HiDL team for multiple events at SC '23
In-Reply-To:
References:
Message-ID:

The HiDL team members will be participating in multiple events during the Supercomputing '23 (SC '23) conference.

The Ohio State University (OSU) booth (#1680) will also feature leading speakers from academia (Case Western Reserve University, KAUST-Saudi Arabia, and Univ. of Oregon), national laboratories/centers (ETRI-South Korea, Idaho National Lab, Ohio Supercomputer Center, and San Diego Supercomputer Center), and industry (Broadcom, C-DAC-India, Dell, ParaTools, and X-ScaleSolutions)!!

Join us for these events and talk in person with the project team members and the invited speakers!!

More details of the events are provided at:

http://mvapich.cse.ohio-state.edu/conference/964/talks/

Alternatively, you can use the attached QR code (sc23-qr-code.png) to view the event details.

Pick up a free T-shirt at the OSU booth after attending the events!

Thanks,

The HiDL Team

From panda at cse.ohio-state.edu  Thu Nov  9 14:10:10 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Thu, 9 Nov 2023 19:10:10 +0000
Subject: [Hidl-discuss] Announcing the release of MPI4DL 0.6
Message-ID:

The High-Performance Deep Learning (HiDL) team is pleased to announce the release of MPI4DL 0.6, a distributed and accelerated training framework for very high-resolution images that integrates Spatial Parallelism, Layer Parallelism, Bidirectional Parallelism, and Pipeline Parallelism, with support for the MVAPICH2-GDR high-performance CUDA-aware communication backend.

This library allows MPI-driven converged software infrastructure to extract maximum performance and scalability for AI, Big Data, and Data Science applications and workflows on modern heterogeneous clusters consisting of diverse CPUs, GPUs, and interconnects (InfiniBand, RoCE, Omni-Path, iWARP, and Slingshot).

The new features available with this release of the MPI4DL package are as follows:

* MPI4DL 0.6:
    * Based on PyTorch
    * (NEW) Support for training very high-resolution images
    * Distributed training support for:
        * Layer Parallelism (LP)
        * Pipeline Parallelism (PP)
        * Spatial Parallelism (SP)
        * Spatial and Layer Parallelism (SP+LP)
        * Spatial and Pipeline Parallelism (SP+PP)
        * (NEW) Bidirectional and Layer Parallelism (GEMS+LP)
        * (NEW) Bidirectional and Pipeline Parallelism (GEMS+PP)
        * (NEW) Spatial, Bidirectional, and Layer Parallelism (SP+GEMS+LP)
        * (NEW) Spatial, Bidirectional, and Pipeline Parallelism (SP+GEMS+PP)
    * (NEW) Support for AmoebaNet and ResNet models
    * (NEW) Support for different image sizes and custom datasets
    * Exploits collective features of MVAPICH2-GDR
    * Compatible with:
        * NVIDIA A100 and V100 GPUs
        * CUDA [11.6, 11.7]
        * Python >= 3.8
        * PyTorch [1.12.1, 1.13.1]
        * MVAPICH2-GDR = 2.3.7

The MPI4DL package is open-source and hosted at the following URL:

https://github.com/OSU-Nowlab/MPI4DL

For associated release information, please visit the following URL:

http://hidl.cse.ohio-state.edu

Sample performance numbers for MPI4DL using deep learning application benchmarks can be viewed by visiting the `Performance' tab of the above website.
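
To give a flavor of what Spatial Parallelism (SP) means for very high-resolution images, the short Python sketch below splits one image into a 2x2 grid of tiles, applies a convolution to each tile, and reassembles the partial feature maps. This is a simplified, single-process illustration of the idea only and is not the MPI4DL API; in MPI4DL each tile is placed on a separate GPU/MPI rank and boundary (halo) regions are exchanged through MVAPICH2-GDR so tile-edge outputs match a full-image convolution.

# Toy sketch of the spatial-parallelism idea (not the MPI4DL API).
# A high-resolution image is split into tiles so that no single GPU
# has to hold the full activation maps of the early layers.
import torch
import torch.nn as nn

conv = nn.Conv2d(3, 16, kernel_size=3, padding=1)

# One "very high-resolution" input: (batch, channels, height, width).
image = torch.randn(1, 3, 2048, 2048)

# Partition the image into a 2x2 grid of spatial tiles.
rows = torch.chunk(image, 2, dim=2)                          # split height
tiles = [t for r in rows for t in torch.chunk(r, 2, dim=3)]  # split width

# Each tile would live on its own GPU/MPI rank; here we loop serially.
partial = [conv(t) for t in tiles]

# Reassemble the 2x2 grid of partial feature maps.
top = torch.cat(partial[0:2], dim=3)
bottom = torch.cat(partial[2:4], dim=3)
features = torch.cat([top, bottom], dim=2)

print(features.shape)  # torch.Size([1, 16, 2048, 2048])
# Note: because each tile is convolved independently, values near tile
# borders differ from a full-image convolution; real spatial parallelism
# exchanges halo rows/columns between neighboring ranks to fix this.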

All questions, feedback, and bug reports are welcome. Please post them to hidl-discuss at lists.osu.edu.

Thanks,

The High-Performance Deep Learning (HiDL) Team
http://hidl.cse.ohio-state.edu

PS: The number of organizations using the HiDL stacks has crossed 88 (from 21 countries). The HiDL team would like to thank all its users and organizations!!

From panda at cse.ohio-state.edu  Thu Nov  9 17:42:39 2023
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Thu, 9 Nov 2023 22:42:39 +0000
Subject: [Hidl-discuss] Announcing the release of ParaInfer-X v1.0 for High-Performance Parallel Inference
Message-ID:

The High-Performance Deep Learning (HiDL) team is pleased to announce the release of ParaInfer-X v1.0, a collection of parallel inference techniques that facilitate the deployment of emerging AI models on edge devices and HPC clusters.

This package leverages highly performant GPU kernels that maximize computational throughput, intelligent scheduling strategies that keep the load balanced across resources, and distributed communication libraries that enable seamless data exchange and coordination among distributed systems for large-scale inference.

ParaInfer-X v1.0 introduces a temporal fusion framework, named Flover, that batches multiple requests on the fly during LLM generation (a technique also known as temporal fusion or in-flight batching).

The new features available with this release of the ParaInfer-X package are as follows:

* Based on FasterTransformer
* (NEW) Support for inference of various large language models:
    * (NEW) GPT-J 6B
    * (NEW) LLaMA 7B
    * (NEW) LLaMA 13B
    * (NEW) LLaMA 33B
    * (NEW) LLaMA 65B
* (NEW) Support for persistent model inference streams
* (NEW) Support for temporal fusion/in-flight batching of multiple requests
* (NEW) Support for multi-GPU tensor parallelism
* (NEW) Support for asynchronous memory reordering for evicting finished requests
* (NEW) Support for float32, float16, and bfloat16 model inference
* Compatible with:
    * (NEW) NVIDIA A100 and V100 GPUs
    * (NEW) CUDA [11.2, 11.3, 11.4, 11.6]
    * (NEW) GCC >= 8.5.0
    * (NEW) CMake >= 3.18
    * (NEW) Intel oneTBB >= v2020.0
* (NEW) Customized CUDA kernels
* (NEW) Support for visualization output of inference progress

The ParaInfer-X package is open-source and hosted at the following URL:

https://github.com/OSU-Nowlab/Flover

For associated release information, please visit the following URL:

http://hidl.cse.ohio-state.edu

Sample performance numbers for ParaInfer-X using inference benchmarks can be viewed by visiting the `Performance' tab of the above website.

All questions, feedback, and bug reports are welcome. Please post them to hidl-discuss at lists.osu.edu.

Thanks,

The High-Performance Deep Learning (HiDL) Team
http://hidl.cse.ohio-state.edu

PS: The number of organizations using the HiDL stacks has crossed 88 (from 21 countries). The HiDL team would like to thank all its users and organizations!!
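
For readers curious how the temporal fusion/in-flight batching idea behind Flover works, the toy Python sketch below runs one persistent decode loop that serves many requests at once, admits newly arrived requests between decode steps, and evicts finished ones immediately. It is a conceptual illustration only, not the Flover/ParaInfer-X implementation; the decode_step function is a stand-in for a batched LLM decoder forward pass, and all names are hypothetical.

# Toy sketch of temporal fusion / in-flight batching (not the Flover code).
from collections import deque
from dataclasses import dataclass, field
import random

@dataclass
class Request:
    rid: int                 # request id
    max_new_tokens: int      # this request is done after this many tokens
    tokens: list = field(default_factory=list)

def decode_step(batch):
    # Stand-in for ONE batched forward pass of an LLM decoder:
    # it returns one new token id per in-flight request.
    return [random.randint(0, 31999) for _ in batch]

pending = deque(Request(rid=i, max_new_tokens=random.randint(2, 5)) for i in range(6))
in_flight, max_batch = [], 4

while pending or in_flight:
    # Admit new requests into the running batch without waiting for the
    # current ones to finish (this is the "in-flight" part).
    while pending and len(in_flight) < max_batch:
        in_flight.append(pending.popleft())

    # One fused decode step over every active request.
    for req, tok in zip(in_flight, decode_step(in_flight)):
        req.tokens.append(tok)

    # Evict finished requests immediately so their batch slots are reused
    # (ParaInfer-X additionally reorders GPU memory asynchronously here).
    for req in [r for r in in_flight if len(r.tokens) >= r.max_new_tokens]:
        in_flight.remove(req)
        print(f"request {req.rid} finished after {len(req.tokens)} tokens")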