From panda at cse.ohio-state.edu  Fri May 13 00:42:23 2022
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Fri, 13 May 2022 04:42:23 +0000
Subject: [Mvapich] Save the dates for the 10th Annual MVAPICH User Group Conference (MUG '22)
In-Reply-To:
References:
Message-ID:

We have finalized the dates for the 10th annual MVAPICH User Group Conference (MUG '22). It will be held from August 22nd to 24th. As in previous years, the first day (August 22nd) will feature tutorials by sponsors, and the main conference will take place on August 23rd and 24th. More details are available at

http://mug.mvapich.cse.ohio-state.edu/

At this time, we are planning for the event to be held in person with an option for remote attendance. As in previous years, the in-person event will be held in Columbus, Ohio, USA. We encourage all speakers, and as many attendees as possible, to attend the event in person.

Please save these dates on your calendar. More information about the conference will be provided during the coming weeks.

Looking forward to your participation in MUG '22!!

Thanks,

The MVAPICH Team

From panda at cse.ohio-state.edu  Thu May 19 04:46:36 2022
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Thu, 19 May 2022 08:46:36 +0000
Subject: [Mvapich] Join the MVAPICH team for multiple events at the ISC '22 conference (May 29 - June 2, 2022)
Message-ID:

MVAPICH team members will be participating in multiple events at the upcoming ISC '22 conference, to be held in Hamburg, Germany. Join us for these events and interact with the project team members!! More details of the events are available at:

https://mvapich.cse.ohio-state.edu/conference/896/talks/

Thanks,

The MVAPICH Team

From panda at cse.ohio-state.edu  Sat May 28 07:56:10 2022
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar)
Date: Sat, 28 May 2022 11:56:10 +0000
Subject: [Mvapich] Announcing the release of MVAPICH2-GDR 2.3.7 GA
Message-ID:

The MVAPICH team is pleased to announce the release of MVAPICH2-GDR 2.3.7 GA.

The MVAPICH2-GDR 2.3.7 release incorporates several novel features, as listed below:

* Support for 'on-the-fly' compression of point-to-point messages used for GPU-to-GPU communication on NVIDIA GPUs

* Support for hybrid communication protocols using NCCL-based, CUDA-based, and IB verbs-based primitives for the following MPI blocking and non-blocking collective operations:
    - MPI_Allreduce, MPI_Reduce, MPI_Allgather, MPI_Allgatherv, MPI_Alltoall, MPI_Alltoallv, MPI_Scatter, MPI_Scatterv, MPI_Gather, MPI_Gatherv, and MPI_Bcast
    - MPI_Iallreduce, MPI_Ireduce, MPI_Iallgather, MPI_Iallgatherv, MPI_Ialltoall, MPI_Ialltoallv, MPI_Iscatter, MPI_Iscatterv, MPI_Igather, MPI_Igatherv, and MPI_Ibcast

* Full support for NVIDIA DGX, NVIDIA DGX V-100, NVIDIA DGX A-100, and AMD systems with Mi100 GPUs

MVAPICH2-GDR 2.3.7 provides optimized support at the MPI level for HPC, deep learning, machine learning, and data science workloads. These include efficient large-message collectives (e.g., Allreduce) on CPUs and GPUs, and GPU-Direct algorithms for all collective operations (including those commonly used for model parallelism, e.g., Allgather and Alltoall).

MVAPICH2-GDR 2.3.7 is based on the standard MVAPICH2 2.3.7 release and incorporates designs that take advantage of GPUDirect RDMA (GDR) on NVIDIA GPUs and ROCmRDMA on AMD GPUs for inter-node data movement on GPU clusters with Mellanox InfiniBand interconnects. It also provides support for DGX-2, OpenPOWER with NVLink2, GDRCopy v2, efficient intra-node CUDA-aware unified memory communication, and support for RDMA_CM, RoCE-V1, and RoCE-V2.
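As a quick orientation for readers new to the library, here is a minimal sketch of the CUDA-aware usage model that the point-to-point features above apply to: device pointers obtained from cudaMalloc are passed directly to MPI calls, and the library moves the data GPU-to-GPU internally (via GPUDirect RDMA, with the new on-the-fly compression applied to eligible messages). This is an illustrative sketch, not part of the official release material; the message size and tag are arbitrary, and CUDA error checking is omitted for brevity.

    /* Sketch: GPU-to-GPU point-to-point transfer with a CUDA-aware MPI
     * library such as MVAPICH2-GDR. Device pointers go straight into
     * MPI_Send/MPI_Recv; no staging copy through host memory is needed. */
    #include <mpi.h>
    #include <cuda_runtime.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        if (size < 2) {
            if (rank == 0) fprintf(stderr, "Run with at least 2 ranks\n");
            MPI_Abort(MPI_COMM_WORLD, 1);
        }

        const size_t n = 1 << 20;        /* 1 Mi floats, ~4 MB */
        float *dbuf;                     /* buffer resides on the GPU */
        cudaMalloc((void **)&dbuf, n * sizeof(float));
        cudaMemset(dbuf, 0, n * sizeof(float));

        /* The device pointer is passed directly to MPI. */
        if (rank == 0)
            MPI_Send(dbuf, (int)n, MPI_FLOAT, 1, 42, MPI_COMM_WORLD);
        else if (rank == 1)
            MPI_Recv(dbuf, (int)n, MPI_FLOAT, 0, 42, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);

        cudaFree(dbuf);
        MPI_Finalize();
        return 0;
    }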
Features, Enhancements, and Bug Fixes for MVAPICH2-GDR 2.3.7 GA are listed below.

* Features and Enhancements (since 2.3.6)
    - Enhanced performance for GPU-aware MPI_Alltoall and MPI_Alltoallv
    - Added automatic rebinding of processes to cores based on the GPU NUMA domain
        - Enabled by setting the environment variable MV2_GPU_AUTO_REBIND=1 (a usage sketch appears at the end of this message)
    - Added an NCCL communication substrate for various non-blocking MPI collectives:
        - MPI_Iallreduce, MPI_Ireduce, MPI_Iallgather, MPI_Iallgatherv, MPI_Ialltoall, MPI_Ialltoallv, MPI_Iscatter, MPI_Iscatterv, MPI_Igather, MPI_Igatherv, and MPI_Ibcast
    - Enhanced point-to-point and collective tuning for AMD Milan processors with NVIDIA A-100 and AMD Mi100 GPUs
    - Enhanced point-to-point and collective tuning for NVIDIA DGX A-100 systems
    - Added support for the Cray Slingshot-10 interconnect

Further, MVAPICH2-GDR 2.3.7 GA provides support for GPU clusters using regular OFED (without GPUDirect RDMA).

MVAPICH2-GDR 2.3.7 GA continues to deliver excellent performance. It provides an inter-node Device-to-Device latency of 1.85 microseconds (8 bytes) with CUDA 10.1 and Volta GPUs. On OpenPOWER platforms with NVLink2, it delivers up to 70.4 GBps unidirectional intra-node Device-to-Device bandwidth for large messages. On DGX-2 platforms, it delivers up to 144.79 GBps unidirectional intra-node Device-to-Device bandwidth for large messages. More performance numbers are available from the MVAPICH website (under Performance -> MV2-GDR -> CUDA).

For downloading MVAPICH2-GDR 2.3.7 GA and the associated user guides, please visit the following URL:

http://mvapich.cse.ohio-state.edu

All questions, feedback, bug reports, hints for performance tuning, patches, and enhancements are welcome. Please post them to the mvapich-discuss mailing list (mvapich-discuss at lists.osu.edu).

Thanks,

The MVAPICH Team

PS: We are also happy to inform you that the number of organizations using the MVAPICH2 libraries (and registered at the MVAPICH site) has crossed 3,200 worldwide (in 89 countries). The number of downloads from the MVAPICH site has crossed 1,590,000 (1.59 million). The MVAPICH team would like to thank all its users and organizations!!
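As referenced in the feature list above, here is a minimal sketch of how the new non-blocking GPU collectives and the MV2_GPU_AUTO_REBIND setting might be exercised. The buffer length is an arbitrary illustration value; the choice of substrate (NCCL, CUDA, or IB verbs based) is internal to the library. In the launch line, the hostnames and program name are placeholders, and MV2_USE_CUDA=1 follows the usual MVAPICH2 convention for enabling CUDA-aware communication.

    /* Sketch: non-blocking MPI_Iallreduce on GPU-resident buffers,
     * overlapping the reduction with independent computation. */
    #include <mpi.h>
    #include <cuda_runtime.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        const int n = 1 << 22;           /* 4 Mi floats, ~16 MB */
        float *sendbuf, *recvbuf;        /* both buffers on the GPU */
        cudaMalloc((void **)&sendbuf, n * sizeof(float));
        cudaMalloc((void **)&recvbuf, n * sizeof(float));
        cudaMemset(sendbuf, 0, n * sizeof(float));

        MPI_Request req;
        MPI_Iallreduce(sendbuf, recvbuf, n, MPI_FLOAT, MPI_SUM,
                       MPI_COMM_WORLD, &req);

        /* ... independent work can proceed here ... */

        MPI_Wait(&req, MPI_STATUS_IGNORE);

        cudaFree(sendbuf);
        cudaFree(recvbuf);
        MPI_Finalize();
        return 0;
    }

A possible build and launch, with node01/node02 as placeholder hostnames:

    $ mpicc gpu_iallreduce.c -o gpu_iallreduce -lcudart
    $ mpirun_rsh -np 2 node01 node02 MV2_USE_CUDA=1 MV2_GPU_AUTO_REBIND=1 ./gpu_iallreduce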