From zyou at osc.edu Tue Mar 5 19:39:41 2024 From: zyou at osc.edu (You, Zhi-Qiang) Date: Wed, 6 Mar 2024 00:39:41 +0000 Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack Message-ID: Hello, I am attempting to use Spack 0.21-1 to install mvapich2-gdr at 2.3.7 on the Cardinal cluster. However, it seems that the source file is not available from the URL listed in the package file: http://mvapich.cse.ohio-state.edu/download/mvapich/spack-mirror/mvapich2-gdr/mvapich2-gdr-2.3.6.tar.gz Is permission required to download the source file, or is there another URL available? ZQ -- Zhi-Qiang You Senior Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-8492 Fax: (614) 292-7168 zyou at osc.edu -------------- next part -------------- An HTML attachment was scrubbed... URL:

From lieber.31 at osu.edu Tue Mar 5 20:32:11 2024 From: lieber.31 at osu.edu (Lieber, Matt) Date: Wed, 6 Mar 2024 01:32:11 +0000 Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack In-Reply-To: References: Message-ID: Hello, Since MVAPICH2-GDR is a closed-source project, there is a different set of instructions for installing it through Spack. Instructions on how to install one of the build caches can be found here: https://mvapich.cse.ohio-state.edu/userguide/userguide_spack/#_install_mvapich2_x_gdr Please let me know if you have any other questions. -Matt ________________________________ From: Mvapich-discuss on behalf of You, Zhi-Qiang via Mvapich-discuss Sent: Tuesday, March 5, 2024 7:39 PM To: mvapich-discuss at lists.osu.edu Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack Hello, I am attempting to use Spack 0.21-1 to install mvapich2-gdr at 2.3.7 on the Cardinal cluster. However, it seems that the source file is not available from the URL listed in the package file: http://mvapich.cse.ohio-state.edu/download/mvapich/spack-mirror/mvapich2-gdr/mvapich2-gdr-2.3.6.tar.gz Is permission required to download the source file, or is there another URL available? ZQ -- Zhi-Qiang You Senior Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-8492 Fax: (614) 292-7168 zyou at osc.edu -------------- next part -------------- An HTML attachment was scrubbed... URL:
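For reference, the build-cache route described in that userguide can be sketched with standard Spack commands; the mirror URL below is the spack-mirror path already mentioned in this thread and the spec is only an example, so both should be checked against the linked instructions and against what the cache actually publishes:

  # register the OSU binary mirror (take the exact URL from the userguide)
  spack mirror add mvapich2-gdr http://mvapich.cse.ohio-state.edu/download/mvapich/spack-mirror/mvapich2-gdr
  # import and trust the signing keys shipped with the build cache
  spack buildcache keys --install --trust
  # list the prebuilt specs the cache provides
  spack buildcache list --long --allarch
  # install one of the listed specs without building from source
  spack install --cache-only mvapich2-gdr@2.3.7

On older Spack releases the last step was spelled "spack buildcache install"; either way, only specs that actually appear in the cache listing can be installed this way, which is what the rest of this thread is about.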
From zyou at osc.edu Thu Mar 7 12:01:15 2024 From: zyou at osc.edu (You, Zhi-Qiang) Date: Thu, 7 Mar 2024 17:01:15 +0000 Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack In-Reply-To: References: Message-ID: Hi Matt, Thank you for the information. I've checked the list of build caches, but unfortunately, I couldn't find the spec we require. At OSC, we're currently using Spack as our software management system for the new cluster. The following is the spec I was attempting to obtain: mvapich2-gdr at 2.3.7+core_direct %intel at 2021.10.0 distribution=mofed5.0 pmi_version=pmi2 process_managers=slurm ^cuda at 12.3.2 The architecture is linux-rhel9-sapphirerapid. I'm uncertain if you're able to build this for us. If not, I can proceed with the RPM request. Thank you, ZQ -- Zhi-Qiang You Senior Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-8492 Fax: (614) 292-7168 zyou at osc.edu From: Lieber, Matt Date: Tuesday, March 5, 2024 at 8:32 PM To: You, Zhi-Qiang , Announcement about MVAPICH2 (MPI over InfiniBand, RoCE, Omni-Path, iWARP and EFA) Libraries developed at NBCL/OSU Subject: Re: Install mvapich2-gdr using Spack Hello, Since MVAPICH2-GDR is a closed-source project, there is a different set of instructions for installing it through Spack. Instructions on how to install one of the build caches can be found here: https://mvapich.cse.ohio-state.edu/userguide/userguide_spack/#_install_mvapich2_x_gdr Please let me know if you have any other questions. -Matt ________________________________ From: Mvapich-discuss on behalf of You, Zhi-Qiang via Mvapich-discuss Sent: Tuesday, March 5, 2024 7:39 PM To: mvapich-discuss at lists.osu.edu Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack Hello, I am attempting to use Spack 0.21-1 to install mvapich2-gdr at 2.3.7 on the Cardinal cluster. However, it seems that the source file is not available from the URL listed in the package file: http://mvapich.cse.ohio-state.edu/download/mvapich/spack-mirror/mvapich2-gdr/mvapich2-gdr-2.3.6.tar.gz Is permission required to download the source file, or is there another URL available? ZQ -- Zhi-Qiang You Senior Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-8492 Fax: (614) 292-7168 zyou at osc.edu -------------- next part -------------- An HTML attachment was scrubbed... URL:

From shineman.5 at osu.edu Fri Mar 8 09:22:17 2024 From: shineman.5 at osu.edu (Shineman, Nat) Date: Fri, 8 Mar 2024 14:22:17 +0000 Subject: [Mvapich-discuss] Process placement on Sapphire Rapids In-Reply-To: References: Message-ID: Hi Sam, Can you please try the attached patch on your 2.3.7 code? This should add support for Sapphire Rapids. Please let me know if you have any issues. Thanks, Nat ________________________________ From: Mvapich-discuss on behalf of Khuvis, Samuel via Mvapich-discuss Sent: Monday, February 26, 2024 16:23 To: Mvapich-discuss Subject: Re: [Mvapich-discuss] Process placement on Sapphire Rapids Sorry, here is the output. -- Samuel Khuvis Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-5178 Fax: (614) 292-7168 From: Khuvis, Samuel Date: Monday, February 26, 2024 at 1:21 PM To: Mvapich-discuss Subject: Process placement on Sapphire Rapids Hi, We've just started doing some testing on our new system, Cardinal, with Sapphire Rapids processors, and the process placement seems to be wrong. These tests are for 2.3.7, but we are also working on testing the 3.0 GA. It doesn't look like 2.3.7 lists Sapphire Rapids as a supported architecture. Could this be fixed in a patch? I have attached the output, but let me know what additional information you need from us. Thanks, Samuel Khuvis Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-5178 Fax: (614) 292-7168 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dane-tuning.patch Type: text/x-patch Size: 700459 bytes Desc: dane-tuning.patch URL:
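As a quick sanity check on placement issues like the one above, MVAPICH2 can be asked to print the binding it chose at job launch; the run below is only a sketch that assumes an OSU benchmark binary and a small hostfile, with MV2_SHOW_CPU_BINDING and MV2_CPU_BINDING_POLICY being the runtime parameters documented for the 2.3.x series:

  # print the core map each MPI rank was bound to
  mpirun_rsh -np 4 -hostfile ./hosts MV2_SHOW_CPU_BINDING=1 MV2_CPU_BINDING_POLICY=bunch ./osu_latency
  # compare against the node topology reported by hwloc
  lstopo-no-graphics

If the reported binding disagrees with the Sapphire Rapids topology shown by hwloc, that is the symptom the attached tuning patch is meant to address.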
From chuck at ece.cmu.edu Fri Mar 8 17:08:04 2024 From: chuck at ece.cmu.edu (Chuck Cranor) Date: Fri, 8 Mar 2024 17:08:04 -0500 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error Message-ID:
!-------------------------------------------------------------------|
This Message Is From an External Sender
This message came from outside your organization.
|-------------------------------------------------------------------!
hi-

configure "--with-device=ch3:nemesis" has compile errors introduced in mvapich2-2.3.7-1 (it used to work in mvapich2-2.3.6). The mvapich2 svn trunk from scm.mvapich.cse.ohio-state.edu also fails. I'm compiling on Linux (Ubuntu 20), if it matters. There are two errors, both related to changes made to hostname/address handling:

--------------------------------------------------------------------
The first error is a typo on line 391 of src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c, which references a variable "mpierrno" that does not exist in that file:

src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c: In function 'GetSockInterfaceAddr':
src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c:391:30: error: 'mpierrno' undeclared (first use in this function); did you mean 'mpi_errno'?
  391 | MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER,
      |                      ^~~~~~~~

I believe this helps (mpi_errno contains the return value from a call to getaddrinfo() which returns zero on success):

--- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:34:00.199724176 -0500
+++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:45:13.779688980 -0500
@@ -388,7 +388,7 @@ if (!mpi_errno); break; }
-    MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER,
+    MPIR_ERR_CHKANDJUMP2(mpi_errno != 0, mpi_errno, MPI_ERR_OTHER,
                          "**getaddrinfo", "**getaddrinfo %s %d", ifname_string, mpi_errno);

--------------------------------------------------------------------
The second error is trying to pass a structure as a pointer to the second arg of inet_ntop(int af, void *src, char *dst, socklen_t size):

  CC src/mpid/ch3/channels/nemesis/netmod/tcp/lib_libmpi_la-tcp_getip.lo
src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c: In function 'MPIDI_GetIPInterface':
src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:155:32: error: incompatible type for argument 2 of 'inet_ntop'
  155 | inet_ntop(AF_INET, addr, ip, sizeof(ip));
      |                    ^~~~
      |                    |
      |                    struct in_addr
In file included from src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:49:
/usr/include/arpa/inet.h:64:20: note: expected 'const void * restrict' but argument is of type 'struct in_addr'
   64 | extern const char *inet_ntop (int __af, const void *__restrict __cp,
      |                    ^~~~~~~~~

the fix is to pass the address of the "struct in_addr addr" to inet_ntop() instead of the structure itself:

--- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:34:00.199724176 -0500
+++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:45:29.971880371 -0500
@@ -152,7 +152,7 @@ addr = ((struct sockaddr_in *) &(ifreq->ifr_addr))->sin_addr; if (dbg_ifname) { char* ip[INET_ADDRSTRLEN];
-    inet_ntop(AF_INET, addr, ip, sizeof(ip));
+    inet_ntop(AF_INET, &addr, ip, sizeof(ip));
      fprintf(stdout, "IPv4 address = %08x (%s)\n", addr.s_addr, ip); }

--------------------------------------------------------------------
chuck

From subramoni.1 at osu.edu Sat Mar 9 12:12:17 2024 From: 
subramoni.1 at osu.edu (Subramoni, Hari) Date: Sat, 9 Mar 2024 17:12:17 +0000 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error In-Reply-To: References: Message-ID: Hi, Chuck. The ch3:nemesis has been deprecated. Please use the latest MVAPICH 3.0 for best performance and functionality. Thx, Hari. -----Original Message----- From: Mvapich-discuss On Behalf Of Chuck Cranor via Mvapich-discuss Sent: Friday, March 8, 2024 5:08 PM To: mvapich-discuss at lists.osu.edu Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! hi- configure "--with-device=ch3:nemesis" has compile errors introduced in mvapich2-2.3.7-1 (used to work in mvapich2-2.3.6). The mvapich2 svn trunk from scm.mvapich.cse.ohio-state.edu also fails. I'm compiling on linux ubuntu20, if it matters. There are two errors, both related to changes made to hostname/address handling: -------------------------------------------------------------------- The first error is a typeo in on line 391 of src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c where references a variable "mpierrno" that does not exist in that file: src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c: In function 'GetSockInterfaceAddr': src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c:391:30: error: 'mpierrno' undeclared (first use in this function); did you mean 'mpi_errno'? 391 | MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER, | ^~~~~~~~ I believe this helps (mpi_errno contains the return value from a call to getaddrinfo() which returns zero on success): --- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:34:00.199724176 -0500 +++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:45:13.779688980 -0500 @@ -388,7 +388,7 @@ if (!mpi_errno); break; } - MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER, + MPIR_ERR_CHKANDJUMP2(mpi_errno != 0, mpi_errno, MPI_ERR_OTHER, "**getaddrinfo", "**getaddrinfo %s %d", ifname_string, mpi_errno); -------------------------------------------------------------------- The second error is trying to pass a structure as a pointer to the second arg of inet_ntop(int af, void *src, char *dst, socklen_t size): CC src/mpid/ch3/channels/nemesis/netmod/tcp/lib_libmpi_la-tcp_getip.lo src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c: In function 'MPIDI_GetIPInterface': src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:155:32: error: incompatible type for argument 2 of 'inet_ntop' 155 | inet_ntop(AF_INET, addr, ip, sizeof(ip)); | ^~~~ | | | struct in_addr In file included from src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:49: /usr/include/arpa/inet.h:64:20: note: expected 'const void * restrict' but argument is of type 'struct in_addr' 64 | extern const char *inet_ntop (int __af, const void *__restrict __cp, | ^~~~~~~~~ the fix is to pass the address of the "struct in_addr addr" to inet_notp() instead of the structure itself: --- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:34:00.199724176 -0500 +++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:45:29.971880371 -0500 @@ -152,7 +152,7 @@ addr = ((struct sockaddr_in *) &(ifreq->ifr_addr))->sin_addr; if (dbg_ifname) { char* ip[INET_ADDRSTRLEN]; - inet_ntop(AF_INET, addr, ip, 
sizeof(ip)); + inet_ntop(AF_INET, &addr, ip, sizeof(ip)); fprintf(stdout, "IPv4 address = %08x (%s)\n", addr.s_addr, ip); } -------------------------------------------------------------------- chuck _______________________________________________ Mvapich-discuss mailing list Mvapich-discuss at lists.osu.edu https://lists.osu.edu/mailman/listinfo/mvapich-discuss From chuck at ece.cmu.edu Sat Mar 9 12:58:09 2024 From: chuck at ece.cmu.edu (Chuck Cranor) Date: Sat, 9 Mar 2024 12:58:09 -0500 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error In-Reply-To: References: Message-ID: !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! Hi Hari- We've got an older ~600 node Cray connected by an Intel/Qlogic TrueScale 12800-360 Infiniband switch which uses psm via ch3:psm. The MVAPICH download page indicates we should use legacy 2.3.7 instead of 3.0. is that correct? we sometimes use ch3:nemesis in addition to ch3:psm in order to route MPI over ipoib when we have apps that want to use PSM directly via the libpsm_infinipath.so library. be happy to switch to MVAPICH 3.0 if it now works with psm libpsm_infinipath.so. chuck On Sat, Mar 09, 2024 at 05:12:17PM +0000, Subramoni, Hari wrote: > The ch3:nemesis has been deprecated. Please use the latest MVAPICH 3.0 for best performance and functionality. > > Thx, > Hari. From panda at cse.ohio-state.edu Sat Mar 9 14:30:34 2024 From: panda at cse.ohio-state.edu (Panda, Dhabaleswar) Date: Sat, 9 Mar 2024 19:30:34 +0000 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error In-Reply-To: References: Message-ID: Hi Chuck, Are you able to install Cornelis OPX library over OFI on your system? If so, you should be able to use the latest MVAPICH 3.0 and MVPICH-Plus 3.0 on your system without any problem. Thanks, DK ________________________________________ From: Mvapich-discuss on behalf of Chuck Cranor via Mvapich-discuss Sent: Saturday, March 9, 2024 12:58 PM To: Subramoni, Hari Cc: Announcement about MVAPICH2 (MPI over InfiniBand, RoCE, Omni-Path, iWARP and EFA) Libraries developed at NBCL/OSU Subject: Re: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! Hi Hari- We've got an older ~600 node Cray connected by an Intel/Qlogic TrueScale 12800-360 Infiniband switch which uses psm via ch3:psm. The MVAPICH download page indicates we should use legacy 2.3.7 instead of 3.0. is that correct? we sometimes use ch3:nemesis in addition to ch3:psm in order to route MPI over ipoib when we have apps that want to use PSM directly via the libpsm_infinipath.so library. be happy to switch to MVAPICH 3.0 if it now works with psm libpsm_infinipath.so. chuck On Sat, Mar 09, 2024 at 05:12:17PM +0000, Subramoni, Hari wrote: > The ch3:nemesis has been deprecated. Please use the latest MVAPICH 3.0 for best performance and functionality. > > Thx, > Hari. 
_______________________________________________ Mvapich-discuss mailing list Mvapich-discuss at lists.osu.edu https://lists.osu.edu/mailman/listinfo/mvapich-discuss From panda at cse.ohio-state.edu Sat Mar 9 14:56:05 2024 From: panda at cse.ohio-state.edu (Panda, Dhabaleswar) Date: Sat, 9 Mar 2024 19:56:05 +0000 Subject: [Mvapich-discuss] Announcing the release of MVAPICH-Plus 3.0 GA In-Reply-To: References: Message-ID: The MVAPICH team is pleased to announce the release of MVAPICH-Plus 3.0 GA. The new MVAPICH-Plus series is an advanced version of the MVAPICH MPI library. It is targeted to support unified MVAPICH2-GDR and MVAPICH2-X features. It is also targeted to provide optimized support for modern platforms (CPU, GPU, and interconnects) for HPC, Deep Learning, Machine Learning, Big Data and Data Science applications. The major features and enhancements available in MVAPICH-Plus 3.0 GA are as follows: - Based on MVAPICH 3.0 - Support for various high-performance communication fabrics - InfiniBand, Slingshot-10/11, Omni-Path, OPX, RoCE, and Ethernet - Support naive CPU staging approach for collectives for small messages - Tune naive limits for the following systems - Frontier at OLCF, Pitzer at OSC, Owens at OSC, Ascend at OSC, Frontera at TACC, Lonestar6 at TACC, ThetaGPU at ALCF, Polaris at ALCF, Tioga at LLNL - Initial support for blocking collectives on NVIDIA and AMD GPUs - Allgather, Allgatherv, Allreduce, Alltoall, Alltoallv, Bcast, Gather, Gatherv, Reduce, Reduce_scatter, Scatter, Scatterv, Reduce_local, Reduce_scatter_block - Initial support for non-blocking GPU collectives on NVIDIA and AMD GPUs - Iallgather, Iallgatherv, Iallreduce, Ialltoall, Ialltoallv, Ibcast, Igather, Igatherv, Ireduce, Ireduce_scatter, Iscatter, Iscatterv - Enhanced collective and pt2pt tuning for NVIDIA Grace-Hopper systems - Enhanced collective tuning for NVIDIA V100, A100, H100 GPUs - Enhanced collective tuning for AMD MI100, and MI250x GPUs - Enhanced support for blocking and non-blocking GPU to GPU point-to-point operations on NVIDIA and AMD GPUs taking advantage of: - NVIDIA GDRCopy, AMD LargeBar support - CUDA and ROCM IPC support - Enhanced CPU tuning on various HPC systems and architectures - Stampede3 at TACC, Frontier at OLCF, Lonestar6 at TACC - AMD Rome, AMD Millan, Intel Sapphire Rapids - Tested with - Various HPC applications, mini-applications, and benchmarks - HiDL, MPI4DL, and MCR-DL packages for MPI-driven distributed training - MPI4cuML (a custom cuML package with MPI support) for scalable machine learning - Tested with CUDA <= 12.3 - Tested with ROCM <= 5.6.0 For downloading MVAPICH-Plus 3.0 GA library and associated user guide, please visit the following URL: http://mvapich.cse.ohio-state.edu All questions, feedback, bug reports, hints for performance tuning, patches, and enhancements are welcome. Please post it to the mvapich-discuss mailing list (mvapich-discuss at lists.osu.edu). Thanks, The MVAPICH Team PS: We are also happy to inform that the number of organizations using MVAPICH libraries (and registered at the MVAPICH site) has crossed 3,375 worldwide (in 91 countries). The number of downloads from the MVAPICH site has crossed 1,765,000 (1.765 million). The MVAPICH team would like to thank all its users and organizations!! From duzhuqi at qq.com Sat Mar 9 21:46:27 2024 From: duzhuqi at qq.com (=?gb18030?B?tsXn+Q==?=) Date: Sun, 10 Mar 2024 10:46:27 +0800 Subject: [Mvapich-discuss] When will MVAPICH fully support the MPI4 standard? 
Message-ID: Dear sir, Currently, MPICH has fully supported the MPI4 standard. When will MVAPICH fully support the MPI4 standard? duzhuqi at qq.com -------------- next part -------------- An HTML attachment was scrubbed... URL:

From panda at cse.ohio-state.edu Sun Mar 10 10:12:20 2024 From: panda at cse.ohio-state.edu (Panda, Dhabaleswar) Date: Sun, 10 Mar 2024 14:12:20 +0000 Subject: [Mvapich-discuss] When will MVAPICH fully support the MPI4 standard? In-Reply-To: References: Message-ID: It is being worked out and will be available soon. Please stay tuned. DK ________________________________________ From: Mvapich-discuss on behalf of duzhuqi at qq.com via Mvapich-discuss Sent: Saturday, March 9, 2024 9:46 PM To: mvapich-discuss Subject: [Mvapich-discuss] When will MVAPICH fully support the MPI4 standard? Dear sir, Currently, MPICH has fully supported the MPI4 standard. When will MVAPICH fully support the MPI4 standard? duzhuqi at qq.com

From subramoni.1 at osu.edu Mon Mar 11 11:06:58 2024 From: subramoni.1 at osu.edu (Subramoni, Hari) Date: Mon, 11 Mar 2024 15:06:58 +0000 Subject: [Mvapich-discuss] Announcing the Release of OSU InfiniBand Analysis and Monitoring (INAM) Tool v1.1 Message-ID: The MVAPICH team is pleased to announce the release of OSU InfiniBand Network Analysis and Monitoring (INAM) Tool v1.1. OSU INAM monitors InfiniBand clusters in real time by querying various subnet management entities in the network. It is also capable of interacting with the MVAPICH2-X software stack to gain insights into the communication pattern of the application and classify the data transferred into Point-to-Point, Collective and Remote Memory Access (RMA). OSU INAM can also remotely monitor several parameters of MPI processes in conjunction with MVAPICH2-X.

OSU INAM v1.1 (03/11/2024)
* Major Features & Enhancements (since 1.0):
- Support for ClickHouse Database to support real-time querying and visualization of very large HPC clusters (20,000+ nodes)
- Support for up to 64 parallel insertions for multiple sources of profiling data
- Support for up to 64 concurrent users to access OSU INAM with sub-second latency by using ClickHouse
- Improved stability of OSU INAM operation
- Reduced disk space by using ClickHouse
- Change Default Bulk Insertion Size based on Database used to improve real-time view of network traffic
- Extending notifications to support multiple criteria
* Bug fixes
- Fix issues loading certain switch nickname files
- Fix a bug for showing link level information for live jobs

For downloading OSU INAM v1.1 and associated user guide, please visit the following URL: http://mvapich.cse.ohio-state.edu All questions, feedback, bug reports, hints for performance tuning, and enhancements are welcome. Please post it to the mvapich-discuss mailing list (mvapich-discuss at lists.osu.edu).
Thanks, The MVAPICH Team PS: We are also happy to inform that the number of organizations using MVAPICH libraries (and registered at the MVAPICH site) has crossed 3,375 worldwide (in 91 countries). The number of downloads from the MVAPICH site has crossed 1,765,000 (1.765 million). The MVAPICH team would like to thank all its users and organizations!! -------------- next part -------------- An HTML attachment was scrubbed... URL: From chuck at ece.cmu.edu Mon Mar 11 13:54:57 2024 From: chuck at ece.cmu.edu (Chuck Cranor) Date: Mon, 11 Mar 2024 13:54:57 -0400 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error In-Reply-To: References: Message-ID: !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! hi- I don't think so. OPX appears to be targeting OmniPath adaptors which show up as /dev/hfi1 (e.g. see libfabric's prov/opx/include/opa_service.h). The intent appears to be to replace the PSM2 library with the OPX provider under the libfabric API. We have the previous pre-OmniPath version of the adaptor (the intel/qlogic qle7300 series) which shows up as /dev/ipath0 and uses the PSM library (instead of PSM2). chuck On Sat, Mar 09, 2024 at 07:30:34PM +0000, Panda, Dhabaleswar wrote: > Hi Chuck, > > Are you able to install Cornelis OPX library over OFI on your system? If so, you should be able to use the latest MVAPICH 3.0 and MVPICH-Plus 3.0 on your system without any problem. > > Thanks, From panda at cse.ohio-state.edu Mon Mar 11 18:41:45 2024 From: panda at cse.ohio-state.edu (Panda, Dhabaleswar K.) Date: Mon, 11 Mar 2024 22:41:45 +0000 Subject: [Mvapich-discuss] Save the Dates for MUG '24 Conference Message-ID: We are happy to indicate that the 12th annual MVAPICH User Group (MUG) conference will take place in Columbus, OH, USA during August 19-21, 2024. It will be an in-person event with an option for remote attendance. Please save the dates and stay tuned for future announcements!! More details on the conference are available from http://mug.mvapich.cse.ohio-state.edu/ Thanks, The MUG '24 Organizers PS: Interested in getting announcements related to the MUG events? Please subscribe to the MUG Conference Mailing list (available from the MUG conference page). From antoine.morvan at eviden.com Mon Mar 18 05:53:58 2024 From: antoine.morvan at eviden.com (ANTOINE MORVAN) Date: Mon, 18 Mar 2024 09:53:58 +0000 Subject: [Mvapich-discuss] Configure issue: enables MPI 4 suite when detecting MPI_Session_Init Message-ID: !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! Hello, I was playing with the OSU benchmark 7.3 with Intel OneAPI & Intel MPI 2024 version. By default, the build is failing with some undefined symbols: make[4]: *** Waiting for unfinished jobs.... 
osu_bcast_persistent.o: In function `main':
osu_bcast_persistent.c:(.text+0x3f3): undefined reference to `MPI_Bcast_init'
osu_reduce_persistent.o: In function `main':
osu_reduce_persistent.c:(.text+0x41d): undefined reference to `MPI_Reduce_init'
osu_allreduce_persistent.o: In function `main':
osu_allreduce_persistent.c:(.text+0x41b): undefined reference to `MPI_Allreduce_init'
osu_alltoall_persistent.o: In function `main':
osu_alltoall_persistent.c:(.text+0x4c5): undefined reference to `MPI_Alltoall_init'
osu_gather_persistent.o: In function `main':
osu_gather_persistent.c:(.text+0x43b): undefined reference to `MPI_Gather_init'
osu_scatter_persistent.o: In function `main':
osu_scatter_persistent.c:(.text+0x4a6): undefined reference to `MPI_Scatter_init'
osu_scatter_persistent.c:(.text+0x4f3): undefined reference to `MPI_Scatter_init'

After further investigation, it appears that the configure script of the OSU benchmarks is enabling the MPI 4 benchmarks when detecting MPI_Session_Init available in the MPI implementation. The Intel MPI 2021 update 11 bundled with OneAPI 2024 does provide an implementation for MPI sessions (https://www.intel.com/content/www/us/en/developer/articles/release-notes/mpi-library-release-notes.html) but does not expose all of the MPI 4 API. So, relying on just the sessions would not be enough to ensure the MPI implementation is MPI 4 ready. On top of this, the configure help clearly states that MPI 4 benchmarks are disabled by default: $OSU_SRC/configure --help [...] --enable-mpi4 Enables MPI4 support and features for benchmarks (default is false) So either the help is not up to date (it does enable the MPI 4 benchmarks depending on some configure logic), or the configure script is faulty. Best regards, Antoine Morvan (He/Him) HPC Application Expert - CEPP (Center for Excellence in Performance Programming) - HPC, AI & Quantum GBL M: +33 (0) 6 43 12 98 21 12 F RUE DU PATIS TATELIN - 35700 Rennes - France eviden.com

From purum5548 at konkuk.ac.kr Tue Mar 19 00:01:34 2024 From: purum5548 at konkuk.ac.kr (=?ks_c_5601-1987?B?vK3Hqrin?=) Date: Tue, 19 Mar 2024 04:01:34 +0000 Subject: [Mvapich-discuss] [LiMIC2] Hugepage support patch for LiMIC2(LiMIC2-0.5.8) Message-ID: Dear MVAPICH team. Hi, I would like to submit an updated LiMIC2 module (LiMIC2-0.5.8). This module includes the following implemented features:
* work with transparent huge pages (THP) and hugetlbfs
* reduce per-page overhead (get/release pages, mapping pages) with hugepages
* support for newer kernels
This module is an implementation of the following paper; performance measurements and implementation details are included there. https://dl.acm.org/doi/10.1145/3127024.3127035 Thanks, Purum. -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: limic2-0.5.8.tar Type: application/x-tar Size: 666440 bytes Desc: limic2-0.5.8.tar URL:
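Because the updated module targets both THP and hugetlbfs, it may help to confirm that a node actually has huge pages available before testing it; the commands below are generic Linux administration (run as root), not part of the LiMIC2-0.5.8 tarball itself, and the page count is only an example:

  # check whether transparent huge pages are enabled (always/madvise/never)
  cat /sys/kernel/mm/transparent_hugepage/enabled
  # reserve explicit 2 MB huge pages and expose them through hugetlbfs
  echo 1024 > /proc/sys/vm/nr_hugepages
  mkdir -p /mnt/huge
  mount -t hugetlbfs none /mnt/huge
  # confirm the pool before loading the limic kernel module
  grep Huge /proc/meminfo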
From subramoni.1 at osu.edu Tue Mar 19 17:55:32 2024 From: subramoni.1 at osu.edu (Subramoni, Hari) Date: Tue, 19 Mar 2024 21:55:32 +0000 Subject: [Mvapich-discuss] [LiMIC2] Hugepage support patch for LiMIC2(LiMIC2-0.5.8) In-Reply-To: References: Message-ID: Hi, Purum. Thanks for the patch. We will review it and take it in with an acknowledgement to you. Best, Hari. From: Mvapich-discuss On Behalf Of Purum via Mvapich-discuss Sent: Tuesday, March 19, 2024 12:02 AM To: Mvapich-discuss at lists.osu.edu Cc: jinh at konkuk.ac.kr Subject: [Mvapich-discuss] [LiMIC2] Hugepage support patch for LiMIC2(LiMIC2-0.5.8) Dear MVAPICH team. Hi, I want to submit updated LiMIC2 module.(LiMIC2-0.5.8) This module includes below implemented functions * work with transparent huge page(THP) and hugetlbfs * reduce per page overhead(get/release pages, mapping pages) with hugepages * support for newer kernel This module is implementation of this paper. Performance measurement and implementation details are included in this paper. https://dl.acm.org/doi/10.1145/3127024.3127035 Thanks, Purum. -------------- next part -------------- An HTML attachment was scrubbed... URL:

From subramoni.1 at osu.edu Tue Mar 19 17:57:08 2024 From: subramoni.1 at osu.edu (Subramoni, Hari) Date: Tue, 19 Mar 2024 21:57:08 +0000 Subject: [Mvapich-discuss] Configure issue: enables MPI 4 suite when detecting MPI_Session_Init In-Reply-To: References: Message-ID: Hi, Antoine. Many thanks for reporting this issue to us. Your observation is correct. We will look into this and fix it. It should be available with the next release of OMB. Best, Hari. -----Original Message----- From: Mvapich-discuss On Behalf Of ANTOINE MORVAN via Mvapich-discuss Sent: Monday, March 18, 2024 5:54 AM To: mvapich-discuss at lists.osu.edu Subject: [Mvapich-discuss] Configure issue: enables MPI 4 suite when detecting MPI_Session_Init !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! Hello, I was playing with the OSU benchmark 7.3 with Intel OneAPI & Intel MPI 2024 version. By default, the build is failing with some undefined symbols: make[4]: *** Waiting for unfinished jobs.... osu_bcast_persistent.o: In function `main': osu_bcast_persistent.c:(.text+0x3f3): undefined reference to `MPI_Bcast_init' osu_reduce_persistent.o: In function `main': osu_reduce_persistent.c:(.text+0x41d): undefined reference to `MPI_Reduce_init' osu_allreduce_persistent.o: In function `main': osu_allreduce_persistent.c:(.text+0x41b): undefined reference to `MPI_Allreduce_init' osu_alltoall_persistent.o: In function `main': osu_alltoall_persistent.c:(.text+0x4c5): undefined reference to `MPI_Alltoall_init' osu_gather_persistent.o: In function `main': osu_gather_persistent.c:(.text+0x43b): undefined reference to `MPI_Gather_init' osu_scatter_persistent.o: In function `main': osu_scatter_persistent.c:(.text+0x4a6): undefined reference to `MPI_Scatter_init' osu_scatter_persistent.c:(.text+0x4f3): undefined reference to `MPI_Scatter_init' After further investigation, it appears that the configure script of the OSU benchmarks is enabling the MPI 4 benchmarks when detecting MPI_Session_Init available in the MPI implementation. The Intel MPI 2021 update 11 bundled with OneAPI 2024 does provide an implementation for MPI sessions (https://www.intel.com/content/www/us/en/developer/articles/release-notes/mpi-library-release-notes.html) but does not expose all of the MPI 4 API. So, relying on just the sessions would not be enough to ensure the MPI implementation is MPI 4 ready. On top of this, the configure help clearly states that MPI 4 benchmarks are disabled by default: $OSU_SRC/configure --help [...] --enable-mpi4 Enables MPI4 support and features for benchmarks (default is false) So either the help is not up to date (it does enable the MPI 4 benchmarks depending on some configure logic), or the configure script is faulty. Best regards, Antoine Morvan (He/Him) HPC Application Expert - CEPP (Center for Excellence in Performance Programming) - HPC, AI & Quantum GBL M: +33 (0) 6 43 12 98 21 12 F RUE DU PATIS TATELIN - 35700 Rennes - France eviden.com _______________________________________________ Mvapich-discuss mailing list Mvapich-discuss at lists.osu.edu https://lists.osu.edu/mailman/listinfo/mvapich-discuss
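Until the fixed configure script ships, one workaround worth trying with OMB 7.3 is to pass the negative form of the option quoted above; --disable-mpi4 is simply the autoconf counterpart of --enable-mpi4, and whether it overrides the MPI_Session_init autodetection depends on the very configure logic reported as faulty here, so treat this as an unverified sketch (CC/CXX shown here as the generic Intel MPI compiler wrappers):

  ./configure CC=mpicc CXX=mpicxx --disable-mpi4
  make -j
  make install

If the persistent-collective benchmarks are still compiled in, skipping them at build time, or simply not running them, remains the fallback until the next OMB release.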
From Mahdieh.Ghazimirsaeed at amd.com Wed Mar 27 12:56:26 2024 From: Mahdieh.Ghazimirsaeed at amd.com (Ghazimirsaeed, Mahdieh) Date: Wed, 27 Mar 2024 16:56:26 +0000 Subject: [Mvapich-discuss] Fix for OMB installation with ROCm 6 Message-ID: [AMD Official Use Only - General] Hi, The following error shows up when building the OSU microbenchmarks (7.3) with rocm6.0.2: cannot include hip/hip_runtime_api.h To resolve this issue, I grepped for __HIP_PLATFORM_HCC__ and replaced it with __HIP_PLATFORM_AMD__. Hope this is fixed in upcoming OMB releases. Best, Mahdieh -------------- next part -------------- An HTML attachment was scrubbed... URL:
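For anyone hitting the same build failure before a fixed OMB release is out, the replacement described above can be scripted in one line; the configure switches below are the ROCm-related ones OMB documents, but they should be verified against ./configure --help for the version in hand, and defining the platform macro on the compile line is an additional untested alternative to editing the sources:

  # apply the source edit described in the report
  grep -rl __HIP_PLATFORM_HCC__ . | xargs sed -i 's/__HIP_PLATFORM_HCC__/__HIP_PLATFORM_AMD__/g'
  # rebuild against ROCm 6, optionally defining the AMD platform macro explicitly
  ./configure CC=mpicc CXX=mpicxx --enable-rocm --with-rocm=/opt/rocm CPPFLAGS="-D__HIP_PLATFORM_AMD__"
  make -j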
-Matt ________________________________ From: Mvapich-discuss on behalf of You, Zhi-Qiang via Mvapich-discuss Sent: Tuesday, March 5, 2024 7:39 PM To: mvapich-discuss at lists.osu.edu Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack Hello, I am attempting to use Spack 0.21-1 to install mvapich2-gdr at 2.3.7 on the Cardinal cluster. However, it seems that the source file is not available from the URL listed in the package file: http://mvapich.cse.ohio-state.edu/download/mvapich/spack-mirror/mvapich2-gdr/mvapich2-gdr-2.3.6.tar.gz Is permission required to download the source file, or is there another URL available? ZQ -- Zhi-Qiang You Senior Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-8492 ? Fax: (614) 292-7168 zyou at osc.edu -------------- next part -------------- An HTML attachment was scrubbed... URL: From zyou at osc.edu Thu Mar 7 12:01:15 2024 From: zyou at osc.edu (You, Zhi-Qiang) Date: Thu, 7 Mar 2024 17:01:15 +0000 Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack In-Reply-To: References: Message-ID: Hi Matt, Thank you for the iinformation. I've checked the list of build caches, but unfortunately, I couldn't find the spec we require. At OSC, we're currently using Spack as our software management system for the new cluster. The following is the spec I was attempting to obtain: mvapich2-gdr at 2.3.7+core_direct %intel at 2021.10.0 distribution=mofed5.0 pmi_version=pmi2 process_managers=slurm ^cuda at 12.3.2 The architecture is linux-rhel9-sapphirerapid. I'm uncertain if you're able to build this for us. If not, I can proceed with the RPM request. Thank you, ZQ -- Zhi-Qiang You Senior Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-8492 ? Fax: (614) 292-7168 zyou at osc.edu From: Lieber, Matt Date: Tuesday, March 5, 2024 at 8:32?PM To: You, Zhi-Qiang , Announcement about MVAPICH2 (MPI over InfiniBand, RoCE, Omni-Path, iWARP and EFA) Libraries developed at NBCL/OSU Subject: Re: Install mvapich2-gdr using Spack Hello, Since MVAPICH2-GDR is a closed source project there is a different set of instructions for installing through spack. Instructions on how to install one of the buildcaches can be found here https://mvapich.cse.ohio-state.edu/userguide/userguide_spack/#_install_mvapich2_x_gdr Please let me know if you have any other questions. -Matt ________________________________ From: Mvapich-discuss on behalf of You, Zhi-Qiang via Mvapich-discuss Sent: Tuesday, March 5, 2024 7:39 PM To: mvapich-discuss at lists.osu.edu Subject: [Mvapich-discuss] Install mvapich2-gdr using Spack Hello, I am attempting to use Spack 0.21-1 to install mvapich2-gdr at 2.3.7 on the Cardinal cluster. However, it seems that the source file is not available from the URL listed in the package file: http://mvapich.cse.ohio-state.edu/download/mvapich/spack-mirror/mvapich2-gdr/mvapich2-gdr-2.3.6.tar.gz Is permission required to download the source file, or is there another URL available? ZQ -- Zhi-Qiang You Senior Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-8492 ? Fax: (614) 292-7168 zyou at osc.edu -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From shineman.5 at osu.edu Fri Mar 8 09:22:17 2024 From: shineman.5 at osu.edu (Shineman, Nat) Date: Fri, 8 Mar 2024 14:22:17 +0000 Subject: [Mvapich-discuss] Process placement on Sapphire rapids In-Reply-To: References: Message-ID: Hi Sam, Can you please try the attached patch on your 2.3.7 code? This should add support for Sapphire Rapids. Please let me know if you have any issues. Thanks, Nat ________________________________ From: Mvapich-discuss on behalf of Khuvis, Samuel via Mvapich-discuss Sent: Monday, February 26, 2024 16:23 To: Mvapich-discuss Subject: Re: [Mvapich-discuss] Process placement on Sapphire rapids Sorry, here is the output. -- Samuel Khuvis Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-5178 ? Fax: (614) 292-7168 From: Khuvis, Samuel Date: Monday, February 26, 2024 at 1:21?PM To: Mvapich-discuss Subject: Process placement on Sapphire rapids Hi, We?ve just started doing some testing on our new system, Cardinal, with Sapphire Rapids processors and the process placement seems to be wrong. These tests are for 2.3.7 but we are also working on testing the 3.0ga. It doesn?t look like 2.3.7 lists Sapphire Rapids as a supported architecture. Could this be fixed in a patch? I have attached the output but let me know what additional information you need from us. Thanks, Samuel Khuvis Scientific Applications Engineer Ohio Supercomputer Center (OSC) A member of the Ohio Technology Consortium 1224 Kinnear Road, Columbus, Ohio 43212 Office: (614) 292-5178 ? Fax: (614) 292-7168 -------------- next part -------------- An HTML attachment was scrubbed... URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: dane-tuning.patch Type: text/x-patch Size: 700459 bytes Desc: dane-tuning.patch URL: From chuck at ece.cmu.edu Fri Mar 8 17:08:04 2024 From: chuck at ece.cmu.edu (Chuck Cranor) Date: Fri, 8 Mar 2024 17:08:04 -0500 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error Message-ID: !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! hi- configure "--with-device=ch3:nemesis" has compile errors introduced in mvapich2-2.3.7-1 (used to work in mvapich2-2.3.6). The mvapich2 svn trunk from scm.mvapich.cse.ohio-state.edu also fails. I'm compiling on linux ubuntu20, if it matters. There are two errors, both related to changes made to hostname/address handling: -------------------------------------------------------------------- The first error is a typeo in on line 391 of src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c where references a variable "mpierrno" that does not exist in that file: src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c: In function 'GetSockInterfaceAddr': src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c:391:30: error: 'mpierrno' undeclared (first use in this function); did you mean 'mpi_errno'? 
391 | MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER, | ^~~~~~~~ I believe this helps (mpi_errno contains the return value from a call to getaddrinfo() which returns zero on success): --- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:34:00.199724176 -0500 +++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:45:13.779688980 -0500 @@ -388,7 +388,7 @@ if (!mpi_errno); break; } - MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER, + MPIR_ERR_CHKANDJUMP2(mpi_errno != 0, mpi_errno, MPI_ERR_OTHER, "**getaddrinfo", "**getaddrinfo %s %d", ifname_string, mpi_errno); -------------------------------------------------------------------- The second error is trying to pass a structure as a pointer to the second arg of inet_ntop(int af, void *src, char *dst, socklen_t size): CC src/mpid/ch3/channels/nemesis/netmod/tcp/lib_libmpi_la-tcp_getip.lo src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c: In function 'MPIDI_GetIPInterface': src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:155:32: error: incompatible type for argument 2 of 'inet_ntop' 155 | inet_ntop(AF_INET, addr, ip, sizeof(ip)); | ^~~~ | | | struct in_addr In file included from src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:49: /usr/include/arpa/inet.h:64:20: note: expected 'const void * restrict' but argument is of type 'struct in_addr' 64 | extern const char *inet_ntop (int __af, const void *__restrict __cp, | ^~~~~~~~~ the fix is to pass the address of the "struct in_addr addr" to inet_notp() instead of the structure itself: --- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:34:00.199724176 -0500 +++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:45:29.971880371 -0500 @@ -152,7 +152,7 @@ addr = ((struct sockaddr_in *) &(ifreq->ifr_addr))->sin_addr; if (dbg_ifname) { char* ip[INET_ADDRSTRLEN]; - inet_ntop(AF_INET, addr, ip, sizeof(ip)); + inet_ntop(AF_INET, &addr, ip, sizeof(ip)); fprintf(stdout, "IPv4 address = %08x (%s)\n", addr.s_addr, ip); } -------------------------------------------------------------------- chuck From subramoni.1 at osu.edu Sat Mar 9 12:12:17 2024 From: subramoni.1 at osu.edu (Subramoni, Hari) Date: Sat, 9 Mar 2024 17:12:17 +0000 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error In-Reply-To: References: Message-ID: Hi, Chuck. The ch3:nemesis has been deprecated. Please use the latest MVAPICH 3.0 for best performance and functionality. Thx, Hari. -----Original Message----- From: Mvapich-discuss On Behalf Of Chuck Cranor via Mvapich-discuss Sent: Friday, March 8, 2024 5:08 PM To: mvapich-discuss at lists.osu.edu Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! hi- configure "--with-device=ch3:nemesis" has compile errors introduced in mvapich2-2.3.7-1 (used to work in mvapich2-2.3.6). The mvapich2 svn trunk from scm.mvapich.cse.ohio-state.edu also fails. I'm compiling on linux ubuntu20, if it matters. 
There are two errors, both related to changes made to hostname/address handling: -------------------------------------------------------------------- The first error is a typeo in on line 391 of src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c where references a variable "mpierrno" that does not exist in that file: src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c: In function 'GetSockInterfaceAddr': src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c:391:30: error: 'mpierrno' undeclared (first use in this function); did you mean 'mpi_errno'? 391 | MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER, | ^~~~~~~~ I believe this helps (mpi_errno contains the return value from a call to getaddrinfo() which returns zero on success): --- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:34:00.199724176 -0500 +++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_init.c 2024-03-08 16:45:13.779688980 -0500 @@ -388,7 +388,7 @@ if (!mpi_errno); break; } - MPIR_ERR_CHKANDJUMP2(mpierrno != 0, mpi_errno, MPI_ERR_OTHER, + MPIR_ERR_CHKANDJUMP2(mpi_errno != 0, mpi_errno, MPI_ERR_OTHER, "**getaddrinfo", "**getaddrinfo %s %d", ifname_string, mpi_errno); -------------------------------------------------------------------- The second error is trying to pass a structure as a pointer to the second arg of inet_ntop(int af, void *src, char *dst, socklen_t size): CC src/mpid/ch3/channels/nemesis/netmod/tcp/lib_libmpi_la-tcp_getip.lo src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c: In function 'MPIDI_GetIPInterface': src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:155:32: error: incompatible type for argument 2 of 'inet_ntop' 155 | inet_ntop(AF_INET, addr, ip, sizeof(ip)); | ^~~~ | | | struct in_addr In file included from src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c:49: /usr/include/arpa/inet.h:64:20: note: expected 'const void * restrict' but argument is of type 'struct in_addr' 64 | extern const char *inet_ntop (int __af, const void *__restrict __cp, | ^~~~~~~~~ the fix is to pass the address of the "struct in_addr addr" to inet_notp() instead of the structure itself: --- mvapich2_FAIL/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:34:00.199724176 -0500 +++ mvapich2_FIX/src/mpid/ch3/channels/nemesis/netmod/tcp/tcp_getip.c 2024-03-08 16:45:29.971880371 -0500 @@ -152,7 +152,7 @@ addr = ((struct sockaddr_in *) &(ifreq->ifr_addr))->sin_addr; if (dbg_ifname) { char* ip[INET_ADDRSTRLEN]; - inet_ntop(AF_INET, addr, ip, sizeof(ip)); + inet_ntop(AF_INET, &addr, ip, sizeof(ip)); fprintf(stdout, "IPv4 address = %08x (%s)\n", addr.s_addr, ip); } -------------------------------------------------------------------- chuck _______________________________________________ Mvapich-discuss mailing list Mvapich-discuss at lists.osu.edu https://lists.osu.edu/mailman/listinfo/mvapich-discuss From chuck at ece.cmu.edu Sat Mar 9 12:58:09 2024 From: chuck at ece.cmu.edu (Chuck Cranor) Date: Sat, 9 Mar 2024 12:58:09 -0500 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error In-Reply-To: References: Message-ID: !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! Hi Hari- We've got an older ~600 node Cray connected by an Intel/Qlogic TrueScale 12800-360 Infiniband switch which uses psm via ch3:psm. 
The MVAPICH download page indicates we should use legacy 2.3.7 instead of 3.0. is that correct? we sometimes use ch3:nemesis in addition to ch3:psm in order to route MPI over ipoib when we have apps that want to use PSM directly via the libpsm_infinipath.so library. be happy to switch to MVAPICH 3.0 if it now works with psm libpsm_infinipath.so. chuck On Sat, Mar 09, 2024 at 05:12:17PM +0000, Subramoni, Hari wrote: > The ch3:nemesis has been deprecated. Please use the latest MVAPICH 3.0 for best performance and functionality. > > Thx, > Hari. From panda at cse.ohio-state.edu Sat Mar 9 14:30:34 2024 From: panda at cse.ohio-state.edu (Panda, Dhabaleswar) Date: Sat, 9 Mar 2024 19:30:34 +0000 Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error In-Reply-To: References: Message-ID: Hi Chuck, Are you able to install Cornelis OPX library over OFI on your system? If so, you should be able to use the latest MVAPICH 3.0 and MVPICH-Plus 3.0 on your system without any problem. Thanks, DK ________________________________________ From: Mvapich-discuss on behalf of Chuck Cranor via Mvapich-discuss Sent: Saturday, March 9, 2024 12:58 PM To: Subramoni, Hari Cc: Announcement about MVAPICH2 (MPI over InfiniBand, RoCE, Omni-Path, iWARP and EFA) Libraries developed at NBCL/OSU Subject: Re: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error !-------------------------------------------------------------------| This Message Is From an External Sender This message came from outside your organization. |-------------------------------------------------------------------! Hi Hari- We've got an older ~600 node Cray connected by an Intel/Qlogic TrueScale 12800-360 Infiniband switch which uses psm via ch3:psm. The MVAPICH download page indicates we should use legacy 2.3.7 instead of 3.0. is that correct? we sometimes use ch3:nemesis in addition to ch3:psm in order to route MPI over ipoib when we have apps that want to use PSM directly via the libpsm_infinipath.so library. be happy to switch to MVAPICH 3.0 if it now works with psm libpsm_infinipath.so. chuck On Sat, Mar 09, 2024 at 05:12:17PM +0000, Subramoni, Hari wrote: > The ch3:nemesis has been deprecated. Please use the latest MVAPICH 3.0 for best performance and functionality. > > Thx, > Hari. _______________________________________________ Mvapich-discuss mailing list Mvapich-discuss at lists.osu.edu https://lists.osu.edu/mailman/listinfo/mvapich-discuss From panda at cse.ohio-state.edu Sat Mar 9 14:56:05 2024 From: panda at cse.ohio-state.edu (Panda, Dhabaleswar) Date: Sat, 9 Mar 2024 19:56:05 +0000 Subject: [Mvapich-discuss] Announcing the release of MVAPICH-Plus 3.0 GA In-Reply-To: References: Message-ID: The MVAPICH team is pleased to announce the release of MVAPICH-Plus 3.0 GA. The new MVAPICH-Plus series is an advanced version of the MVAPICH MPI library. It is targeted to support unified MVAPICH2-GDR and MVAPICH2-X features. It is also targeted to provide optimized support for modern platforms (CPU, GPU, and interconnects) for HPC, Deep Learning, Machine Learning, Big Data and Data Science applications. 
The major features and enhancements available in MVAPICH-Plus 3.0 GA are as follows: - Based on MVAPICH 3.0 - Support for various high-performance communication fabrics - InfiniBand, Slingshot-10/11, Omni-Path, OPX, RoCE, and Ethernet - Support naive CPU staging approach for collectives for small messages - Tune naive limits for the following systems - Frontier at OLCF, Pitzer at OSC, Owens at OSC, Ascend at OSC, Frontera at TACC, Lonestar6 at TACC, ThetaGPU at ALCF, Polaris at ALCF, Tioga at LLNL - Initial support for blocking collectives on NVIDIA and AMD GPUs - Allgather, Allgatherv, Allreduce, Alltoall, Alltoallv, Bcast, Gather, Gatherv, Reduce, Reduce_scatter, Scatter, Scatterv, Reduce_local, Reduce_scatter_block - Initial support for non-blocking GPU collectives on NVIDIA and AMD GPUs - Iallgather, Iallgatherv, Iallreduce, Ialltoall, Ialltoallv, Ibcast, Igather, Igatherv, Ireduce, Ireduce_scatter, Iscatter, Iscatterv - Enhanced collective and pt2pt tuning for NVIDIA Grace-Hopper systems - Enhanced collective tuning for NVIDIA V100, A100, H100 GPUs - Enhanced collective tuning for AMD MI100, and MI250x GPUs - Enhanced support for blocking and non-blocking GPU to GPU point-to-point operations on NVIDIA and AMD GPUs taking advantage of: - NVIDIA GDRCopy, AMD LargeBar support - CUDA and ROCM IPC support - Enhanced CPU tuning on various HPC systems and architectures - Stampede3 at TACC, Frontier at OLCF, Lonestar6 at TACC - AMD Rome, AMD Millan, Intel Sapphire Rapids - Tested with - Various HPC applications, mini-applications, and benchmarks - HiDL, MPI4DL, and MCR-DL packages for MPI-driven distributed training - MPI4cuML (a custom cuML package with MPI support) for scalable machine learning - Tested with CUDA <= 12.3 - Tested with ROCM <= 5.6.0 For downloading MVAPICH-Plus 3.0 GA library and associated user guide, please visit the following URL: http://mvapich.cse.ohio-state.edu All questions, feedback, bug reports, hints for performance tuning, patches, and enhancements are welcome. Please post it to the mvapich-discuss mailing list (mvapich-discuss at lists.osu.edu). Thanks, The MVAPICH Team PS: We are also happy to inform that the number of organizations using MVAPICH libraries (and registered at the MVAPICH site) has crossed 3,375 worldwide (in 91 countries). The number of downloads from the MVAPICH site has crossed 1,765,000 (1.765 million). The MVAPICH team would like to thank all its users and organizations!! From duzhuqi at qq.com Sat Mar 9 21:46:27 2024 From: duzhuqi at qq.com (=?gb18030?B?tsXn+Q==?=) Date: Sun, 10 Mar 2024 10:46:27 +0800 Subject: [Mvapich-discuss] When will MVAPICH fully support the MPI4 standard? Message-ID: Dear sir Currently, MPICH has fully supported the MPI4 standard. When will MVAPICH fully support the MPI4 standard? ?? duzhuqi at qq.com   -------------- next part -------------- An HTML attachment was scrubbed... URL: From panda at cse.ohio-state.edu Sun Mar 10 10:12:20 2024 From: panda at cse.ohio-state.edu (Panda, Dhabaleswar) Date: Sun, 10 Mar 2024 14:12:20 +0000 Subject: [Mvapich-discuss] When will MVAPICH fully support the MPI4 standard? In-Reply-To: References: Message-ID: It is being worked out and will be available soon. Please stay tuned. DK ________________________________________ From: Mvapich-discuss on behalf of ?? via Mvapich-discuss Sent: Saturday, March 9, 2024 9:46 PM To: mvapich-discuss Subject: [Mvapich-discuss] When will MVAPICH fully support the MPI4 standard? Dear sir Currently, MPICH has fully supported the MPI4 standard. 
When will MVAPICH fully support the MPI4 standard?

?? duzhuqi at qq.com

From subramoni.1 at osu.edu Mon Mar 11 11:06:58 2024
From: subramoni.1 at osu.edu (Subramoni, Hari)
Date: Mon, 11 Mar 2024 15:06:58 +0000
Subject: [Mvapich-discuss] Announcing the Release of OSU InfiniBand Analysis and Monitoring (INAM) Tool v1.1
Message-ID: 

The MVAPICH team is pleased to announce the release of OSU InfiniBand Network Analysis and Monitoring (INAM) Tool v1.1.

OSU INAM monitors InfiniBand clusters in real time by querying various subnet management entities in the network. It is also capable of interacting with the MVAPICH2-X software stack to gain insights into the communication pattern of the application and classify the data transferred into Point-to-Point, Collective and Remote Memory Access (RMA). OSU INAM can also remotely monitor several parameters of MPI processes in conjunction with MVAPICH2-X.

OSU INAM v1.1 (03/11/2024)

* Major Features & Enhancements (since 1.0):
    - Support for ClickHouse Database to support real-time querying and visualization of very large HPC clusters (20,000+ nodes)
    - Support for up to 64 parallel insertions for multiple sources of profiling data
    - Support for up to 64 concurrent users to access OSU INAM with sub-second latency by using ClickHouse
    - Improved stability of OSU INAM operation
    - Reduced disk space by using ClickHouse
    - Change default bulk insertion size based on the database used to improve the real-time view of network traffic
    - Extend notifications to support multiple criteria

* Bug fixes
    - Fix issues loading certain switch nickname files
    - Fix a bug for showing link-level information for live jobs

For downloading OSU INAM v1.1 and associated user guide, please visit the following URL:

http://mvapich.cse.ohio-state.edu

All questions, feedback, bug reports, hints for performance tuning, and enhancements are welcome. Please post them to the mvapich-discuss mailing list (mvapich-discuss at lists.osu.edu).

Thanks,

The MVAPICH Team

PS: We are also happy to inform you that the number of organizations using MVAPICH libraries (and registered at the MVAPICH site) has crossed 3,375 worldwide (in 91 countries). The number of downloads from the MVAPICH site has crossed 1,765,000 (1.765 million). The MVAPICH team would like to thank all its users and organizations!!

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From chuck at ece.cmu.edu Mon Mar 11 13:54:57 2024
From: chuck at ece.cmu.edu (Chuck Cranor)
Date: Mon, 11 Mar 2024 13:54:57 -0400
Subject: [Mvapich-discuss] ch3:nemesis mvapich2-2.3.7-1 compile error
In-Reply-To: 
References: 
Message-ID: 

!-------------------------------------------------------------------|
This Message Is From an External Sender
This message came from outside your organization.
|-------------------------------------------------------------------!

hi-

I don't think so. OPX appears to be targeting OmniPath adaptors which show up as /dev/hfi1 (e.g.
see libfabric's prov/opx/include/opa_service.h). The intent appears to be to replace the PSM2 library with the OPX provider under the libfabric API. We have the previous pre-OmniPath version of the adaptor (the intel/qlogic qle7300 series) which shows up as /dev/ipath0 and uses the PSM library (instead of PSM2).

chuck

On Sat, Mar 09, 2024 at 07:30:34PM +0000, Panda, Dhabaleswar wrote:
> Hi Chuck,
>
> Are you able to install the Cornelis OPX library over OFI on your system? If so, you should be able to use the latest MVAPICH 3.0 and MVAPICH-Plus 3.0 on your system without any problem.
>
> Thanks,

From panda at cse.ohio-state.edu Mon Mar 11 18:41:45 2024
From: panda at cse.ohio-state.edu (Panda, Dhabaleswar K.)
Date: Mon, 11 Mar 2024 22:41:45 +0000
Subject: [Mvapich-discuss] Save the Dates for MUG '24 Conference
Message-ID: 

We are happy to announce that the 12th annual MVAPICH User Group (MUG) conference will take place in Columbus, OH, USA during August 19-21, 2024. It will be an in-person event with an option for remote attendance. Please save the dates and stay tuned for future announcements!!

More details on the conference are available from

http://mug.mvapich.cse.ohio-state.edu/

Thanks,

The MUG '24 Organizers

PS: Interested in getting announcements related to the MUG events? Please subscribe to the MUG Conference Mailing list (available from the MUG conference page).

From antoine.morvan at eviden.com Mon Mar 18 05:53:58 2024
From: antoine.morvan at eviden.com (ANTOINE MORVAN)
Date: Mon, 18 Mar 2024 09:53:58 +0000
Subject: [Mvapich-discuss] Configure issue: enables MPI 4 suite when detecting MPI_Session_Init
Message-ID: 

!-------------------------------------------------------------------|
This Message Is From an External Sender
This message came from outside your organization.
|-------------------------------------------------------------------!

Hello,

I was playing with the OSU benchmark 7.3 with the Intel OneAPI & Intel MPI 2024 version. By default, the build is failing with some undefined symbols:

make[4]: *** Waiting for unfinished jobs....
osu_bcast_persistent.o: In function `main':
osu_bcast_persistent.c:(.text+0x3f3): undefined reference to `MPI_Bcast_init'
osu_reduce_persistent.o: In function `main':
osu_reduce_persistent.c:(.text+0x41d): undefined reference to `MPI_Reduce_init'
osu_allreduce_persistent.o: In function `main':
osu_allreduce_persistent.c:(.text+0x41b): undefined reference to `MPI_Allreduce_init'
osu_alltoall_persistent.o: In function `main':
osu_alltoall_persistent.c:(.text+0x4c5): undefined reference to `MPI_Alltoall_init'
osu_gather_persistent.o: In function `main':
osu_gather_persistent.c:(.text+0x43b): undefined reference to `MPI_Gather_init'
osu_scatter_persistent.o: In function `main':
osu_scatter_persistent.c:(.text+0x4a6): undefined reference to `MPI_Scatter_init'
osu_scatter_persistent.c:(.text+0x4f3): undefined reference to `MPI_Scatter_init'

After further investigation, it appears that the configure script of the OSU benchmarks enables the MPI 4 benchmarks when it detects that MPI_Session_Init is available in the MPI implementation. The Intel MPI 2021 update 11 bundled with OneAPI 2024 does provide an implementation for MPI sessions (https://www.intel.com/content/www/us/en/developer/articles/release-notes/mpi-library-release-notes.html) but does not expose all of the MPI 4 API.
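
As an illustration, a stricter probe could try to link one of the persistent collectives directly instead of keying off MPI_Session_Init alone. The sketch below is hypothetical (it is not the actual OMB configure logic); it only assumes a working mpicc wrapper in the PATH and uses a throwaway conftest.c file:

cat > conftest.c <<'EOF'
#include <mpi.h>
int main(void)
{
    MPI_Request req;
    /* link test only; this program is never executed */
    MPI_Bcast_init((void *)0, 0, MPI_BYTE, 0, MPI_COMM_WORLD, MPI_INFO_NULL, &req);
    return 0;
}
EOF
# enable the MPI 4 suite only if a persistent collective actually links
if mpicc conftest.c -o conftest >/dev/null 2>&1; then
    echo "MPI_Bcast_init links: the MPI 4 benchmarks can be enabled"
else
    echo "MPI_Bcast_init does not link: keep the MPI 4 benchmarks disabled"
fi
rm -f conftest.c conftest

Such a check would have caught this Intel MPI version, since the failure shows up exactly at link time.
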
So, relying on just the sessions would not be enough to ensure the MPI implementation is MPI 4 ready. On top of this, the configure help clearly states that the MPI 4 benchmarks are disabled by default:

$OSU_SRC/configure --help
[...]
  --enable-mpi4           Enables MPI4 support and features for benchmarks (default is false)

So either the help is not up to date (it does enable MPI 4 depending on some configure logic), or the configure script is faulty.

Best regards,

Antoine Morvan (He/Him)
HPC Application Expert - CEPP (Center for Excellence in Performance Programming) - HPC, AI & Quantum GBL
M: +33 (0) 6 43 12 98 21
12 F RUE DU PATIS TATELIN - 35700 Rennes - France
eviden.com

From purum5548 at konkuk.ac.kr Tue Mar 19 00:01:34 2024
From: purum5548 at konkuk.ac.kr (=?ks_c_5601-1987?B?vK3Hqrin?=)
Date: Tue, 19 Mar 2024 04:01:34 +0000
Subject: [Mvapich-discuss] [LiMIC2] Hugepage support patch for LiMIC2(LiMIC2-0.5.8)
Message-ID: 

Dear MVAPICH team,

Hi, I would like to submit an updated LiMIC2 module (LiMIC2-0.5.8). This module includes the following implemented functions:

* work with transparent huge pages (THP) and hugetlbfs
* reduce per-page overhead (get/release pages, mapping pages) with hugepages
* support for newer kernels

This module is an implementation of the following paper. Performance measurements and implementation details are included in the paper.

https://dl.acm.org/doi/10.1145/3127024.3127035

Thanks,
Purum.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
-------------- next part --------------
A non-text attachment was scrubbed...
Name: limic2-0.5.8.tar
Type: application/x-tar
Size: 666440 bytes
Desc: limic2-0.5.8.tar
URL: 

From subramoni.1 at osu.edu Tue Mar 19 17:55:32 2024
From: subramoni.1 at osu.edu (Subramoni, Hari)
Date: Tue, 19 Mar 2024 21:55:32 +0000
Subject: [Mvapich-discuss] [LiMIC2] Hugepage support patch for LiMIC2(LiMIC2-0.5.8)
In-Reply-To: 
References: 
Message-ID: 

Hi, Purum.

Thanks for the patch. We will review it and take it in with an acknowledgement to you.

Best,
Hari.

From: Mvapich-discuss On Behalf Of ??? via Mvapich-discuss
Sent: Tuesday, March 19, 2024 12:02 AM
To: Mvapich-discuss at lists.osu.edu
Cc: jinh at konkuk.ac.kr
Subject: [Mvapich-discuss] [LiMIC2] Hugepage support patch for LiMIC2(LiMIC2-0.5.8)

Dear MVAPICH team,

Hi, I would like to submit an updated LiMIC2 module (LiMIC2-0.5.8). This module includes the following implemented functions:

* work with transparent huge pages (THP) and hugetlbfs
* reduce per-page overhead (get/release pages, mapping pages) with hugepages
* support for newer kernels

This module is an implementation of the following paper. Performance measurements and implementation details are included in the paper.

https://dl.acm.org/doi/10.1145/3127024.3127035

Thanks,
Purum.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From subramoni.1 at osu.edu Tue Mar 19 17:57:08 2024
From: subramoni.1 at osu.edu (Subramoni, Hari)
Date: Tue, 19 Mar 2024 21:57:08 +0000
Subject: [Mvapich-discuss] Configure issue: enables MPI 4 suite when detecting MPI_Session_Init
In-Reply-To: 
References: 
Message-ID: 

Hi, Antoine.

Many thanks for reporting this issue to us. Your observation is correct. We will look into this and fix it. It should be available with the next release of OMB.

Best,
Hari.

-----Original Message-----
From: Mvapich-discuss On Behalf Of ANTOINE MORVAN via Mvapich-discuss
Sent: Monday, March 18, 2024 5:54 AM
To: mvapich-discuss at lists.osu.edu
Subject: [Mvapich-discuss] Configure issue: enables MPI 4 suite when detecting MPI_Session_Init

!-------------------------------------------------------------------|
This Message Is From an External Sender
This message came from outside your organization.
|-------------------------------------------------------------------!

Hello,

I was playing with the OSU benchmark 7.3 with the Intel OneAPI & Intel MPI 2024 version. By default, the build is failing with some undefined symbols:

make[4]: *** Waiting for unfinished jobs....
osu_bcast_persistent.o: In function `main':
osu_bcast_persistent.c:(.text+0x3f3): undefined reference to `MPI_Bcast_init'
osu_reduce_persistent.o: In function `main':
osu_reduce_persistent.c:(.text+0x41d): undefined reference to `MPI_Reduce_init'
osu_allreduce_persistent.o: In function `main':
osu_allreduce_persistent.c:(.text+0x41b): undefined reference to `MPI_Allreduce_init'
osu_alltoall_persistent.o: In function `main':
osu_alltoall_persistent.c:(.text+0x4c5): undefined reference to `MPI_Alltoall_init'
osu_gather_persistent.o: In function `main':
osu_gather_persistent.c:(.text+0x43b): undefined reference to `MPI_Gather_init'
osu_scatter_persistent.o: In function `main':
osu_scatter_persistent.c:(.text+0x4a6): undefined reference to `MPI_Scatter_init'
osu_scatter_persistent.c:(.text+0x4f3): undefined reference to `MPI_Scatter_init'

After further investigation, it appears that the configure script of the OSU benchmarks enables the MPI 4 benchmarks when it detects that MPI_Session_Init is available in the MPI implementation. The Intel MPI 2021 update 11 bundled with OneAPI 2024 does provide an implementation for MPI sessions (https://www.intel.com/content/www/us/en/developer/articles/release-notes/mpi-library-release-notes.html) but does not expose all of the MPI 4 API.

So, relying on just the sessions would not be enough to ensure the MPI implementation is MPI 4 ready. On top of this, the configure help clearly states that the MPI 4 benchmarks are disabled by default:

$OSU_SRC/configure --help
[...]
  --enable-mpi4           Enables MPI4 support and features for benchmarks (default is false)

So either the help is not up to date (it does enable MPI 4 depending on some configure logic), or the configure script is faulty.
Best regards,

Antoine Morvan (He/Him)
HPC Application Expert - CEPP (Center for Excellence in Performance Programming) - HPC, AI & Quantum GBL
M: +33 (0) 6 43 12 98 21
12 F RUE DU PATIS TATELIN - 35700 Rennes - France
eviden.com
_______________________________________________
Mvapich-discuss mailing list
Mvapich-discuss at lists.osu.edu
https://lists.osu.edu/mailman/listinfo/mvapich-discuss

From Mahdieh.Ghazimirsaeed at amd.com Wed Mar 27 12:56:26 2024
From: Mahdieh.Ghazimirsaeed at amd.com (Ghazimirsaeed, Mahdieh)
Date: Wed, 27 Mar 2024 16:56:26 +0000
Subject: [Mvapich-discuss] Fix for OMB installation with ROCm 6
Message-ID: 

[AMD Official Use Only - General]

Hi,

The following error shows up when building the OSU micro-benchmarks (7.3) with rocm6.0.2:

cannot include hip/hip_runtime_api.h

To resolve this issue, I grepped for __HIP_PLATFORM_HCC__ and replaced it with __HIP_PLATFORM_AMD__. I hope this is fixed in upcoming OMB releases.

Best,
Mahdieh

-------------- next part --------------
An HTML attachment was scrubbed...
URL: 
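
For anyone hitting the same failure before a fixed OMB release is available, the substitution described above can be applied in one pass over the benchmark sources. The following is only a sketch of that manual fix; it assumes GNU grep/sed/xargs and that the command is run from the top of the extracted OMB source directory:

# replace the deprecated HIP platform macro everywhere it appears in the sources
grep -rl '__HIP_PLATFORM_HCC__' . | xargs -r sed -i 's/__HIP_PLATFORM_HCC__/__HIP_PLATFORM_AMD__/g'

After that, re-running configure and make with ROCm 6 should no longer fail on hip/hip_runtime_api.h, assuming the macro was the only issue.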