[Mvapich-discuss] Error compiling with CUDA 11.4 and GNU 11.1.0

Subramoni, Hari subramoni.1 at osu.edu
Tue Aug 16 15:00:01 EDT 2022


Hello, Dr. Levi.

We only have basic support for GPU-enabled clusters in MVAPICH2 2.3.7.

For best performance, functionality, and the latest features on GPU-enabled clusters, we strongly recommend using MVAPICH2-GDR. It is available as an RPM package from our download page.

If you do not find the exact version you’re looking for, kindly fill out this form and we can build it for you.

http://mvapich.cse.ohio-state.edu/GDRform/

Best,
Hari.

From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> On Behalf Of Levi, Mariana via Mvapich-discuss
Sent: Tuesday, August 16, 2022 2:48 PM
To: mvapich-discuss at lists.osu.edu
Subject: [Mvapich-discuss] Error compiling with CUDA 11.4 and GNU 11.1.0

Hi MVAPICH2 team,

I’m trying to install MVAPICH2 2.3.7 with GNU 11.1.0 and CUDA 11.4 on an HPC cluster (CentOS 7). I’m using the following commands:

FFLAGS="-w -fallow-argument-mismatch -O2" ../configure --prefix=/shared/centos7/mvapich2/2.3.7-gcc11.1-cuda11.4 --with-device=ch3:mrail --with-rdma=gen2 --enable-threads=multiple --enable-fortran=all --enable-fast --with-pmi=pmi2 --with-pm=slurm --enable-slurm=yes --with-libcuda=/shared/centos7/cuda/11.4/targets/x86_64-linux/lib/stubs --with-libcudart=/shared/centos7/cuda/11.4/targets/x86_64-linux/lib --with-cuda=/shared/centos7/cuda/11.4 --with-cuda-include=/shared/centos7/cuda/11.4/include --with-cuda-libpath=/shared/centos7/cuda/11.4/lib64

make -j

The system I’m building on has Intel(R) Xeon(R) Gold 6132 CPUs @ 2.60GHz (Skylake_avx512 microarchitecture), NVIDIA V100 GPUs, and an InfiniBand network (Mellanox OFED version 5.3).

Could you please assist with the following errors I’m getting:

../../../../contrib/hwloc_v1/src/topology-opencl.c: In function ‘hwloc_opencl_query_devices’:
../../../../contrib/hwloc_v1/src/topology-opencl.c:108:5: error: unknown type name ‘cl_device_topology_amd’
  108 |     cl_device_topology_amd amdtopo;
      |     ^~~~~~~~~~~~~~~~~~~~~~
../../../../contrib/hwloc_v1/src/topology-opencl.c:171:9: error: ‘CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD’ undeclared (first use in this function); did you mean ‘CL_DEVICE_TOPOLOGY_AMD’?
  171 |     if (CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD != amdtopo.raw.type) {
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |         CL_DEVICE_TOPOLOGY_AMD
../../../../contrib/hwloc_v1/src/topology-opencl.c:171:9: note: each undeclared identifier is reported only once for each function it appears in
../../../../contrib/hwloc_v1/src/topology-opencl.c:171:52: error: request for member ‘raw’ in something not a structure or union
  171 |     if (CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD != amdtopo.raw.type) {
      |                                                    ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:172:53: error: request for member ‘raw’ in something not a structure or union
  172 |       hwloc_debug("not a PCIe device: %u\n", amdtopo.raw.type);
      |                                                     ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:178:40: error: request for member ‘pcie’ in something not a structure or union
  178 |     info->specific.amd.pcibus = amdtopo.pcie.bus;
      |                                        ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:179:40: error: request for member ‘pcie’ in something not a structure or union
  179 |     info->specific.amd.pcidev = amdtopo.pcie.device;
      |                                        ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:180:41: error: request for member ‘pcie’ in something not a structure or union
  180 |     info->specific.amd.pcifunc = amdtopo.pcie.function;
      |                                         ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:183:35: error: request for member ‘pcie’ in something not a structure or union
  183 |                 (unsigned) amdtopo.pcie.bus, (unsigned) amdtopo.pcie.device, (unsigned) amdtopo.pcie.function);
      |                                   ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:183:64: error: request for member ‘pcie’ in something not a structure or union
  183 |                 (unsigned) amdtopo.pcie.bus, (unsigned) amdtopo.pcie.device, (unsigned) amdtopo.pcie.function);
      |                                                                ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:183:96: error: request for member ‘pcie’ in something not a structure or union
  183 |                 (unsigned) amdtopo.pcie.bus, (unsigned) amdtopo.pcie.device, (unsigned) amdtopo.pcie.function);
      |                                                                                                ^
make[3]: *** [topology-opencl.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory `/shared/centos7/mvapich2/src/mvapich2-2.3.7/build-gcc11.1-cuda11.4/contrib/hwloc_v1/src'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/shared/centos7/mvapich2/src/mvapich2-2.3.7/build-gcc11.1-cuda11.4/contrib/hwloc_v1'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/shared/centos7/mvapich2/src/mvapich2-2.3.7/build-gcc11.1-cuda11.4'
make: *** [all] Error 2
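
All of the errors above trace back to the type `cl_device_topology_amd` being undeclared, even though the compiler does see the `CL_DEVICE_TOPOLOGY_AMD` macro. That pattern suggests the OpenCL headers found at build time define the macro but not the AMD topology type that the bundled hwloc expects. A minimal sketch for checking this follows. It assumes the headers in question live under the CUDA include directory passed to configure, and the function name `check_amd_topology_type` is hypothetical, introduced only for illustration:

```shell
# Hypothetical helper (not part of MVAPICH2 or hwloc): print "found" if any
# header under the given include directory mentions the lowercase type name
# cl_device_topology_amd, and "missing" otherwise. The match is
# case-sensitive, so the uppercase CL_DEVICE_TOPOLOGY_AMD macro alone does
# not count as a hit.
check_amd_topology_type() {
    include_dir="$1"
    if grep -rqs "cl_device_topology_amd" "$include_dir"; then
        echo "found"
    else
        echo "missing"
    fi
}
```

For example, `check_amd_topology_type /shared/centos7/cuda/11.4/include` would inspect the directory given to `--with-cuda-include`. If it prints "missing", the header mismatch is a plausible cause. hwloc's own configure script accepts `--disable-opencl`; whether the MVAPICH2 top-level configure forwards that option to the bundled hwloc is an assumption that would need to be verified.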

I’ve also attached the config.log file for reference.

Thanks in advance for your assistance.

Best,

Mariana Levi, Ph.D.
Computational Scientist
Research Computing, Information Technology Services
Northeastern University
617-470-4022
