[Mvapich-discuss] Error compiling with CUDA 11.4 and GNU 11.1.0

Levi, Mariana m.levi at northeastern.edu
Tue Aug 16 14:48:09 EDT 2022


Hi MVAPICH2 team,

I’m trying to install MVAPICH2 2.3.7 with the GNU 11.1.0 compilers and CUDA 11.4 on an HPC cluster (CentOS 7). I’m using the following commands:

FFLAGS="-w -fallow-argument-mismatch -O2" ../configure \
    --prefix=/shared/centos7/mvapich2/2.3.7-gcc11.1-cuda11.4 \
    --with-device=ch3:mrail \
    --with-rdma=gen2 \
    --enable-threads=multiple \
    --enable-fortran=all \
    --enable-fast \
    --with-pmi=pmi2 \
    --with-pm=slurm \
    --enable-slurm=yes \
    --with-libcuda=/shared/centos7/cuda/11.4/targets/x86_64-linux/lib/stubs \
    --with-libcudart=/shared/centos7/cuda/11.4/targets/x86_64-linux/lib \
    --with-cuda=/shared/centos7/cuda/11.4 \
    --with-cuda-include=/shared/centos7/cuda/11.4/include \
    --with-cuda-libpath=/shared/centos7/cuda/11.4/lib64

make -j

The build host has an Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz (Skylake_avx512 microarchitecture), NVIDIA V100 GPUs, and an InfiniBand network (Mellanox OFED version 5.3).

Could you please help with the following errors from make:

../../../../contrib/hwloc_v1/src/topology-opencl.c: In function ‘hwloc_opencl_query_devices’:
../../../../contrib/hwloc_v1/src/topology-opencl.c:108:5: error: unknown type name ‘cl_device_topology_amd’
  108 |     cl_device_topology_amd amdtopo;
      |     ^~~~~~~~~~~~~~~~~~~~~~
../../../../contrib/hwloc_v1/src/topology-opencl.c:171:9: error: ‘CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD’ undeclared (first use in this function); did you mean ‘CL_DEVICE_TOPOLOGY_AMD’?
  171 |     if (CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD != amdtopo.raw.type) {
      |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |         CL_DEVICE_TOPOLOGY_AMD
../../../../contrib/hwloc_v1/src/topology-opencl.c:171:9: note: each undeclared identifier is reported only once for each function it appears in
../../../../contrib/hwloc_v1/src/topology-opencl.c:171:52: error: request for member ‘raw’ in something not a structure or union
  171 |     if (CL_DEVICE_TOPOLOGY_TYPE_PCIE_AMD != amdtopo.raw.type) {
      |                                                    ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:172:53: error: request for member ‘raw’ in something not a structure or union
  172 |       hwloc_debug("not a PCIe device: %u\n", amdtopo.raw.type);
      |                                                     ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:178:40: error: request for member ‘pcie’ in something not a structure or union
  178 |     info->specific.amd.pcibus = amdtopo.pcie.bus;
      |                                        ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:179:40: error: request for member ‘pcie’ in something not a structure or union
  179 |     info->specific.amd.pcidev = amdtopo.pcie.device;
      |                                        ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:180:41: error: request for member ‘pcie’ in something not a structure or union
  180 |     info->specific.amd.pcifunc = amdtopo.pcie.function;
      |                                         ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:183:35: error: request for member ‘pcie’ in something not a structure or union
  183 |                 (unsigned) amdtopo.pcie.bus, (unsigned) amdtopo.pcie.device, (unsigned) amdtopo.pcie.function);
      |                                   ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:183:64: error: request for member ‘pcie’ in something not a structure or union
  183 |                 (unsigned) amdtopo.pcie.bus, (unsigned) amdtopo.pcie.device, (unsigned) amdtopo.pcie.function);
      |                                                                ^
../../../../contrib/hwloc_v1/src/topology-opencl.c:183:96: error: request for member ‘pcie’ in something not a structure or union
  183 |                 (unsigned) amdtopo.pcie.bus, (unsigned) amdtopo.pcie.device, (unsigned) amdtopo.pcie.function);
      |                                                                                                ^
make[3]: *** [topology-opencl.lo] Error 1
make[3]: *** Waiting for unfinished jobs....
make[3]: Leaving directory `/shared/centos7/mvapich2/src/mvapich2-2.3.7/build-gcc11.1-cuda11.4/contrib/hwloc_v1/src'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/shared/centos7/mvapich2/src/mvapich2-2.3.7/build-gcc11.1-cuda11.4/contrib/hwloc_v1'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/shared/centos7/mvapich2/src/mvapich2-2.3.7/build-gcc11.1-cuda11.4'
make: *** [all] Error 2
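
The compiler suggestion in the log (‘CL_DEVICE_TOPOLOGY_AMD’) indicates that the OpenCL headers being picked up define that macro but not the ‘cl_device_topology_amd’ type that topology-opencl.c uses. A minimal sketch of a check for this, assuming GNU grep and that the headers come from the CUDA include directory on the configure line above (the helper name find_amd_topology_type is hypothetical, not part of MVAPICH2 or hwloc):

```shell
#!/bin/sh
# Hypothetical helper: list header files under the given directories that
# mention the type name cl_device_topology_amd (case-sensitive, so the
# CL_DEVICE_TOPOLOGY_AMD macro alone does not match).
find_amd_topology_type() {
    for dir in "$@"; do
        # Skip directories that do not exist on this machine.
        [ -d "$dir" ] || continue
        # -r: recurse, -l: print file names only; restrict to *.h files.
        # "|| true" keeps a no-match result from being treated as a failure.
        grep -rl --include='*.h' 'cl_device_topology_amd' "$dir" 2>/dev/null || true
    done
}

# Assumption: this is the include directory the build sees; adjust as needed.
find_amd_topology_type /shared/centos7/cuda/11.4/include
```

If the call prints nothing, none of the headers under that directory declare the type, which would be consistent with the "unknown type name" error above.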

I’ve also attached the config.log file for reference.

Thanks in advance for your assistance.

Best,

Mariana Levi, Ph.D.
Computational Scientist
Research Computing, Information Technology Services
Northeastern University
617-470-4022

-------------- next part --------------
A non-text attachment was scrubbed...
Name: config.log
Type: application/octet-stream
Size: 630841 bytes
Desc: config.log
URL: <http://lists.osu.edu/pipermail/mvapich-discuss/attachments/20220816/1f3e5c6c/attachment-0014.obj>