[Mvapich-discuss] Cannot build Horovod with mvapich2-gdr

You, Zhi-Qiang zyou at osc.edu
Fri Jul 9 23:33:16 EDT 2021


Hello,

I followed the user guide [1] to build Horovod with MVAPICH2-GDR at OSC and hit the following CMake error:

$ module reset
$ module load mvapich2-gdr/2.3.5 cmake
$ HOROVOD_GPU_OPERATIONS=MPI HOROVOD_CUDA_HOME=$CUDA_HOME HOROVOD_WITH_MPI=1 pip install --no-cache-dir --ignore-installed horovod
[ .. skipped .. ]
-- The CXX compiler identification is GNU 8.4.0
    -- Check for working CXX compiler: /apps/gnu/8.4.0/bin/c++
    -- Check for working CXX compiler: /apps/gnu/8.4.0/bin/c++ - works
    -- Detecting CXX compiler ABI info
    -- Detecting CXX compiler ABI info - done
    -- Detecting CXX compile features
    -- Detecting CXX compile features - done
    -- Build architecture flags: -mf16c -mavx -mfma
    -- Using command /fs/ess/scratch/PZS0710/zyou/tmp/horovod/bin/python3
    CMake Error in /tmp/pip-install-xt_qt24z/horovod_6386bb2d1fc94c1f9518143ed589a550/build/temp.linux-x86_64-3.6/RelWithDebInfo/CMakeFiles/CMakeTmp/CMakeLists.txt:
      Imported target "MPI::MPI_CXX" includes non-existent path

        "/usr/local/cuda-11.0/include"

      in its INTERFACE_INCLUDE_DIRECTORIES.  Possible reasons include:
[ .. skipped .. ]

I traced "/usr/local/cuda-11.0/include" to the flags hard-coded in the MPI compiler wrappers:

$ grep final.*flags= `which mpicxx`
final_cxxflags=" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches   -m64 -mtune=generic "
final_cppflags=" -I/usr/local/cuda-11.0/include  -I/usr/local/cuda-11.0/include"
final_ldflags=" -L/usr/local/lib -lcuda -L/usr/local/cuda-11.0/lib64/stubs -L/usr/local/cuda-11.0/lib64 -lcudart -lrt -lstdc++ -Wl,-rpath,/usr/local/cuda-11.0/lib64 -Wl,-rpath,XORIGIN/placeholder -L/usr/local/software/slurm/current/lib64/  -fPIC -m64 "
    final_ldflags="${final_ldflags} -L/usr/local/cuda-11.0/lib64/ -L/usr/local/lib -lcuda -L/usr/local/cuda-11.0/lib64/stubs -L/usr/local/cuda-11.0/lib64 -lcudart -lrt -lstdc++ -Wl,-rpath,/usr/local/cuda-11.0/lib64 -Wl,-rpath,XORIGIN/placeholder -L/usr/local/software/slurm/current/lib64/  -fPIC -m64 -L/usr/local/software/slurm/current/lib"
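
To make the check above repeatable, here is a minimal sketch of a helper that lists every -I directory in a flags string that does not exist on the current host. The function name missing_include_dirs is hypothetical, and the sed pattern assumes the wrapper is a shell script containing a single-line final_cppflags="..." assignment like the one shown above.

```shell
#!/bin/sh
# Hypothetical helper: print each -I include directory found in a flags
# string that does not exist on this host (one per line, duplicates removed).
missing_include_dirs() {
    # $1 is intentionally unquoted so the flags string is split into tokens.
    for tok in $1; do
        case "$tok" in
            -I?*)
                dir=${tok#-I}
                [ -d "$dir" ] || printf '%s\n' "$dir"
                ;;
        esac
    done | sort -u
}

# Example use: read final_cppflags from the mpicxx wrapper, if one is on PATH.
# Assumption: the wrapper is a shell script with a one-line assignment.
wrapper=$(command -v mpicxx 2>/dev/null) || wrapper=
if [ -n "$wrapper" ]; then
    cppflags=$(sed -n 's/^final_cppflags="\(.*\)"$/\1/p' "$wrapper" | head -n 1)
    missing_include_dirs "$cppflags"
fi
```

Any directory printed by this helper is a candidate for the path that CMake rejects in the INTERFACE_INCLUDE_DIRECTORIES of MPI::MPI_CXX.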

At OSC, we install mvapich2-gdr with the rpm2cpio method. I know that installing with rpm --prefix can fix some of the flags, but there seems to be no way to update the CUDA path automatically. Other than manually editing the flags in the wrappers, is there another way for me to proceed with building Horovod?

-ZQ
