[Mvapich-discuss] Cannot build Horovod with mvapich2-gdr
Anthony, Quentin G.
anthony.301 at buckeyemail.osu.edu
Sat Jul 17 11:01:48 EDT 2021
Hey Zhi-Qiang,
Currently, we recommend manually updating the prefix, exec_prefix, sysconfdir, includedir, and libdir paths in mpicc and mpicxx.
As long as your loaded CUDA module version matches that of the RPM, the CUDA prefix paths should be correct for your system. If that's not the case, let us know and we can generate you a new RPM with the correct CUDA version and prefix.
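As a rough illustration of the manual update described above, here is a minimal sketch. It assumes the wrappers define prefix, exec_prefix, sysconfdir, includedir, and libdir as plain assignments at the start of a line, as MPICH-derived wrappers usually do. The function name fix_wrapper_prefix, the lib64 default, and the example install path are hypothetical; adjust them to your installation. It relies on GNU sed for -i.

```shell
# Hypothetical helper: rewrite the install-path variables in one MPI wrapper.
# Usage: fix_wrapper_prefix <wrapper-file> <new-prefix> [libdir-name]
# Assumes <new-prefix> contains no '|' or '&' characters (they are special to sed).
fix_wrapper_prefix() {
    wrapper=$1
    new_prefix=$2
    libdir_name=${3:-lib64}   # assumption: libraries live under <prefix>/lib64

    # Keep a backup copy next to the wrapper before editing it in place.
    cp -p "$wrapper" "$wrapper.orig" || return 1

    # Rewrite only the five assignment lines; all other lines are left untouched.
    sed -i \
        -e "s|^prefix=.*|prefix=$new_prefix|" \
        -e "s|^exec_prefix=.*|exec_prefix=$new_prefix|" \
        -e "s|^sysconfdir=.*|sysconfdir=$new_prefix/etc|" \
        -e "s|^includedir=.*|includedir=$new_prefix/include|" \
        -e "s|^libdir=.*|libdir=$new_prefix/$libdir_name|" \
        "$wrapper"
}
```

It could be applied to both wrappers with something like `for w in mpicc mpicxx; do fix_wrapper_prefix "/path/to/mvapich2-gdr/bin/$w" /path/to/mvapich2-gdr; done`, where the path is a placeholder for wherever the RPM contents were extracted. The CUDA paths in the final_* flags are deliberately not touched by this sketch.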
Thanks,
-Quentin
________________________________
From: Mvapich-discuss <mvapich-discuss-bounces+anthony.301=osu.edu at lists.osu.edu> on behalf of You, Zhi-Qiang via Mvapich-discuss <mvapich-discuss at lists.osu.edu>
Sent: Friday, July 16, 2021 1:31 PM
To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>
Subject: Re: [Mvapich-discuss] Cannot build Horovod with mvapich2-gdr
Hello, any update?
-ZQ
From: Mvapich-discuss <mvapich-discuss-bounces at lists.osu.edu> on behalf of You, Zhi-Qiang via Mvapich-discuss <mvapich-discuss at lists.osu.edu>
Date: Saturday, July 10, 2021 at 3:13 AM
To: mvapich-discuss at lists.osu.edu <mvapich-discuss at lists.osu.edu>
Subject: [Mvapich-discuss] Cannot build Horovod with mvapich2-gdr
Hello,
I followed the user guide[1] to build horovod with mvapich2-gdr at OSC and got this CMake error:
$ module reset
$ module load mvapich2-gdr/2.3.5 cmake
$ HOROVOD_GPU_OPERATIONS=MPI HOROVOD_CUDA_HOME=$CUDA_HOME HOROVOD_WITH_MPI=1 pip install --no-cache-dir --ignore-installed horovod
[ .. skipped .. ]
-- The CXX compiler identification is GNU 8.4.0
-- Check for working CXX compiler: /apps/gnu/8.4.0/bin/c++
-- Check for working CXX compiler: /apps/gnu/8.4.0/bin/c++ - works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Build architecture flags: -mf16c -mavx -mfma
-- Using command /fs/ess/scratch/PZS0710/zyou/tmp/horovod/bin/python3
CMake Error in /tmp/pip-install-xt_qt24z/horovod_6386bb2d1fc94c1f9518143ed589a550/build/temp.linux-x86_64-3.6/RelWithDebInfo/CMakeFiles/CMakeTmp/CMakeLists.txt:
Imported target "MPI::MPI_CXX" includes non-existent path
"/usr/local/cuda-11.0/include"
in its INTERFACE_INCLUDE_DIRECTORIES. Possible reasons include:
[ .. skipped .. ]
I noticed "/usr/local/cuda-11.0/include" in the flags of the MPI wrappers:
$ grep final.*flags= `which mpicxx`
final_cxxflags=" -O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches -m64 -mtune=generic "
final_cppflags=" -I/usr/local/cuda-11.0/include -I/usr/local/cuda-11.0/include"
final_ldflags=" -L/usr/local/lib -lcuda -L/usr/local/cuda-11.0/lib64/stubs -L/usr/local/cuda-11.0/lib64 -lcudart -lrt -lstdc++ -Wl,-rpath,/usr/local/cuda-11.0/lib64 -Wl,-rpath,XORIGIN/placeholder -L/usr/local/software/slurm/current/lib64/ -fPIC -m64 "
final_ldflags="${final_ldflags} -L/usr/local/cuda-11.0/lib64/ -L/usr/local/lib -lcuda -L/usr/local/cuda-11.0/lib64/stubs -L/usr/local/cuda-11.0/lib64 -lcudart -lrt -lstdc++ -Wl,-rpath,/usr/local/cuda-11.0/lib64 -Wl,-rpath,XORIGIN/placeholder -L/usr/local/software/slurm/current/lib64/ -fPIC -m64 -L/usr/local/software/slurm/current/lib"
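The same check can be done a bit more systematically. The sketch below lists every absolute -I directory hard-coded in a wrapper that does not exist on the current system, which is the condition CMake complains about. The function name missing_include_dirs is hypothetical, and the sketch only looks at literal absolute paths, so include flags built from shell variables are ignored.

```shell
# Hypothetical helper: print each absolute -I directory named in a wrapper
# script that does not exist on this system (one path per line, de-duplicated).
# Usage: missing_include_dirs <wrapper-file>
missing_include_dirs() {
    # Extract tokens such as -I/usr/local/cuda-11.0/include, stopping at a
    # space or double quote, then strip the leading -I.
    grep -o -e '-I/[^ "]*' "$1" | sed 's/^-I//' | sort -u |
    while IFS= read -r dir; do
        [ -d "$dir" ] || printf '%s\n' "$dir"
    done
}
```

For example, `missing_include_dirs "$(which mpicxx)"` should print the CUDA include path above if that directory is absent on the node where it is run.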
At OSC, we use the rpm2cpio method to install mvapich2-gdr. I know that using rpm with --prefix can fix some of the flags, but there seems to be no way to update the CUDA path automatically. Besides manually modifying the flags in the wrappers, is there another way for me to proceed with building Horovod?
-ZQ