[mvapich-discuss] Cannot build MVAPICH 2.3.4 from source: helper_fns.h not found

Subramoni, Hari subramoni.1 at osu.edu
Mon Jul 20 12:24:06 EDT 2020


Hi, Fang.

My pleasure.

It looks like you do have OFED libraries installed on your system. Do you know if it is Mellanox OFED or regular OFED from distros? Can you execute the “ofed_info” command (if it exists) and send us the first line from it? This will tell us the version of Mellanox OFED you have on your system.
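A minimal shell sketch of that check, assuming a POSIX shell. The helper name ofed_first_line and the fallback message are illustrative only and are not part of any OFED tooling:

```shell
# Print the first line of ofed_info output, or a note if the command is absent.
# ofed_first_line is a hypothetical helper name used only for this sketch.
ofed_first_line() {
    if command -v ofed_info >/dev/null 2>&1; then
        ofed_info | head -n 1
    else
        echo "ofed_info not found on PATH"
    fi
}

ofed_first_line
```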

Best,
Hari.

From: Fang, Leo <leofang at bnl.gov>
Sent: Monday, July 20, 2020 12:16 PM
To: Subramoni, Hari <subramoni.1 at osu.edu>
Cc: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
Subject: Re: [mvapich-discuss] Cannot build MVAPICH 2.3.4 from source: helper_fns.h not found

Dear Hari,


Thanks (as always :D) for your prompt replies! The reason I used ch3:nemesis is that on this test machine, which has no IB or any other high-speed network, I got the following error when I set the device to ch3:mrail (or did not set any device):

configure: error: 'libibverbs not found. Did you specify --with-ib-libpath=?'

But I just realized libibverbs still exists, and this works for me:

./configure --enable-cuda --prefix=/home/leofang/.mvapich2-2.3.4_cuda --with-pm=hydra --with-ibverbs-include=/usr/include/infiniband/ --with-ibverbs-lib=/usr/lib/x86_64-linux-gnu/ --disable-mcast

(Yeah, I think I did see a "YACC is not found" error when building, but after I ran "sudo apt install byacc", the scanner.c error showed up in place of the complaint about yacc. It is not reproducible now, though...)

As for not using GDR (which is what I would really like to use in a production environment), the reason is that this machine is missing libibmad.so and libibumad.so, which libmpi.so links against. I'll try looking into that.
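One way to confirm which of the two libraries are missing, sketched under the assumption of a glibc-based Linux system where ldconfig -p lists the dynamic linker cache. The helper name lib_status is hypothetical:

```shell
# Report whether a shared library name appears in the dynamic linker cache.
# lib_status is a hypothetical helper; it assumes glibc's "ldconfig -p".
lib_status() {
    # $1: library name to look for, e.g. libibmad.so (matched as a fixed string)
    if ldconfig -p 2>/dev/null | grep -F "$1" >/dev/null; then
        echo "found: $1"
    else
        echo "missing: $1"
    fi
}

for lib in libibmad.so libibumad.so; do
    lib_status "$lib"
done
```

If ldconfig itself is not on PATH (on some systems it lives in /sbin), the sketch reports every library as missing, so the result should be read with that in mind.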

Thanks again.


Sincerely,
Leo

---
Yao-Lung Leo Fang
Assistant Computational Scientist
Computational Science Initiative
Brookhaven National Laboratory
Bldg. 725, Room 2-169
P.O. Box 5000, Upton, NY 11973-5000
Office: (631) 344-3265
Email: leofang at bnl.gov
Website: https://leofang.github.io/


Subramoni, Hari <subramoni.1 at osu.edu> wrote on July 20, 2020, at 9:49 AM:

Hi, Leo.

The ch3:nemesis branch has been deprecated. Please use the ch3:mrail branch. For running on GPUs, we strongly recommend using MVAPICH2-GDR for best performance and scalability. This is true even on a single-node system, since MVAPICH2-GDR has a lot of advanced support for intra-node point-to-point and collective communication.

The error with scanner.c occurs because lex/yacc are not present on your system. Another user reported this on the discuss list, and we will be resolving it in the next release.
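A quick way to see which lexer and parser generators are installed before running configure, as a sketch. The helper name have_tool is hypothetical, and the list of candidates (flex and bison as common providers of lex and yacc, plus byacc) is an assumption about what a typical Linux system might offer:

```shell
# Check whether a command is available on PATH.
# have_tool is a hypothetical helper name used only for this sketch.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Candidate tools are an assumption; adjust the list for your system.
for tool in lex flex yacc bison byacc; do
    if have_tool "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool"
    fi
done
```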

Best,
Hari.

From: mvapich-discuss-bounces at cse.ohio-state.edu <mvapich-discuss-bounces at mailman.cse.ohio-state.edu> On Behalf Of Fang, Leo
Sent: Monday, July 20, 2020 12:44 AM
To: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
Subject: [mvapich-discuss] Cannot build MVAPICH 2.3.4 from source: helper_fns.h not found

Hello,


I tried to build MVAPICH 2.3.4 from source with the following command:

./configure --enable-cuda --prefix=/home/leofang/.mvapich2-2.3.4_cuda --with-device=ch3:nemesis --with-pm=hydra

But the build failed with "fatal error: helper_fns.h: No such file or directory" when compiling source files from src/mpi/coll/. Occasionally the error is instead "error: src/pm/mpirun/src/hostfile/scanner.c: No such file or directory", but it is not always reproducible.

The machine I’m on has no infiniband or any high-speed network or job scheduler, just an isolated test node with GPUs.

I had a vague impression that this worked in past versions, but I am not 100% sure about the build command I used. Also, a quick search shows that helper_fns.h was newly added in 2.3.4: https://fossies.org/diffs/mvapich2/2.3.3_vs_2.3.4/index.html. Could it be that the configure script failed to copy it to the build path in certain cases?
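To see whether the header exists anywhere in the unpacked source or build tree, a small sketch along these lines may help. The helper name find_header and the SRC_DIR variable are hypothetical, and no particular location for the file is assumed:

```shell
# List every regular file with the given name below a root directory.
# find_header and SRC_DIR are hypothetical names used only for this sketch.
find_header() {
    # $1: root directory to search; $2: file name to look for
    find "$1" -type f -name "$2" -print 2>/dev/null
}

# Defaults to the current directory; "|| true" tolerates unreadable subdirectories.
find_header "${SRC_DIR:-.}" helper_fns.h || true
```

An empty result would suggest the file is absent from the tree being searched, while a hit shows where it lives relative to the include paths used for src/mpi/coll/.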

Thanks.


Sincerely,
Leo

