[mvapich-discuss] Running LQCD chroma with MVAPICH2 encountered errors

Subramoni, Hari subramoni.1 at osu.edu
Tue Sep 10 11:50:26 EDT 2019


Dear Ran,

Sorry to hear that you are facing issues. The error you are seeing appears to be a generic startup-related error. Did you see any other messages on the screen apart from this one?

You seem to be using the basic MVAPICH2 package. For an optimized CUDA-aware MPI solution, we recommend using the MVAPICH2-GDR package. Could you please try it out and see if it works?

This can be downloaded as an RPM from the following page.

http://mvapich.cse.ohio-state.edu/downloads/

The following section of the MVAPICH2-GDR user guide explains how to install the RPM with or without administrative privileges.

http://mvapich.cse.ohio-state.edu/userguide/gdr/#_installing_mvapich2_gdr_library

If you do not see the exact combination of CUDA version, MOFED version, and compiler version you are looking for, please let us know and we can build one for you.

On a slightly different note, we would recommend not enabling the core-direct feature.

Best,
Hari.

From: mvapich-discuss-bounces at cse.ohio-state.edu <mvapich-discuss-bounces at mailman.cse.ohio-state.edu> On Behalf Of Ran Du
Sent: Tuesday, September 10, 2019 5:33 AM
To: mvapich-discuss at cse.ohio-state.edu <mvapich-discuss at mailman.cse.ohio-state.edu>
Subject: [mvapich-discuss] Running LQCD chroma with MVAPICH2 encountered errors

Dear experts,

       We are running the Lattice QCD software Chroma with MVAPICH2 on our GPU cluster. Each worker node has 8 NVIDIA Tesla V100 NVLink GPU cards and two EDR IB cards (100 Gbps) for the interconnect.

       We tried to launch Chroma with and without GDR on 2 worker nodes (16 GPU cards), but neither configuration works. Could you help us find the root cause?

       The detailed configuration and error log files are in the attached file. Thanks a lot.

Kind regards,
Ran

