[mvapich-discuss] mvapich and gromacs4.6

Jiri Kraus jkraus at nvidia.com
Mon Nov 18 09:04:04 EST 2013


Dear Yang,

the difference between mpirun with the machine file and mpirun_rsh is that the machine file contains each node only once, so you are only starting 2 processes. If you change your machine file to node2, node2, node3, node3 and start with

mpirun -machinefile nodes -np 4 mdrun_mpi

it should behave like

mpirun_rsh -np 4 node2 node2 node3 node3 mdrun_mpi
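
For reference, a machine file for this layout simply lists one host name per line, repeated once per process that should run on that host (a sketch, assuming the file is named nodes):

$ cat nodes
node2
node2
node3
node3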

Regarding the environment variable you should specify for GROMACS: I am sorry, but I do not get your question. In general I would say that the GROMACS mailing lists can help you better with starting GROMACS than this list.

Hope this helps

Jiri

Sent from my Nexus 7, I apologize for spelling errors and auto correction typos.

-----Original Message-----
From: hpc at lzu.edu.cn [hpc at lzu.edu.cn]
Received: Monday, 18 Nov. 2013, 1:05
To: Jiri Kraus [jkraus at nvidia.com]
CC: sreeram potluri [potluri.2 at osu.edu]; mvapich-discuss [mvapich-discuss at cse.ohio-state.edu]
Subject: Re: AW: [mvapich-discuss] mvapich and gromacs4.6

Dear Jiri

    My error was caused because I used mpirun -machinefile nodes -np 2 mdrun_mpi, where the nodes file contained node2 and node3. I changed it to mpirun_rsh -np 4 node2 node2 node3 node3 mdrun_mpi, and now it works. I do not know what the difference is between mpirun and mpirun_rsh.

   I have a question: when I compile MVAPICH with InfiniBand, do I need to specify any parameters? My InfiniBand is a default installation. Once MVAPICH is installed, what environment variables should I specify for GROMACS? Thanks!


-----Original Message-----
From: "Jiri Kraus" <jkraus at nvidia.com>
Sent: 2013-11-18 12:01:16 (Monday)
To: "hpc at lzu.edu.cn" <hpc at lzu.edu.cn>
Cc: "sreeram potluri" <potluri.2 at osu.edu>, mvapich-discuss <mvapich-discuss at cse.ohio-state.edu>
Subject: AW: [mvapich-discuss] mvapich and gromacs4.6

Dear Yang,

the error message

"Incorrect launch configuration: mismatching number of PP MPI processes and GPUs per node.
mdrun_mpi was started with 1 PP MPI process per node, but you provided 2 GPUs."

indicates that you are asking GROMACS to use the two GPUs in the node but are only starting one MPI process per node. If you use the mpirun_rsh launcher, you can start two processes per node by specifying each node that should be used twice. E.g. if you want to run with 4 processes on 2 nodes which are named node0 and node1, do

mpirun_rsh -np 4 node0 node0 node1 node1 mdrun_mpi ...

If you use another launcher, and depending on how your MVAPICH build is integrated with a possibly running batch system, the procedure might be different.
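
As a sketch of an equivalent way to express the same layout, mpirun_rsh can also read the host list from a file via its -hostfile option (assuming a file named hosts that lists node0 and node1 twice each):

mpirun_rsh -np 4 -hostfile hosts mdrun_mpi ...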

If that does not fix the issue, please send the complete command line with which you try to start GROMACS, e.g.

mpirun_rsh -np 8 mdrun_mpi -v

Jiri

Sent from my smartphone, I apologize for spelling errors and auto correction typos.

----- Reply message -----
Von: "hpc at lzu.edu.cn<mailto:hpc at lzu.edu.cn>" <hpc at lzu.edu.cn<mailto:hpc at lzu.edu.cn>>
An: "Jiri Kraus" <jkraus at nvidia.com<mailto:jkraus at nvidia.com>>
Cc: "sreeram potluri" <potluri.2 at osu.edu<mailto:potluri.2 at osu.edu>>, "mvapich-discuss" <mvapich-discuss at cse.ohio-state.edu<mailto:mvapich-discuss at cse.ohio-state.edu>>
Betreff: [mvapich-discuss] mvapich and gromacs4.6
Datum: So., Nov 17, 2013 18:24


Dear Jiri

    Thanks! The output error is:

Error on node 0, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 0 out of 2

gcq#319: "Your Country Raised You, Your Country Fed You, and Just Like Any Other Country it Will Break You" (Gogol Bordello)

[cli_0]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 0

-------------------------------------------------------
Program mdrun_mpi, VERSION 4.6.3
Source code file: /home/*/gromacs-4.6.3/src/gmxlib/gmx_detect_hardware.c, line: 349

Fatal error:
Incorrect launch configuration: mismatching number of PP MPI processes and GPUs per node.
mdrun_mpi was started with 1 PP MPI process per node, but you provided 2 GPUs.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

"We Can Dance Like Iggy Pop" (Red Hot Chili Peppers)

Error on node 1, will try to stop all the nodes
Halting parallel program mdrun_mpi on CPU 1 out of 2

gcq#5: "We Can Dance Like Iggy Pop" (Red Hot Chili Peppers)

[cli_1]: aborting job:
application called MPI_Abort(MPI_COMM_WORLD, -1) - process 1



-----Original Message-----
From: "Jiri Kraus" <jkraus at nvidia.com>
Sent: 2013-11-17 11:20:48 (Sunday)
To: "sreeram potluri" <potluri.2 at osu.edu>, "hpc at lzu.edu.cn" <hpc at lzu.edu.cn>
Cc: mvapich-discuss <mvapich-discuss at cse.ohio-state.edu>
Subject: AW: [mvapich-discuss] mvapich and gromacs4.6

Dear Yang,

Please also add the output of mdrun. GROMACS supports one GPU per MPI rank, so you need to start as many ranks per node as there are GPUs in a node to utilize all GPUs.
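
For illustration, a sketch assuming two nodes named node0 and node1 with two GPUs (ids 0 and 1) each; GROMACS 4.6 can additionally be told which GPU each PP rank on a node should use via mdrun's -gpu_id option (the mapping string 01 below assumes GPU ids 0 and 1):

mpirun_rsh -np 4 node0 node0 node1 node1 mdrun_mpi -gpu_id 01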

Hope this helps

Jiri

Sent from my smartphone, I apologize for spelling errors and auto correction typos.

----- Reply message -----
Von: "sreeram potluri" <potluri.2 at osu.edu<mailto:potluri.2 at osu.edu>>
An: "hpc at lzu.edu.cn<mailto:hpc at lzu.edu.cn>" <hpc at lzu.edu.cn<mailto:hpc at lzu.edu.cn>>
Cc: "mvapich-discuss" <mvapich-discuss at cse.ohio-state.edu<mailto:mvapich-discuss at cse.ohio-state.edu>>
Betreff: [mvapich-discuss] mvapich and gromacs4.6
Datum: Sa., Nov 16, 2013 18:11

Dear Yang,

To make sure I understand you correctly, you are seeing an issue when running Gromacs across two or more GPUs on the same node, using MVAPICH2?

MVAPICH2 does have support for GPU-GPU MPI communication on multi-GPU nodes. Can you give some more information about the issue, as listed below?

- the configure options you used to build the library
- the command you are using for the run
- a backtrace, if it's a hang, or the error, if it's a failure
- configuration of the nodes

This will help us see what could be going on.
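
For reference, a CUDA-enabled MVAPICH2 build is typically configured roughly as follows (a sketch; the CUDA install path is an assumption and depends on your system), and GPU-to-GPU communication additionally needs MV2_USE_CUDA=1 set at run time:

./configure --enable-cuda --with-cuda=/usr/local/cuda
make && make install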

Best
Sreeram Potluri


On Fri, Nov 15, 2013 at 7:28 AM, <hpc at lzu.edu.cn> wrote:
Dear All

   I want to install gromacs4.6 on an InfiniBand network. I installed mvapich2-2.0a, and gromacs4.6 supports GPUs, but when I run mpirun with mvapich I find it only uses one GPU; if I use two or more GPUs, gromacs does not run. Has anyone met this problem? openmpi supports multi-GPU. I do not know whether mvapich only supports one GPU, or maybe my install process has a problem. If anyone has a suggestion for me, I would greatly appreciate it. Thanks!

Best wishes


yang






