[mvapich-discuss] (no subject)

Jonathan Perkins perkinjo at cse.ohio-state.edu
Wed Feb 17 14:33:43 EST 2016


Hello Justin.  The package that you're using requires an HCA to use the GDR
features.  We do provide a package that does not require an HCA for all the
GDR features, but unfortunately it is only available for rhel7 at this
time.  Can you try this package on your system?  If it works, we'll
update the site to point out that it can be used with rhel6 (rhel/centos)
as well.

http://mvapich.cse.ohio-state.edu/download/mvapich/gdr/2.2b/mvapich2-gdr-regular-ofed-2.2b-cuda-7.5.tar.gz
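
When trying the package, it may help to run the job under a timeout so
that a hang at MPI_Init shows up as a clear result instead of a stuck
terminal.  The following is only a minimal sketch: the helper name
run_with_env and the 60 second default are made up for illustration, and
only MV2_USE_CUDA and MV2_GPUDIRECT_GDRCOPY_LIB come from this thread.

```python
import os
import subprocess
import sys


def run_with_env(cmd, extra_env, timeout_s=60):
    """Run cmd with extra environment variables set.

    Returns "ok" on exit code 0, "failed" on a non-zero exit code,
    and "hang" if the command does not finish within timeout_s seconds.
    run_with_env is a hypothetical helper, not part of MVAPICH2.
    """
    env = dict(os.environ)   # start from the current environment
    env.update(extra_env)    # e.g. {"MV2_USE_CUDA": "1"}
    try:
        result = subprocess.run(cmd, env=env, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the launcher process it started; any ranks
        # the launcher spawned may still need to be cleaned up by hand.
        return "hang"
    return "ok" if result.returncode == 0 else "failed"
```

For example, assuming the job is started with mpiexec and the binary is
called ./your_app (both placeholders), calling
run_with_env(["mpiexec", "-n", "2", "./your_app"], {"MV2_USE_CUDA": "1"})
once with and once without the variable would show whether the hang
still depends on MV2_USE_CUDA.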

On Wed, Feb 17, 2016 at 1:21 PM Justin Luitjens <jluitjens at nvidia.com>
wrote:

> I'm seeing an odd issue with mvapich2.2b-gdr.  I have installed the RPM
> package on CentOS 6.4 and I have also installed the GDRCopy library.
>
> I have set the MV2_GPUDIRECT_GDRCOPY_LIB variable and the code runs fine
> (I'm not using GPUDirect at the moment).
>
> However, as soon as I set the environment variable MV2_USE_CUDA=1 my
> process hangs at MPI_Init.
>
> For reference I'm only running on a single node which does not have an
> InfiniBand NIC on it.  I only want to run on a single node and never need
> to go cross-node.
>
> Does anyone have a workaround for this or know what might be going wrong?
>
> Thanks,
> Justin