[mvapich-discuss] program hanged using mvapich with large number of processes

Jonathan Perkins perkinjo at cse.ohio-state.edu
Thu Jan 28 08:25:52 EST 2010


On Thu, Jan 28, 2010 at 07:11:27PM +0800, Weimin Wang wrote:
> On Sat, Jan 23, 2010 at 10:18 PM, Dhabaleswar Panda wrote:
> > You are using the uDAPL interface of MVAPICH2 stack. All our designs and
> > developments with latest features are taking place on the
> > most-commonly-used OpenFabrics-Gen2 (IB/iWARP) interface. You should start
> > using this interface to get the best performance and scalability on your
> > cluster. You can use this interface and let us know whether you see the
> > problem or not.
>
> Hello, Dhabaleswar,
> I have solved the problem by following your advice. I recompiled mvapich2
> with the gen2 option and can now start 80 processes. Thank you very much!
> 
> However, I have another question. When I run the job on node73, which is
> the user login node, everything is fine. However, I get an error on the
> other nodes:
> 
> wmwang@node2:~/test> /data02/home/wmwang/test/mvapich2/bin/mpicc -o cpi cpi.c
> /usr/bin/ld: cannot find -lrdmacm
> 
> This may be because librdmacm is not in a public directory on all nodes.
> I have installed librdmacm in the public directory. Here is my question:
> how can I add library directories for mvapich2?

If librdmacm is installed in the same location on every node as on the
node where you built mvapich2, this problem should not occur.  Check
that the librdmacm.so* files exist on the other machines.
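
One rough way to do that check is sketched below.  The helper name
find_librdmacm and the directory list are only assumptions for
illustration (they are not part of mvapich2 or OFED); substitute the
directories that apply to your installation.

```shell
#!/bin/sh
# Hypothetical helper: print every librdmacm shared object found in the
# directories given as arguments.
find_librdmacm() {
    for dir in "$@"; do
        for f in "$dir"/librdmacm.so*; do
            # An unmatched glob stays literal, so confirm the file exists.
            if [ -e "$f" ]; then
                echo "$f"
            fi
        done
    done
    return 0
}

# Assumed search locations; adjust them to match your OFED install.
find_librdmacm /usr/lib /usr/lib64 /opt/ofed/lib64
```

Run it on a node where mpicc works and on one where it fails (node2 in
your case) and compare the output.  Note that the linker resolves
-lrdmacm by looking for a file named librdmacm.so (or librdmacm.a), so
a node that only has versioned files such as librdmacm.so.1 will still
fail at link time.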

To answer your question: you can pass the `--with-ib-include' and
`--with-ib-libpath' options at configure time if you need to use
InfiniBand libraries that are not in the default system locations.

Example:
    ./configure --with-ib-include=/opt/ofed/include --with-ib-libpath=/opt/ofed/lib64

-- 
Jonathan Perkins
http://www.cse.ohio-state.edu/~perkinjo