[mvapich-discuss] Queue pair usage in mvapich versions

Matthew Koop koop at cse.ohio-state.edu
Mon Jan 12 15:21:30 EST 2009


Sriram,

By default MVAPICH will set up connections "on-demand" when there are more
than 64 processes in a job, so at your scale QPs should only be created
when they are needed. Any pair of processes that communicates directly
will still need a QP between them, though: if a process communicates
directly with 'n' peer ranks, it will create 'n' QPs.

How many processes do you have per node? In the past (I'm not sure whether
the default has changed by now), the driver had a limit of 64K QPs per
HCA. Since each process needs a QP to every off-node peer it talks to, an
Alltoall on 8K processes with 8 processes per node would hit that limit
(16 per node would hit it even sooner). You may want to update your OFED
installation if it is an older version.
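
To make the arithmetic concrete, here is a rough back-of-the-envelope
sketch (illustrative only, assuming one RC QP per off-node peer and shared
memory for intra-node traffic):

    /* Rough estimate of RC QPs needed per HCA for a fully connected
     * pattern such as MPI_Alltoall. Assumes one QP per off-node peer
     * and shared memory for intra-node communication. */
    #include <stdio.h>

    static long qps_per_hca(long np, long ppn)
    {
        /* each of the ppn local ranks connects to every off-node rank */
        return ppn * (np - ppn);
    }

    int main(void)
    {
        printf("8192 ranks,  8 per node: %ld QPs\n", qps_per_hca(8192, 8));
        printf("8192 ranks, 16 per node: %ld QPs\n", qps_per_hca(8192, 16));
        return 0;
    }

With 8 processes per node that is about 65,472 QPs, essentially at the 64K
(65,536) limit; with 16 per node it is roughly double.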

If you need extreme scalability, you can try the ch_hybrid device of
MVAPICH, which uses InfiniBand's UD transport and needs very few QPs
since UD is connectionless.

Additionally, what is the maximum lockable memory on the nodes? (ulimit -l)
QPs must live in pinned memory, so if that limit is not high enough, QP
creation can also fail.
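
If it helps, a minimal sketch like the following (plain POSIX getrlimit,
nothing MVAPICH-specific) reports the same limit that "ulimit -l" shows.
If it is low, it can usually be raised via the memlock entries in
/etc/security/limits.conf.

    /* Minimal sketch: print the locked-memory limit that pinned QP/CQ
     * allocations must fit under (the value "ulimit -l" reports). */
    #include <stdio.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit rl;
        if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0) {
            perror("getrlimit");
            return 1;
        }
        if (rl.rlim_cur == RLIM_INFINITY)
            printf("max locked memory: unlimited\n");
        else
            printf("max locked memory: %lu KB\n",
                   (unsigned long)(rl.rlim_cur / 1024));
        return 0;
    }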

Matt

On Mon, 12 Jan 2009, Krishnamoorthy, Sriram wrote:

> Is there a way to query or control the number of queue pairs used by
> mvapich/mvapich2 (in versions 1.1, 1.0.1, and 1.2) per SMP node?
>
> If it is a fixed expression (either all initialized at start-up or
> created later on-demand), could you please provide it or point me to a
> reference?
>
> I am trying to create 2*p queue pairs per SMP node, after initializing
> MPI. Beyond 8192 processes, this is failing in the queue pair create
> call.
>
> Please cc my email id, as I am not subscribed to the mailing list.
>
> Thanks in advance,
> Sriram.K
>


