[mvapich-discuss] EP attributes' values
Lei Chai
chai.15 at osu.edu
Mon Jul 7 14:45:01 EDT 2008
Hi Jasjit,
Thanks for using mvapich2. I believe you are using the uDAPL interface.
When the number of processes is larger than 64, the on-demand connection
establishment model is used for better scalability, and the attribute
values therefore differ. If this causes problems on your stack, you can
disable on-demand connections by setting the threshold to a value larger
than the number of processes, e.g.:
$ mpiexec -n 64 -env MV2_ON_DEMAND_THRESHOLD 1024 ./a.out
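For convenience, the variable can also be exported in the shell before launching. This is a minimal sketch; the value 1024 is just an example and should simply exceed your total process count, and depending on your process manager you may still need to pass the variable explicitly with -env as shown above:

```shell
# Any threshold larger than the total number of MPI processes
# effectively disables the on-demand connection model.
export MV2_ON_DEMAND_THRESHOLD=1024
mpiexec -n 128 ./a.out
```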
FYI, since the uDAPL interface in mvapich2 does not yet support the
blocking progress mode, over-subscription will not be beneficial.
If you are using InfiniBand as the network, we recommend the OFED
interface in mvapich2, which provides the best performance, scalability,
and features such as blocking mode for over-subscription. The latest
release is mvapich2-1.2rc1.
Lei
jasjit singh wrote:
> Hi
>
> I am using mvapich2-1.0.1
>
> While running more than 64 processes on 8 nodes (each with 8 cores,
> 64-bit, RHEL-2.6.9-42.ELsmp), I have observed some changes in certain
> attributes.
>
> 1)
> Value of max_rdma_write_iov changes from 0 to 42.
> Value of max_rdma_read_iov also changes from 0 to a non-zero value.
> I want to know why there is such a dramatic change in these values. How
> should we proceed if we want to run more than 64 processes successfully?
>
> 2)
> The value of the max_message_size attribute in our stack is 4294967296
> (i.e. 4 GB), as returned by dat_ia_query(). So we expect MVAPICH to set
> the same value for max_message_size when filling in DAT_EP_ATTR during
> EP creation. It does so when we run up to 64 processes. But when the
> number of processes exceeds 64, MVAPICH sets this value to 1024 (i.e.
> 1 KB). This is again a drastic change. What is more surprising is that
> it posts receives for sizes larger than 1 KB. MVAPICH, it seems, is on
> the one hand limiting the maximum message size and on the other hand
> posting larger data sizes.
>
> I am fairly sure that the changes in these values have nothing to do
> with the number of nodes (or, essentially, oversubscription); correct
> me if I'm wrong. These changes are due only to the increase in the
> number of processes. One more thing I want to confirm: this has nothing
> to do with cluster size (small, medium, or large), since the process
> limit for a small cluster is 128.
>
> Regards,
> Jasjit Singh
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>