[mvapich-discuss] EP attributes' values

Lei Chai chai.15 at osu.edu
Mon Jul 7 14:45:01 EDT 2008


Hi Jasjit,

Thanks for using mvapich2. I believe you are using the udapl interface.

When the number of processes is larger than 64, the on-demand connection 
establishment model is used for better scalability, and thus the 
attribute values are different. If this is a problem on your stack, 
could you try disabling on-demand connections by setting the threshold 
to a value larger than the number of processes, e.g.

$ mpiexec -n 64 -env MV2_ON_DEMAND_THRESHOLD 1024 ./a.out
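
For scripted runs, the threshold can also be derived from the job size rather than hard-coded. This is a minimal sketch of the same command; the NPROCS variable and the 16x multiplier are illustrative, not part of mvapich2, and any value strictly greater than the process count works:

```shell
# Derive a threshold guaranteed to exceed the job size, so on-demand
# connection establishment stays disabled for this run.
NPROCS=64
THRESHOLD=$((NPROCS * 16))   # 1024, comfortably above NPROCS
echo "mpiexec -n $NPROCS -env MV2_ON_DEMAND_THRESHOLD $THRESHOLD ./a.out"
```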

FYI, since the udapl interface in mvapich2 does not yet support blocking 
progress mode, over-subscription will not be beneficial. If you are 
using InfiniBand as the network, we recommend the OFED interface in 
mvapich2, which provides the best performance, scalability, and features 
such as blocking mode for over-subscription. The latest release is 
mvapich2-1.2rc1.

Lei


jasjit singh wrote:
>  Hi
>
> I am using mvapich2-1.0.1
>
> While running  more than 64 processes on 8 nodes (each with 8 cores, 
> 64-bit, RHEL-2.6.9-42.ELsmp), I have observed some changes in certain 
> attributes.
>
> 1)
> The value of max_rdma_write_iov changes from 0 to 42, and the value of 
> max_rdma_read_iov also changes from 0 to a non-zero value. I want to 
> know why there is such a dramatic change in these values. How should 
> we proceed if we want to run more than 64 processes successfully?
>
> 2)
> The max_message_size attribute returned by dat_ia_query() on our stack 
> is 4294967296 (i.e. 4 GB), so we expect MVAPICH to set the same value 
> for max_message_size in DAT_EP_ATTR during EP creation. It does so for 
> runs of up to 64 processes, but when the number of processes exceeds 
> 64, MVAPICH sets this value to 1024 (i.e. 1 KB). This is again a 
> drastic change. What is more surprising is that it still posts 
> receives for sizes larger than 1 KB: MVAPICH seems to be limiting the 
> maximum message size on one hand while posting larger data sizes on 
> the other.
>
> I am sure that the changes in these values have nothing to do with the 
> number of nodes (or oversubscription, I essentially mean); correct me 
> if I'm wrong. These changes are due only to the increase in the number 
> of processes. One more thing I want to confirm: this has nothing to do 
> with cluster type (small, medium, or large), since the limit on the 
> number of processes for a small cluster is 128.
>
> Regards,
> Jasjit Singh
>
>
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>   
