[mvapich-discuss] optimizing MPI communications on Ranger

wei huang huanwei at cse.ohio-state.edu
Thu Mar 13 22:42:00 EDT 2008


Hi, Marco,

The communication patterns you described are very common situations that we
handle, so I would say they are equally well optimized in both mvapich
and mvapich2. I would suggest using MPI_Isend, though.
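To illustrate the MPI_Isend suggestion, here is a minimal sketch of overlapping a non-blocking small-message exchange with computation. The function and buffer names are illustrative assumptions, not taken from the original post; the pattern (post receives, post sends, compute, then wait) is the standard one.

```c
/* Sketch: overlapping a non-blocking halo exchange with computation.
   Assumes neighbor ranks `left`/`right` and the buffers are set up
   elsewhere; all names here are hypothetical. */
#include <mpi.h>

void halo_exchange_overlap(double *send_l, double *send_r,
                           double *recv_l, double *recv_r, int n,
                           int left, int right, MPI_Comm comm)
{
    MPI_Request reqs[4];

    /* Post receives first so incoming data has a landing spot. */
    MPI_Irecv(recv_l, n, MPI_DOUBLE, left,  0, comm, &reqs[0]);
    MPI_Irecv(recv_r, n, MPI_DOUBLE, right, 1, comm, &reqs[1]);

    /* MPI_Isend returns immediately; small messages are typically
       sent eagerly, so they can progress while we compute. */
    MPI_Isend(send_r, n, MPI_DOUBLE, right, 0, comm, &reqs[2]);
    MPI_Isend(send_l, n, MPI_DOUBLE, left,  1, comm, &reqs[3]);

    /* ... interior computation that does not touch the halos ... */

    /* Complete the exchange before using the received halo data. */
    MPI_Waitall(4, reqs, MPI_STATUSES_IGNORE);
}
```

Posting the receives before the sends avoids unexpected-message handling on the receiver, which matters most for exactly the small, latency-sensitive messages in your first communication core.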

If you want to play with some of the adjustable internal parameters we
use, you can start with the number of communication buffers
("VIADEV_VBUF_POOL_SIZE" in mvapich and "MV2_VBUF_POOL_SIZE" in mvapich2),
as well as the length of the shared receive queue in mvapich2
("MV2_SRQ_SIZE"). They are all environment variables. Specific to Ranger,
you can simply export them in your job script (e.g., add a line like
"export MV2_SRQ_SIZE=5000").
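For example, a job script on Ranger might set both variables like this. The values shown are illustrative starting points to experiment with, not tuned recommendations, and the application binary name is hypothetical:

```shell
# Illustrative MVAPICH2 tuning knobs in a Ranger job script.
# Values are starting points for experimentation, not recommendations.
export MV2_VBUF_POOL_SIZE=2048   # number of communication (vbuf) buffers
export MV2_SRQ_SIZE=5000         # length of the shared receive queue
ibrun ./my_mpi_app               # hypothetical application binary
```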

For more details on those variables, you can refer to our userguides:

http://mvapich.cse.ohio-state.edu/support/mvapich_user_guide.pdf
http://mvapich.cse.ohio-state.edu/support/mvapich2_user_guide.pdf

There are sections on MVAPICH parameters and MVAPICH2 parameters.

Thanks.

Regards,
Wei Huang

774 Dreese Lab, 2015 Neil Ave,
Dept. of Computer Science and Engineering
Ohio State University
OH 43210
Tel: (614)292-8501


On Thu, 13 Mar 2008, md.mazzeo at libero.it wrote:

> Hello
> I would like to optimize the communications of my MPI application on the Ranger platform, which has mvapich2 installed.
>
> I have read the manual on mvapich2, but I'm not expert enough in the field to easily extract the most useful and robust information, so I thought I would ask you.
>
> My application has 2 main communication cores:
>
> FIRST COMMUNICATION CORE (for each iteration):
>
> each cpu/core exchanges a small message (0.1-10 KB; small messages
> arise when each subdomain is small, i.e., system_size/#cores is low).
> The point-to-point communications are non-blocking, so it is very
> important to OVERLAP NON-BLOCKING COMMUNICATION OF SMALL MESSAGES WITH
> COMPUTATION, possibly at the expense of some latency.
>
>
> SECOND COMMUNICATION CORE (for each iteration):
>
> the communication pattern is a binary tree that involves all the cores, in which the message exchanged between communicating cores grows at each stage.
> For example, for 8 cores, at the first stage the communicating pairs are
> 1-2, 3-4, 5-6, 7-8 (small messages),
> at the second stage
> 1-3, 5-7 (medium-size messages),
> and at the last stage
> 1-5 (large message, typically ~10 MB).
>
> The important thing is to
> OPTIMISE THE BLOCKING COMMUNICATION OF MEDIUM-LARGE MESSAGES
>
> -----
> So I was wondering which parameters and directives I should use to optimize the two communication patterns. The parameter space is very large, and it is very difficult to find a good combination.
>
> First of all,
> MPI_Isend or MPI_Issend ?
> MVAPICH vs MVAPICH2 ?
>
> I would also be very grateful for an outline of which MVAPICH(2) parameters could optimize my application without having to run a large number of benchmarks.
>
> Kind Regards
> marco
>
>
>
