[mvapich-discuss] MVAPICH1 Latency Tuning

Hideyuki Jitsumoto jitumoto at gsic.titech.ac.jp
Mon Mar 15 06:37:42 EDT 2010


Hi Sayantan,

First, I have to apologize: we use Voltaire MPI, which is extended from
MVICH, not MVAPICH1. I was under the impression that MVAPICH1 and MVICH
use the same technique for InfiniBand management.
If that is a misunderstanding, I am very sorry about it.

Next, I have not contacted the application user yet,
but I was able to interview some of his co-workers.

1) Do you know how many concurrent outstanding sends your application
has during its iteration steps?
The answer is 1.
The application alternates GPU kernel functions and MPI communication as follows:

GPU_kernel_func1();
MPI_Isend(); MPI_Irecv();
MPI_Wait();

GPU_kernel_func2();
MPI_Isend(); MPI_Irecv();
MPI_Wait();

.....
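
For what it is worth, written out with the actual MPI nonblocking calls, one
iteration of this pattern looks roughly like the following (a minimal sketch
on my side; the kernel, buffer, peer and tag names are placeholders, not the
user's real code):

MPI_Request req[2];

GPU_kernel_func1();                        /* produces the data in sbuf      */
MPI_Irecv(rbuf, count, MPI_DOUBLE, peer, tag, MPI_COMM_WORLD, &req[0]);
MPI_Isend(sbuf, count, MPI_DOUBLE, peer, tag, MPI_COMM_WORLD, &req[1]);
MPI_Waitall(2, req, MPI_STATUSES_IGNORE);  /* no other work overlaps the wait */
GPU_kernel_func2();                        /* depends on the data in rbuf     */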

2) Can you share a code snippet of the inner loop of your application?
We are very sorry, but this application is difficult to share because of its license.
In addition, I heard that, incredibly, the inner loop is about 10,000 lines long!

We profiled the time of each GPU kernel function and of the MPI communication.
As a result, we confirmed that some of the GPU kernel functions are faster than
the MPI communication, and that the MPI message size is below 128 KB.
So we came to consider reducing the latency of the MPI communication.
(That is why I changed the rendezvous threshold so that the rendezvous
protocol is not used.)

Moreover, to our regret, there is a dependency between each MPI
communication and the GPU_kernel_func that follows it, so the two cannot
be overlapped.
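
For reference, in MVAPICH1 the corresponding run-time parameter would be
VIADEV_RENDEZVOUS_THRESHOLD, the message size in bytes above which the
rendezvous protocol is used (this is an assumption on my side; the Voltaire
MPI parameter may be named differently). For example, a paramfile entry such
as

VIADEV_RENDEZVOUS_THRESHOLD=262144

would keep all messages up to 256 KB, including our <128 KB messages, on the
eager path.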

3) Which InfiniBand adapters do you have on your cluster?
Voltaire HCA 410Ex (SDR)

4) What is the node type, number of cores?
The nodes are Sun Fire X4500 servers (http://www.sun.com/servers/x64/x4500/).
Each node has 16 CPU cores, but we use only 2 processes per node for GPU
management, since each node has 2 GPU boards.

On Fri, Mar 12, 2010 at 12:58 AM, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
> Hi Hideyuki,
>
> On Thu, Mar 11, 2010 at 10:25 AM, Hideyuki Jitsumoto
> <jitumoto at gsic.titech.ac.jp> wrote:
>> Sayantan,
>>
>> Thank you for your reply.
>> I'm sorry about the unclear information about the application;
>> it belongs to one of our supercomputer users.
>>
>> So I need to ask the user for the details of this application.
>> I'll probably send you the correct information on Monday.
>>
>
> Looking forward to hearing from you after you get more information
> from the user.
>
>> I will be on a flight 10 hours from now, so I'm sorry that I cannot send
>> any more information before then.
>>
>> # I wonder if I saw you in a taxi on the way to the MPI Forum?
>
> Yes! I thought the name sounded familiar :-) It was good meeting you
> at the forum. Hope to meet you at future forums or on the sidelines of
> ICS in Japan later this year.
>
> Thanks.
>
>>
>> Thank you,
>> Hideyuki
>>
>> On Wed, Mar 10, 2010 at 8:18 PM, Sayantan Sur <surs at cse.ohio-state.edu> wrote:
>>> Hi Hideyuki,
>>>
>>> On Wed, Mar 10, 2010 at 2:34 PM, Hideyuki Jitsumoto
>>> <jitumoto at gsic.titech.ac.jp> wrote:
>>>> Hello,
>>>>
>>>> I want to reduce the latency and memory use with MVAPICH1.
>>>> My MPI application has:
>>>> - < 128KB message size
>>>> - 4 peer connections per process
>>>>
>>>> I made a temporary paramfile (attached to this message).
>>>> Please tell me about any other tuning points or wrong parameters in this file.
>>>
>>> Thanks for your question. It is an important question. We would be
>>> happy to assist you with tuning MVAPICH for your application.
>>>
>>> I took a quick look at the param file. Based on what I see, I think
>>> some parameters relating to the rendezvous protocol are not optimal at
>>> all. We need to understand your application a little bit more to be
>>> able to help you better. So, a few questions:
>>>
>>> 1) Do you know how many concurrent outstanding sends your application
>>> has during its iteration steps?
>>> 2) Can you share a code snippet of the inner loop of your application?
>>> 3) Which InfiniBand adapters do you have on your cluster?
>>> 4) What is the node type, number of cores?
>>>
>>> This will help us send you a better set of parameters in the next few days.
>>>
>>>>
>>>> Thank you,
>>>> Hideyuki
>>>> --
>>>> Sincerely Yours,
>>>> Hideyuki Jitsumoto (jitumoto at gsic.titech.ac.jp)
>>>> Tokyo Institute of Technology
>>>> Global Scientific Information and Computing center (Matsuoka Lab.)
>>>>
>>>> _______________________________________________
>>>> mvapich-discuss mailing list
>>>> mvapich-discuss at cse.ohio-state.edu
>>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> Sayantan Sur
>>>
>>> Research Scientist
>>> Department of Computer Science
>>> The Ohio State University.
>>>
>>
>>
>>
>> --
>> Sincerely Yours,
>> Hideyuki Jitsumoto (jitumoto at gsic.titech.ac.jp)
>> Tokyo Institute of Technology
>> Global Scientific Information and Computing center (Matsuoka Lab.)
>>
>>
>
>
>
> --
> Sayantan Sur
>
> Research Scientist
> Department of Computer Science
> The Ohio State University.
>



-- 
Sincerely Yours,
Hideyuki Jitsumoto (jitumoto at gsic.titech.ac.jp)
Tokyo Institute of Technology
Global Scientific Information and Computing center (Matsuoka Lab.)

