[mvapich-discuss] Large jitter in message transfer

Sashi Balasingam sashibala2 at yahoo.com
Fri Apr 15 12:46:22 EDT 2011


Hi all,
I recently started work on an MPI-based, 'real-time', pipelined-processing application, and the application fails due to large time jitter in sending and receiving messages.
Here is the related info:
 
1) Platform:
a) Intel Box: Two hex-core Intel Xeon CPUs, 2.668 GHz (total of 12 cores/box)
b) OS: SUSE Linux Enterprise Server 11 (x86_64) - Kernel \r (\l)
c) MPI Rev:  mvapich2-1.6rc2
d) HCA: InfiniBand: Mellanox Technologies MT26428 [ConnectX IB QDR, PCIe 2.0 5GT/s] (rev a0)
 
2) Application detail:
 
a) My app uses two of these boxes, connected over IB. I launch 11 processes for a "Pipelined" Image processing application; Box-1 runs 7 processes and Box-2 runs 4.
 
b) Each process waits for a message (sizes vary between 1 KByte and 26 KBytes),
then processes the data and outputs a message (sizes vary between 1 KByte and 26 KBytes) to the next process.
There is an input pkt every 80 micro-sec into this pipeline, and a job typically entails about 500,000 pkts.
 
c) MPI transport functions used: MPI_Isend, MPI_Irecv, MPI_Test.
   i)  For receiving messages, I first make an MPI_Irecv call, followed by a busy-loop on MPI_Test, waiting for the message.
   ii) For sending messages, there is a busy-loop on MPI_Test to ensure the prior buffer was sent, followed by MPI_Isend.
 
d) When the job starts, all 11 processes are put in high-priority mode (SCHED_FIFO policy, with a priority setting of 99).
The job entails an input data packet stream (and a series of MPI messages), arriving continually at a 40 micro-sec rate, for a few minutes.
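
A minimal sketch of the priority setup described in d), assuming the standard Linux sched_setscheduler() call. The helper name set_fifo_priority is hypothetical and is not taken from the application itself; the call normally needs root or CAP_SYS_NICE to succeed.

```c
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper (not from the original application): put the calling
 * process under the SCHED_FIFO policy at the given priority.
 * Returns 0 on success, or the errno value on failure (typically EPERM
 * when run without root or CAP_SYS_NICE, EINVAL for a bad priority). */
int set_fifo_priority(int priority)
{
    struct sched_param sp;
    memset(&sp, 0, sizeof sp);
    sp.sched_priority = priority;

    /* A pid of 0 means "the calling process". */
    if (sched_setscheduler(0, SCHED_FIFO, &sp) != 0)
        return errno;
    return 0;
}
```

Each of the 11 processes would call set_fifo_priority(99) once at start-up and check the return value, since an unprivileged process silently stays at its normal policy if the error is ignored.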
 
3) The Problem:
a) Once the job starts, about 10% of the MPI data transfers between the two boxes show large jitter, ranging from 8 to 30 millisec.
This causes some of my internal application input queues to fill up, which leads to a failure.
 
b) I have used a few basic tools to look at CPU usage while the job is running; nothing significant is running other than my app.
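
To put the queue build-up in 3a) in numbers, here is a small arithmetic sketch. The helper name packets_during_stall is hypothetical. With the 80 micro-sec input period from 2b), an 8 millisec stall corresponds to 100 packets arriving and a 30 millisec stall to 375; with the 40 micro-sec figure from 2d), these become 200 and 750.

```c
#include <assert.h>

/* Hypothetical helper: number of input packets that arrive while one
 * transfer is stalled. Both arguments are in micro-seconds. */
long packets_during_stall(long stall_us, long period_us)
{
    assert(period_us > 0);
    return stall_us / period_us;
}
```

An input queue therefore has to absorb a backlog of that many packets per stall, on top of whatever it already holds.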
 
- Any suggestions on improved MPI config settings or OS config/issues will be much appreciated.
 
- Is there any performance monitoring tool that can monitor CPU activity for a few seconds, at 1 milli-sec resolution?
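
Short of a dedicated tool, one crude way to look for gaps at this resolution is a spin loop that reads the clock back to back and records the largest gap between two reads. This is only a sketch under stated assumptions: the helper names are hypothetical, it uses the POSIX clock_gettime() call with CLOCK_MONOTONIC, and it only sees stalls of the calling process on the core it happens to run on, not overall CPU activity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Current CLOCK_MONOTONIC time in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Hypothetical helper: spin for duration_ms, reading the clock back to back,
 * and return the largest gap (in ns) between two consecutive reads.
 * If count_over is non-NULL, it receives the number of gaps that were
 * longer than threshold_ns. */
int64_t max_clock_gap_ns(int duration_ms, int64_t threshold_ns, long *count_over)
{
    assert(duration_ms > 0);
    int64_t start = now_ns();
    int64_t end = start + (int64_t)duration_ms * 1000000LL;
    int64_t prev = start;
    int64_t max_gap = 0;
    long over = 0;

    while (prev < end) {
        int64_t t = now_ns();
        int64_t gap = t - prev;
        if (gap > max_gap)
            max_gap = gap;
        if (gap > threshold_ns)
            over++;
        prev = t;
    }
    if (count_over)
        *count_over = over;
    return max_gap;
}
```

For example, max_clock_gap_ns(5000, 1000000, &n) spins for 5 seconds and counts, in n, the gaps longer than 1 milli-sec. A large gap means the loop was not running for that long, whatever the cause.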
 
Thanks in advance.
SashiBala