[mvapich-discuss] MTU size on homogenous mlx4 QDR systems

Bernd Kallies kallies at zib.de
Fri Dec 11 14:18:16 EST 2009


We use "old" MT25208 InfiniHost III and MT26418 ConnectX adapters (4x DDR
per port, 2 ports per host) with 64k MTUs, as shipped by our vendor.

We saw problems with this on machines that have global file systems
(NFS, Lustre) attached via IB and that carry a workload typical of
desktop machines (many users with many different usage profiles at the
same time: compile, zip, tar, scp, ...). Memory may then become
fragmented during file I/O, which brings kswapd into play. On Linux
kernels with a leaking kswapd (seen e.g. with Novell's
2.6.16.60-0.21-smp for x86_64), only reducing the MTU to 16k or less
got rid of the problem.

On compute nodes dedicated to the usual HPC batch jobs, 64k MTUs work
fine for both MPI and I/O.
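For reference, the MTU settings discussed above can be inspected and changed roughly as follows. This is a sketch, not our exact procedure; it assumes the IPoIB interface is named ib0 and uses standard iproute2 and libibverbs utilities. The 64k figure corresponds to IPoIB connected mode (65520-byte maximum), while the 2K/4K MTU is the separate IB link MTU reported by the HCA:

```shell
# Show the current IPoIB MTU (65520 indicates connected mode's maximum)
ip link show ib0

# IPoIB mode: "connected" allows large MTUs, "datagram" is capped
# at the IB link MTU
cat /sys/class/net/ib0/mode

# Reduce the IPoIB MTU, e.g. to 16k as described above
ip link set ib0 mtu 16384

# The IB link MTU (2K vs 4K) is reported per port by the HCA
ibv_devinfo | grep -E 'active_mtu|max_mtu'
```

Note that the IPoIB MTU only affects IP traffic (NFS, Lustre-over-IPoIB, scp); native verbs traffic such as MVAPICH MPI negotiates the IB link MTU independently.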

Sincerely BK

On Fri, 2009-12-11 at 18:19 +0100, Christian Guggenberger wrote:
> Dear all,
> 
> this is probably a bit off-topic for this list. However, mvapich developers
> and users could hopefully give some answers on whether enabling 4K MTUs
> on recent mlx4 adapters is a good idea or not.
> 
> cheers.
>  - Christian
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss at cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
-- 
Dr. Bernd Kallies
Konrad-Zuse-Zentrum für Informationstechnik Berlin
Takustr. 7
14195 Berlin
Tel: +49-30-84185-270
Fax: +49-30-84185-311
e-mail: kallies at zib.de



