[mvapich-discuss] Running code hangs using MVAPICH
Soon-Heum Ko
floydfan at innogrid.com
Wed Oct 22 00:54:05 EDT 2008
Hi,
I'm working at KISTI Supercomputing Center, Korea, as a member of SUN support team.
I am having some trouble using MVAPICH, so I'd like to ask you about the options for installing it.
Recently, we built a supercomputer that consists of Sun Blade 6048 nodes (four quad-core Barcelona CPUs per node) connected by a Voltaire ISR 2012 InfiniBand switch. The admins installed MVAPICH on this system, and I have been testing the operation of the MVAPICH library.
While running various codes, I found that some of them do not work with MVAPICH even though they cause no trouble with other MPI libraries. In particular, these codes run fine with MVAPICH when I use several processors on the same node, but they fail as soon as two or more nodes cooperate. Specifically, the code hangs at the first MPI communication call. (For example, with 16 processors on one node everything works; with 16 processors across 2 nodes, it hangs in the MPI communication routine.)
The mysterious thing is that, in some simple codes, MPI communication routines cause no trouble in inter-node communication. On the other hand, some complex codes that consume a large amount of memory show communication trouble even when they transfer only 4 bytes of data (one integer).
My guess is that this happens for one of the following reasons:
- An MVAPICH bug that has been fixed in the latest version (note that we currently use MVAPICH 1.0.1).
- An MVAPICH bug that has not yet been reported or fixed.
- Our installation options, which are as follows:
./configure --with-device=ch_gen2 --with-arch=LINUX --prefix=/usr/local/mvapich \
--enable-shared --enable-static --enable-debug --enable-sharedlib \
--enable-cxx --enable-f77 --enable-f90 --enable-f90modules \
--with-romio --without-mpe
- Bugs in users' codes: which, of course, cannot happen.
Do you have any ideas or comments? If you know the reason, please let me know.
Thank you in advance.
Best regards,
Jeff
Soon-Heum Ko,
Ph.D, Computational Fluid Dynamics
Parallel Optimization Analyst,
SUN Support Team (InnoGrid) at KISTI