[mvapich-discuss] Node crashes when all memory is used

Christopher Rowley crowl055 at uottawa.ca
Sun Jun 18 22:21:07 EDT 2006


Hi,
 
I'm running a cluster of Opterons with Fedora Core 5. We have Topspin
HCAs and Topspin 120 switches. We're using MVAPICH.gen2 to run a
computational chemistry program called VASP. The memory requirements
are extremely high (60 GB), and occasionally exceed what is available on
the nodes we're running on. When this happens, the program is killed, but
in the process, the first node on the list of hosts will crash (it
remains pingable, but with no connectivity or keyboard response). We
don't see this behavior with vanilla MPICH 1.2.7. Is there a known issue
with exceeding the total available memory with MVAPICH?
 
Thanks,
Christopher Rowley
Department of Chemistry
University of Ottawa

