[mvapich-discuss] Poor scale-up from 16 to 64 processors!!
Divi Venkateswarlu
divi at ncat.edu
Fri Jun 6 19:03:53 EDT 2008
Hello all:
I have just built a 64-core cluster; the following is my setup:
8 DP quad-core machines running ROCKS-5 and MVAPICH-1.0.
I am using an 8-port Flextronics switch (an SDR switch), and the cards
are MHES18 HCAs (10 Gb/sec).
I have the following questions.
How do I know if my computation is using the IB network or the Ethernet network?
I named the IB interface on each node "fast1" ... "fast8".
I created a host file with eight copies of each of fast1, fast2, ..., fast8.
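For reference, a machinefile of that shape (hypothetical, using the hostnames described above) would simply repeat each IPoIB hostname once per core; the first node's eight lines would be:

```
fast1
fast1
fast1
fast1
fast1
fast1
fast1
fast1
```

followed by the same eight-line blocks for fast2 through fast8, for 64 lines in total.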
The IB configuration of the 8 nodes is below (only two shown):
HOST SUBNET IFACE IP NETMASK NAME
divilab: ibnet ib0 20.1.1.1 255.255.255.0 fast1
...
compute-0-6: ibnet ib0 20.1.1.8 255.255.255.0 fast8
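One way to check whether MPI traffic is actually going over the HCA rather than the Ethernet interface is to watch the IB port counters around a short run. A rough sketch (the device name mthca0 is an assumption for InfiniHost-family cards; check "ls /sys/class/infiniband" for the actual name on your nodes):

```shell
# Watch the HCA's transmit counter around a short MPI job.
# NOTE: "mthca0" and port 1 are assumptions; adjust for your system.
PORTDIR=/sys/class/infiniband/mthca0/ports/1/counters

if [ -d "$PORTDIR" ]; then
    before=$(cat "$PORTDIR/port_xmit_data")
    # ... run a short MPI job between two nodes here ...
    after=$(cat "$PORTDIR/port_xmit_data")
    # port_xmit_data counts 4-byte words transmitted on the port;
    # a large increase means the job really used the IB fabric.
    echo "IB words transmitted during the job: $((after - before))"
else
    echo "No counters at $PORTDIR; the device name may differ"
fi
```

If the counter barely moves while the job runs, the traffic is going over Ethernet instead.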
I do not see any scale-up from 16 to 32 to 64 processes.
A benchmark MD simulation (one picosecond) of a protein (FIXa) is
given below.
The MD code is PMEMD/MVAPICH with IFORT/MKL compilation.
# of CPUs/cores    Time (sec)    Nodes (load-balanced)
 8                 82            8
16                 49            8
32                 42            8
64                 39            8
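For reference, perfect scaling would halve the time at each doubling of cores; the quoted timings work out to roughly 49% parallel efficiency at 32 cores and 26% at 64. A quick sketch to compute this from the table above, with the 8-core run as the baseline:

```shell
# Speedup and parallel efficiency relative to the 8-core run,
# using the timings from the quoted table.
awk 'BEGIN {
    t[8] = 82; t[16] = 49; t[32] = 42; t[64] = 39
    for (n = 8; n <= 64; n *= 2) {
        s = t[8] / t[n]          # speedup vs. 8 cores
        e = s / (n / 8)          # efficiency vs. ideal scaling
        printf "%2d cores: speedup %.2fx, efficiency %.0f%%\n", n, s, e * 100
    }
}'
# last line printed: 64 cores: speedup 2.10x, efficiency 26%
```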
I suspect that I have not set something up right, or that I am hitting SDR
switch/card limitations... I am definitely not happy with the poor scale-up.
I used all the default values of make.mvapich.gen2 (with Intel Fortran 9.0).
There seem to be too many options in this script; since I am not sure what most
of them do, I just let the script run as is.
Could somebody offer some help on how to fix/improve the scaling?
Thanks a lot...
Divi
----- Original Message -----
From: "Joshua Bernstein" <jbernstein at penguincomputing.com>
To: "Karthik Gopalakrishnan" <gopalakk at cse.ohio-state.edu>
Cc: "Divi Venkateswarlu" <divi at ncat.edu>;
<mvapich-discuss at cse.ohio-state.edu>
Sent: Friday, June 06, 2008 4:59 PM
Subject: Re: [mvapich-discuss] how to set "ulimit -l unlimited" at user
level?
> Also,
>
> If you are running AMBER using SSH, you will want to add the
> ulimit -l command to your /etc/init.d/sshd startup script on the nodes.
> That way any process forked by SSH on the compute node will inherit that
> setting and hence allow AMBER to run.
>
> -Joshua Bernstein
> Software Engineer
> Penguin Computing
>
> Karthik Gopalakrishnan wrote:
>> Try adding the "ulimit -l unlimited" line to /etc/profile first.
>>
>> Regards,
>> Karthik
>>
>> On Tue, Jun 3, 2008 at 1:36 PM, Divi Venkateswarlu <divi at ncat.edu> wrote:
>>> Hello:
>>>
>>> I am running ROCKS-5 on two DP quad-core machines with Mellanox IB HCA
>>> cards.
>>> I compiled mvapich with ifort without any problems.
>>>
>>> I am able to run at the root level with NO problems. I could set
>>> ulimit -l unlimited to increase the RLIMIT_MEMLOCK size. My program
>>> (PMEMD of the AMBER package) runs on all 16 cores with no hiccups.
>>>
>>> When I try to set ulimit -l unlimited at the user level, I get the
>>> following error message:
>>>
>>> -bash: ulimit: max locked memory: cannot modify limit: Operation not permitted
>>>
>>> Can somebody help me fix this problem? I am running mvapich-1.0.
>>>
>>> Thanks a lot for your help
>>>
>>> Divi
>>>
>>>
>>> _______________________________________________
>>> mvapich-discuss mailing list
>>> mvapich-discuss at cse.ohio-state.edu
>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>
>>>
>
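As for the ulimit question in the quoted thread: on most Linux systems the per-user locked-memory limit is made persistent through PAM's limits file rather than a shell ulimit (a sketch, assuming pam_limits is enabled for sshd and login; the wildcard entries below are illustrative):

```
# /etc/security/limits.conf -- raise RLIMIT_MEMLOCK for all users so
# MVAPICH can register memory with the HCA without "Operation not
# permitted" errors at the user level
*    soft    memlock    unlimited
*    hard    memlock    unlimited
```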