[mvapich-discuss] Poor scale-up from 16 to 64 processors!!

Divi Venkateswarlu divi at ncat.edu
Fri Jun 6 19:03:53 EDT 2008


      Hello all:

      I have just built a 64-core cluster and the following is my setup.

       8 DP quad-core machines running ROCKS-5 and MVAPICH-1.0
        I am using an 8-port Flextronics switch (an SDR switch), and the cards
        are Mellanox MHES18 HCAs (10 Gb/s SDR).

       I have the following questions.

        How do I know whether my computation is using the IB network or the
        Ethernet network? (A rough way to check is sketched after the table below.)
        I named each IB card "fast1" ... "fast8".
        I created a host file with 8 copies of each of fast1, fast2, ..., fast8.
       
        The IB configuration of the 8 nodes is given below (only two shown):

        HOST          SUBNET   IFACE   IP         NETMASK         NAME
        divilab:      ibnet    ib0     20.1.1.1   255.255.255.0   fast1
        ...
        compute-0-6:  ibnet    ib0     20.1.1.8   255.255.255.0   fast8
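
        A rough way to check, assuming the OSU micro-benchmarks were built
        along with MVAPICH (the install prefix and benchmark path below are
        guesses; adjust them to the actual install). With a gen2/OFED build
        the MPI traffic should go over IB verbs regardless of the names in
        the host file, but measuring latency and bandwidth between two nodes
        settles it either way:

          # Install prefix is a guess -- point it at your MVAPICH install.
          MVAPICH_HOME=/usr/local/mvapich

          # 1) Confirm the HCA port is active and running at SDR
          #    (4X lanes, 2.5 Gbps per lane):
          ibv_devinfo | grep -E 'state|active_width|active_speed'

          # 2) Ping-pong between two *different* nodes with the OSU tests
          #    (add -ssh if the cluster launches over ssh rather than rsh):
          $MVAPICH_HOME/bin/mpirun_rsh -np 2 fast1 fast2 \
              $MVAPICH_HOME/osu_benchmarks/osu_latency
          $MVAPICH_HOME/bin/mpirun_rsh -np 2 fast1 fast2 \
              $MVAPICH_HOME/osu_benchmarks/osu_bw

          # Roughly: a few microseconds and ~900 MB/s point to SDR IB;
          # tens of microseconds and ~100 MB/s point to Gigabit Ethernet.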

       I do not see any scale-up from 16 to 32 to 64 processes.

      One benchmark, an MD simulation (one picosecond) of a protein (FIXa), is
      given below.

      The MD code is PMEMD built against MVAPICH and compiled with IFORT/MKL.

       # of CPUs/cores   Time (sec)         Nodes (load-balanced)
                 8                 82                      8
               16                 49                      8
               32                 42                      8
               64                 39                      8
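
       To put numbers on it, here is a small bash/awk snippet that turns the
       timings above into speedup and parallel efficiency relative to the
       8-core run (values taken straight from the table):

          printf '8 82\n16 49\n32 42\n64 39\n' | awk \
              '{ s = 82/$2; printf "%2d cores: speedup %.2fx, efficiency %3.0f%%\n", $1, s, 100*s/($1/8) }'
          #  8 cores: speedup 1.00x, efficiency 100%
          # 16 cores: speedup 1.67x, efficiency  84%
          # 32 cores: speedup 1.95x, efficiency  49%
          # 64 cores: speedup 2.10x, efficiency  26%

       In other words, going from 8 to 64 cores buys only about a 2.1x
       speedup, roughly a quarter of the ideal 8x.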

       I suspect that I have not set something up correctly, or that I am
       hitting SDR switch/card limitations. I am definitely not happy with
       this poor scale-up.

       I used all the default values in make.mvapich.gen2 (with Intel Fortran
       9.0). There seem to be too many options in this script, and since I am
       not sure what most of them do, I just let it run as-is.
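
       One quick sanity check on the resulting build (the install path below
       is a guess; "-show" is the standard MPICH-style wrapper option that
       MVAPICH inherits) is to ask the compiler wrappers what they actually
       invoke:

          # Print the underlying compiler and link line without compiling:
          /usr/local/mvapich/bin/mpif90 -show
          /usr/local/mvapich/bin/mpicc  -show
          # Look for ifort/icc and for the verbs library (-libverbs) on the
          # link line -- that confirms the gen2 (InfiniBand) device is built in.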

       Could somebody offer some help on how to fix/improve the scaling?

       Thanks a lot...
       Divi

----- Original Message ----- 
From: "Joshua Bernstein" <jbernstein at penguincomputing.com>
To: "Karthik Gopalakrishnan" <gopalakk at cse.ohio-state.edu>
Cc: "Divi Venkateswarlu" <divi at ncat.edu>; 
<mvapich-discuss at cse.ohio-state.edu>
Sent: Friday, June 06, 2008 4:59 PM
Subject: Re: [mvapich-discuss] how to set "ulimit -l unlimited" at user 
level?


> Also,
>
> If you are running AMBER using SSH, you will want to add the 
> "ulimit -l" command to your /etc/init.d/sshd startup script on the nodes. 
> That way any process forked by SSH on the compute node will inherit that 
> setting and hence allow AMBER to run.
>
> -Joshua Bernstein
> Software Engineer
> Penguin Computing
>
> Karthik Gopalakrishnan wrote:
>> Try adding the "ulimit -l unlimited" line to /etc/profile first.
>>
>> Regards,
>> Karthik
>>
>> On Tue, Jun 3, 2008 at 1:36 PM, Divi Venkateswarlu <divi at ncat.edu> wrote:
>>>     Hello:
>>>
>>>     I am running ROCKS-5 on two DP quad-core machines with Mellanox
>>> IB HCA cards.
>>>     I compiled MVAPICH with ifort without any problems.
>>>
>>>     I am able to run at root level with NO problems. I could set
>>> "ulimit -l unlimited" to increase the RLIMIT_MEMLOCK size. My program
>>> (PMEMD of the AMBER package) runs on all 16 cores with no hiccups.
>>>
>>>     When I try to set "ulimit -l unlimited" at user level, I get the
>>> following error message:
>>>
>>>       -bash: ulimit: max locked memory: cannot modify limit: Operation
>>> not permitted
>>>
>>>    Can somebody help me fix this problem? I am running mvapich-1.0.
>>>
>>>    Thanks a lot for your help.
>>>
>>>     Divi
> 
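
P.S. On the memlock question in the quoted thread above: besides /etc/profile
and the sshd init script, the per-user locked-memory limit can also be raised
in /etc/security/limits.conf. A rough sketch, assuming sshd on the ROCKS/CentOS
nodes goes through PAM with the pam_limits module enabled (the usual default),
so that new ssh sessions pick the limit up:

   # On each compute node, raise the locked-memory limit for all users:
   echo '* soft memlock unlimited' >> /etc/security/limits.conf
   echo '* hard memlock unlimited' >> /etc/security/limits.conf

   # Restart sshd so freshly launched sessions see the new limit:
   service sshd restart

   # Verify from a new ssh session as an ordinary user:
   ssh fast2 'ulimit -l'    # should print "unlimited"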