From vithaln at serverengines.com Sun Jul 2 23:08:34 2006
From: vithaln at serverengines.com (vithaln)
Date: Mon Jul 3 19:07:14 2006
Subject: [mvapich-discuss] MVAPICH2 over uDAPL...
In-Reply-To: <200606301642.k5UGgFo2004104@xi.cse.ohio-state.edu>
Message-ID: <20060703030649.DD5CC5E0C0@barracuda2.cse.ohio-state.edu>
Thanks a lot Panda.
Will check this doc and come back
Rgrds
Vithal
-----Original Message-----
From: Dhabaleswar Panda [mailto:panda@cse.ohio-state.edu]
Sent: Friday, June 30, 2006 10:12 PM
To: vithaln
Cc: mvapich-discuss@cse.ohio-state.edu
Subject: Re: [mvapich-discuss] MVAPICH2 over uDAPL...
Hi,
Thanks for your note.
> I could not find any information in user guide regarding running mvapich2
> on uDAPL. The guide I am using is mpich2-doc-user.pdf downloaded from
> sourceforge.net.
>
> However, I could find the udapl script in MVAPICH2 directory. Could
> someone point me to the right doc which details how to run mpi
> programs/tests on uDAPL?
It looks like you are looking for the MVAPICH2 user guide in the wrong
place. The above-mentioned user guide is for MPICH2, not MVAPICH2.
The MVAPICH2 user guide is accessible from the MVAPICH2 web page (either
from the FAQ link or from the Download link). The URL is:
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/download-mvapich2/mvapich2_user_guide.html
This user guide contains in-depth instructions on how to install and
use MVAPICH2 with uDAPL.
> Also, got few questions
>
> 1. Can we run the MPI programs on standalone linux system ?
Sure.
> 2. My requirement is to run MPI programs on two different machines
> exchanging data across the network. Is there any document that gives
> info on how to go about this, or has anyone done this earlier?
It depends on how you plan to exchange the data and what is the
underlying network and interface for these two different machines. You
will find all available options (networks and interfaces) with MVAPICH2.
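For illustration, the exchange itself needs nothing more than a small MPI
program along the lines of the sketch below (a generic example, not
something taken from the MVAPICH2 distribution); once MVAPICH2 is
installed, the same program can be launched on a single node or across two
nodes by listing both hosts for the process manager.

/* Minimal sketch (generic MPI example, not from the MVAPICH2 distribution):
 * rank 0 sends a short message to rank 1, which prints where it ran. */
#include <mpi.h>
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    int rank, len;
    char buf[64], host[MPI_MAX_PROCESSOR_NAME];
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Get_processor_name(host, &len);

    if (rank == 0) {
        strcpy(buf, "hello from rank 0");
        /* the message travels over whatever interconnect the library uses */
        MPI_Send(buf, (int)strlen(buf) + 1, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        MPI_Recv(buf, (int)sizeof(buf), MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
        printf("rank 1 on %s received: %s\n", host, buf);
    }

    MPI_Finalize();
    return 0;
}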
DK
>
>
> Rgrds
>
> Vithal
>
>
From vithaln at serverengines.com Sun Jul 2 23:46:46 2006
From: vithaln at serverengines.com (vithaln)
Date: Mon Jul 3 19:07:15 2006
Subject: [mvapich-discuss] MVAPICH2 over uDAPL...
In-Reply-To: <200606301642.k5UGgFo2004104@xi.cse.ohio-state.edu>
Message-ID: <20060703034457.A80305E11E@barracuda2.cse.ohio-state.edu>
> 2. My requirement is to run MPI programs on two different machines
> exchanging data across the network. Is there any document that gives
> info on how to go about this, or has anyone done this earlier?
>It depends on how you plan to exchange the data and what is the
>underlying network and interface for these two different machines. You
>will find all available options (networks and interfaces) with MVAPICH2.
I am assuming the following path: MPI2 -> uDAPL -> TCP/IP on a LAN.
I have two linux systems connected via a hub (on my corporate network).
I want to run mpiexec to initiate some kind of data transfer (probably some
file transfers... maybe ftp?!) across the two systems. I want to study the
performance impact of employing uDAPL over TCP (ideally RDMA over TCP) as
the transport interface. Let me know if my understanding and the attempt are
on the wrong foot!
Something like what is given below:
---------------------------
| Application for File XFR|
---------------------------
             |
        -----------
        |  Udapl  |
        -----------
             |
User Space   |
-----------------------------
Kernel Space |
             |
        -----------
        |   TCP   |
        -----------
             |
-----------------------------
NIC H/w      |
          HARDWARE
             |
---------------------------
N/W          |
            LAN
I know that RDMA over TCP brings in a number of issues pertaining to
restricting TCP to meet RDMA requirements. But if uDAPL is successful, I
believe those issues are overcome!
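Something like the timing loop sketched below is the kind of measurement I
have in mind (just a toy ping-pong I would write myself, not any particular
benchmark); it should run the same whichever transport MVAPICH2 is
configured to use underneath.

/* Rough sketch of the measurement (a toy loop, not any particular
 * benchmark): time repeated 1 MB round trips between two ranks and report
 * an approximate bandwidth, independent of the transport underneath. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define MSG_SIZE (1 << 20)    /* 1 MB per message */
#define ITERS    100

int main(int argc, char **argv)
{
    int rank, i;
    char *buf = malloc(MSG_SIZE);
    MPI_Status status;
    double t0, t1;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    t0 = MPI_Wtime();
    for (i = 0; i < ITERS; i++) {
        if (rank == 0) {
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
        } else if (rank == 1) {
            MPI_Recv(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
            MPI_Send(buf, MSG_SIZE, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
        }
    }
    t1 = MPI_Wtime();

    if (rank == 0)
        printf("approx. bandwidth: %.1f MB/s\n",
               2.0 * ITERS * MSG_SIZE / ((t1 - t0) * 1e6));

    free(buf);
    MPI_Finalize();
    return 0;
}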
Thanks and Rgrds
Vithal
-----Original Message-----
From: Dhabaleswar Panda [mailto:panda@cse.ohio-state.edu]
Sent: Friday, June 30, 2006 10:12 PM
To: vithaln
Cc: mvapich-discuss@cse.ohio-state.edu
Subject: Re: [mvapich-discuss] MVAPICH2 over uDAPL...
Hi,
Thanks for your note.
> I could not find any information in user guide regarding running mvapich2
> on uDAPL. The guide I am using is mpich2-doc-user.pdf downloaded from
> sourceforge.net.
>
> However, I could find the udapl script in MVAPICH2 directory. Could
> someone point me to the right doc which details how to run mpi
> programs/tests on uDAPL?
It looks like you are looking for the MVAPICH2 user guide at the wrong
place. The above-mentioned user guide is for MPICH2 not MVAPICH2.
It is accessible from MVAPICH2 web page (either from the FAQ link or
from the Download link). The URL is:
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/download-mvapich2/mvapich2
_user_guide.html
This user guide contains in-depth instructions on how to install and
use MVAPICH2 with uDAPL.
> Also, got few questions
>
> 1. Can we run the MPI programs on standalone linux system ?
Sure.
> 2. My requirement is to run MPI programs on two different machines
> exchanging data across the network. Is there any document that gives
> info on how to go about this, or has anyone done this earlier?
It depends on how you plan to exchange the data and what is the
underlying network and interface for these two different machines. You
will find all available options (networks and interfaces) with MVAPICH2.
DK
>
>
> Rgrds
>
> Vithal
>
>
From amsmith at lanl.gov Mon Jul 3 19:52:37 2006
From: amsmith at lanl.gov (Adam M. Smith)
Date: Mon Jul 3 19:53:05 2006
Subject: [mvapich-discuss] (no subject)
Message-ID: <55867.128.165.242.125.1151970757.squirrel@webmail.lanl.gov>
Hi All,
I'm trying to get ParaView working on a linux cluster. I have confirmed
that, for any session with np>1, none of the procs can get past MPI_Init
(and ParaView seems to be passing the right args). [backtrace at end of
message]
In the MVAPICH User Guide, in the section about applications not passing
MPI_Init, I have covered some of the basic tests. That is...
- I'm using ssh with keys and have tested connecting from each node to
every other without entering a password
- I have confirmed that the hostnames supplied to mpirun_rsh match those
used in /etc/hosts on all machines
- I can run ibv_rc_pingpong and similar tests without any problem at all,
naming the hosts as I do when trying to run ParaView with mpirun_rsh
(whee! cpi is fast)
(I can't find perf_main, one of the tests mentioned for VAPI)
What else can I try? What does this mean?
Importantly
- other programs we built here use mvapich over our network fine (though
I'm not extremely familiar with their configuration)
- the other tools using mvapich seem to use the same shared libraries as
the one I built
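Since the backtrace below stops inside a pthread mutex during MPI_Init, one
more small thing I can try is a toy program that starts a thread before
calling MPI_Init, to see whether threading is what trips it up. This is
only a rough sketch of a hypothetical test, not code from ParaView or
MVAPICH:

/* Rough sketch of a hypothetical test: spawn a pthread, then call MPI_Init,
 * launched with the same mpirun_rsh host list used for ParaView.
 * Build with something like: mpicc -o initthread initthread.c -lpthread */
#include <mpi.h>
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static void *spin(void *arg)
{
    (void)arg;
    while (1)
        sleep(1);      /* keep a second thread alive while MPI_Init runs */
    return NULL;
}

int main(int argc, char **argv)
{
    pthread_t tid;
    int rank;

    pthread_create(&tid, NULL, spin, NULL);

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    printf("rank %d got past MPI_Init\n", rank);

    MPI_Finalize();
    return 0;
}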
(gdb) r
Starting program: /usr/local2/users/pugmire/paraview/bin/paraview
[Thread debugging using libthread_db enabled]
[New Thread 46912528646976 (LWP 10581)]
Program received signal SIGTERM, Terminated. <-- I did this!
[Switching to Thread 46912528646976 (LWP 10581)]
0x00002aaaab6b48eb in __lll_mutex_lock_wait () from
/lib64/tls/libpthread.so.0
(gdb) bt
#0 0x00002aaaab6b48eb in __lll_mutex_lock_wait ()
from /lib64/tls/libpthread.so.0
#1 0x00007ffffff07b08 in ?? ()
#2 0x0000000000001386 in ?? ()
#3 0x00002aaaab6b1877 in pthread_mutex_lock ()
from /lib64/tls/libpthread.so.0
#4 0x00002aaaac9b4d1e in __mthca_reg_mr (pd=0x0, addr=0x1, length=64669200,
hca_va=64669176, access=64673272) at verbs.c:127
#5 0x00002aaaac9b2631 in mthca_free_av (ah=0x3dab060) at ah.c:172
#6 0x00002aaaac9b4c09 in mthca_destroy_ah (ah=0x3dab140) at mthca.h:242
#7 0x0000000001ae97aa in viadev_post_recv ()
#8 0x0000000001aeed39 in init_mpi_connections ()
#9 0x0000000001aee007 in MPID_VIA_Init ()
#10 0x0000000001ae667b in MPID_Init ()
#11 0x0000000001ad2216 in MPIR_Init ()
#12 0x0000000001ad2119 in PMPI_Init ()
#13 0x0000000001397e30 in vtkPVMain::Initialize ()
#14 0x0000000000b836ba in MyMain ()
#15 0x0000000000b83797 in main ()
(gdb)
From surs at cse.ohio-state.edu Wed Jul 5 10:13:39 2006
From: surs at cse.ohio-state.edu (Sayantan Sur)
Date: Wed Jul 5 10:14:08 2006
Subject: [mvapich-discuss] (no subject)
In-Reply-To: <55867.128.165.242.125.1151970757.squirrel@webmail.lanl.gov>
References: <55867.128.165.242.125.1151970757.squirrel@webmail.lanl.gov>
Message-ID: <44ABC913.6050609@cse.ohio-state.edu>
Hi Adam,
Adam M. Smith wrote:
>Hi All,
>
>I'm trying to get ParaView working on a linux cluster. I have confirmed
>that, for any session with np>1 all procs can't get past MPI_Init (and
>ParaView seems to be passing the right args). [backtrace at end of
>message]
>
>
Thanks for trying out Paraview with MVAPICH!
>In the MVAPICH User Guide, in the section about applications not passing
>MPI_Init, I have covered some of the basic tests. That is...
>
>- I'm using ssh with keys and have tested connecting from each node to
>every other without entering a password
>- I have confirmed that the hostnames supplied to mpirun_rsh match those
>used in /etc/hosts on all machines
>- I can run ibv_rc_pingpong and similar tests without any problem at all,
>naming the hosts as I do trying to run ParaView with mpirun_rsh (whee! cpi
>is fast)
>
>(one of the tests mentioned for VAPI, perf_main I can't find)
>
>What else can I try? What does this mean?
>
>Importantly
>- other programs we built here use mvapich over our network fine (though
>I'm not extremely familiar with thier configuration)
>- the other tools using mvapich seem to use the same shared libraries as
>the one I built
>
>
Thanks for verifying the basic setup of MVAPICH and for going over the
user guide suggestions.
>#8 0x0000000001aeed39 in init_mpi_connections ()
>
>
Could you tell us which version of MVAPICH you are using on the linux
cluster? I ask this since this function was used in a preliminary
version of MVAPICH-gen2 which was older than MVAPICH-0.9.7 (and our
current MVAPICH-0.9.8). This function name is still used in our
`multi-rail' version of MVAPICH-0.9.7. A bit more crisply, there are two
questions:
1) If you are using a version older than MVAPICH-0.9.7, then could you
please download/install the MVAPICH-0.9.7 or MVAPICH-0.9.8-RC0 and
verify the problem still exists?
2) Could you confirm whether or not you are using the multi-rail version of
MVAPICH-0.9.7?
Thanks,
Sayantan.
--
http://www.cse.ohio-state.edu/~surs
From chai.15 at osu.edu Wed Jul 5 11:09:22 2006
From: chai.15 at osu.edu (LEI CHAI)
Date: Wed Jul 5 11:09:50 2006
Subject: [mvapich-discuss] MVAPICH2 over uDAPL...
Message-ID: <38de42338e2c65.38e2c6538de423@osu.edu>
Hi,
Theoretically the path MPI2 -> uDAPL -> TCP/IP is correct, but so far we are
not aware of any existing uDAPL implementation over TCP/IP. Do you have
such an implementation? BTW, FYI, MVAPICH2 can also be run on top of TCP/IP
directly.
Thanks.
Lei
----- Original Message -----
From: "vithaln"
To:
Cc:
Sent: Sunday, July 02, 2006 10:46 PM
Subject: RE: [mvapich-discuss] MVAPICH2 over uDAPL...
>> 2. My requirement is to run MPI programs on two different machines
>> exchanging data across in the network. Is there any document that gives
>> info on how to go about this or did any one did this earlier?
>
>>It depends on how you plan to exchange the data and what is the
>>underlying network and interface for these two different machines. You
>>will find all available options (networks and interfaces) with MVAPICH2.
>
>
> I am assuming the following path MPI2 -> uDAPL -> TCP/IP on LAN.
> I have two linux systems connected via a HUB (on my corporate network).
> Wanted to run mpi-exec to initiate some kind of data transfer(probably
> some
> file transfers.. may be ftp?!) across the two systems. I wanted to study
> the
> performance impact on employing uDAPL over TCP (ideally RDMA over TCP) as
> the transport interface. Let me know if my understanding and the attempt
> are
> on the wrong foot!
>
>
> Some thing as given below
>
> ---------------------------
> | Application for File XFR|
> ---------------------------
> |
> -----------
> | Udapl |
> -----------
> |
> User Space |
> -----------------------------
> Kernel Space |
> |
> -----------
> | TCP |
> -----------
> |
> -----------------------------
> NIC H/w |
> HARDWARE
> |
> ---------------------------
> N/W |
> LAN
>
>
> I know that, RDMA over TCP brings in a number of issues pertaining to
> restricting TCP to meet RDMA requirements. But if uDAPL is successful, I
> believe those issues are overcome!
>
> Thanks and Rgrds
> Vithal
>
> -----Original Message-----
> From: Dhabaleswar Panda [mailto:panda@cse.ohio-state.edu]
> Sent: Friday, June 30, 2006 10:12 PM
> To: vithaln
> Cc: mvapich-discuss@cse.ohio-state.edu
> Subject: Re: [mvapich-discuss] MVAPICH2 over uDAPL...
>
> Hi,
>
> Thanks for your note.
>
>> I could not find any information in user guide regarding running mvapich2
>> on uDAPL. The guide I am using is mpich2-doc-user.pdf downloaded from
>> sourceforge.net.
>>
>> However, I could find the udapl script in MVAPICH2 directory. Could some
>> one point me to the write doc which details how to run mpi programs/tests
>> on uDAPL?
>
> It looks like you are looking for the MVAPICH2 user guide at the wrong
> place. The above-mentioned user guide is for MPICH2 not MVAPICH2.
>
> It is accessible from MVAPICH2 web page (either from the FAQ link or
> from the Download link). The URL is:
>
> http://nowlab.cse.ohio-state.edu/projects/mpi-iba/download-mvapich2/mvapich2
> _user_guide.html
>
> This user guide contains in-depth instructions on how to install and
> use MVAPICH2 with uDAPL.
>
>> Also, got few questions
>>
>> 1. Can we run the MPI programs on standalone linux system ?
>
> Sure.
>
>> 2. My requirement is to run MPI programs on two different machines
>> exchanging data across in the network. Is there any document that gives
>> info on how to go about this or did any one did this earlier?
>
> It depends on how you plan to exchange the data and what is the
> underlying network and interface for these two different machines. You
> will find all available options (networks and interfaces) with MVAPICH2.
>
> DK
>
>>
>>
>> Rgrds
>>
>> Vithal
>>
>>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss@mail.cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
From pfeifer at rz.rwth-aachen.de Thu Jul 6 04:36:17 2006
From: pfeifer at rz.rwth-aachen.de (Matthias Pfeifer)
Date: Thu Jul 6 04:36:37 2006
Subject: [mvapich-discuss] DAPL_PROVIDER
Message-ID: <339958126@web.de>
Hello list,
I recently asked about the possibility of providing different DAPL_PROVIDER
values to a set of processes. Is this at all possible? Dhabaleswar and Lei,
do you need any more information? I didn't consider my question to be
platform-specific... Is it? Please correct me if I am wrong. Here is a more
detailed description of our current setup.
mvapich2-trunk-2006-06-29, 3 SunFire V40Z with Solaris 10 and 1 SunFire V40Z
with OpenSolaris Build 34 (Community Version), all 4 connected via 4x
InfiniBand adapters (Mellanox if I am not mistaken, but not sure...). The
three Solaris 10 boxes have ibd0 as DAPL_PROVIDER; the OpenSolaris box has
ibd2. Whenever the OpenSolaris box is involved I get
[rdma_udapl_priv.c:648] error(-2146828288): Cannot open IA
Matthias
From vishnu at cse.ohio-state.edu Thu Jul 6 09:28:11 2006
From: vishnu at cse.ohio-state.edu (Abhinav Vishnu)
Date: Thu Jul 6 09:28:22 2006
Subject: [mvapich-discuss] DAPL_PROVIDER
In-Reply-To: <339958126@web.de>
Message-ID:
Hi Matthias,
Thanks for your interest in MVAPICH and reporting this issue.
At this point, MVAPICH assumes the presence of a uniform DAPL_PROVIDER
for all participating processes. To solve your problem, we are working on
a solution that would allow the presence of different DAPL_PROVIDER values
for different processes. We will keep the group posted on this front.
Thanks a lot again.
regards,
-- Abhinav
-------------------------------
Abhinav Vishnu,
Graduate Research Associate,
Department Of Comp. Sc. & Engg.
The Ohio State University.
-------------------------------
On Thu, 6 Jul 2006, Matthias Pfeifer wrote:
> Hello list,
>
> i recently asked for the possibilty to provide various DAPL_PROVIDER to a set of processes. Is this at all possible? Dhabaleswar and Lei do you need any more information? I didnt considered my question to be platform-specific... Is it? Please correct me if i am wrong. Here is a more detailed description of our current setup.
> mvapich2-trunk-2006-06-29, 3 SunFire V40Z with Solaris 10 and 1 SunFire V40Z with OpenSolaris Build 34 (Community Version), all 4 connected via 4x Infiniband Adapter (mellanox if i am not mistaken, but not sure...). The three Solaris 10 boxes have ibd0 as DAPL_PROVIDER the OpenSolaris box ibd2. Whenever the OpenSolaris Box is involved i get
>
> [rdma_udapl_priv.c:648] error(-2146828288): Cannot open IA
>
> Matthias
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss@mail.cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
From pfeifer at rz.rwth-aachen.de Thu Jul 20 08:16:05 2006
From: pfeifer at rz.rwth-aachen.de (Matthias Pfeifer)
Date: Thu Jul 20 08:16:37 2006
Subject: [mvapich-discuss] Performance of udapl on Solaris/x86
Message-ID: <347610561@web.de>
Hello list,
I am getting 500 Mb/s measured with the Pallas MPI Benchmark using a current
version of mvapich2 (mvapich2-0.9.3-2006-07-19) from the branches directory.
I am compiling and using the library on an amd64-based Solaris 10 system.
The problem is the performance. Is there any need for tuning in order to get
more bytes through our 4x InfiniBand HCAs?
The library is compiled 64-bit with the Sun Studio compilers.
Matthias Pfeifer
From chai.15 at osu.edu Thu Jul 20 17:09:14 2006
From: chai.15 at osu.edu (LEI CHAI)
Date: Thu Jul 20 17:09:29 2006
Subject: [mvapich-discuss] Performance of udapl on Solaris/x86
Message-ID: <12e9aa12b5f1.12b5f112e9aa@osu.edu>
Hi Matthias,
In order to have more insight into your issue, could you let us know the
system configuration? Especially the InfiniBand setup, such as card type,
firmware version, and bus (PCI-X or PCI-Express).
Thanks.
Lei
----- Original Message -----
From: Matthias Pfeifer
Date: Thursday, July 20, 2006 7:16 am
Subject: [mvapich-discuss] Performance of udapl on Solaris/x86
> Hello list,
>
> i am getting 500 Mb/s measured with the Pallas MPI Benchmark using
> a current version of mvapich2 (mvapich2-0.9.3-2006-07-19) from the
> branches directory. I am compiling and using the library on an
> amd64 based solaris 10 system. Problem is the performance. Is there
> any need for tuning in order to get more bytes through our 4x-
> Infiniband HCA's?
>
> The library is compiled 64-bit with the "sun studio" compilers.
>
> Matthias Pfeifer
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss@mail.cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
From mike.heffner at evergrid.com Thu Jul 20 17:12:35 2006
From: mike.heffner at evergrid.com (Mike Heffner)
Date: Thu Jul 20 17:12:52 2006
Subject: [mvapich-discuss] mvapich layer breaking
Message-ID: <44BFF1C3.8020304@evergrid.com>
Hi,
I have been reading through the mvapich code and I have found that
mvapich breaks the ADI abstraction provided by the MPID API. In the file
"src/context/comm_util.c" in function MPIR_Comm_make_coll() there are
direct calls to device-specific functions such as comm_rdma_init() and
comm_exch_addr().
What were the technical reasons behind placing these initialization
calls at the MPIR layer rather than overloading the MPID_CommInit()
function to achieve similar results?
Cheers,
Mike
--
Mike Heffner
EverGrid Software
Blacksburg, VA USA
Voice: (540) 443-3500 #603
From mamidala at cse.ohio-state.edu Fri Jul 21 09:16:05 2006
From: mamidala at cse.ohio-state.edu (amith rajith mamidala)
Date: Fri Jul 21 09:16:19 2006
Subject: [mvapich-discuss] mvapich layer breaking
In-Reply-To: <44BFF1C3.8020304@evergrid.com>
Message-ID:
Hi Mike,
We have done this to support RDMA-based collective operations
and the associated address exchange, in order to deliver the best
performance.
Thanks,
Amith
On Thu, 20 Jul 2006, Mike Heffner wrote:
> Hi,
>
> I have been reading through the mvapich code and I have found that
> mvapich breaks the ADI abstraction provided by the MPID API. In the file
> "src/context/comm_util.c" in function MPIR_Comm_make_coll() there are
> direct calls to device-specific functions such as comm_rdma_init() and
> comm_exch_addr().
>
> What were the technical reasons behind placing these initialization
> calls at the MPIR layer rather than overloading the MPID_CommInit()
> function to achieve similar results?
>
>
> Cheers,
>
> Mike
>
> --
>
> Mike Heffner
> EverGrid Software
> Blacksburg, VA USA
>
> Voice: (540) 443-3500 #603
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss@mail.cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
From andrey.slepuhin at t-platforms.ru Mon Jul 24 15:42:39 2006
From: andrey.slepuhin at t-platforms.ru (Andrey Slepuhin)
Date: Mon Jul 24 15:42:56 2006
Subject: [mvapich-discuss] Latest mvapich problems with PCI-Express
InfiniPath
Message-ID: <44C522AF.8090806@t-platforms.ru>
Dear folks,
I'm doing some testing on a Woodcrest-based cluster with QLogic
(PathScale) PCI-Express InfiniPath adapters. I got slightly better
performance with the latest MVAPICH (0.9.8-rc0) vs. native MPI on Linpack,
but unfortunately I see failed residual checks periodically. Has anybody
seen such problems?
The hardware is dual Woodcrest 3.0GHz, 8GB RAM, InfiniPath PE-880 HCA
The software: RHEL 4 update 3, kernel 2.6.9-34.ELsmp, InfiniPath
software stack 1.3 (including OpenFabrics software), MVAPICH-0.9.8-rc0
Best regards,
Andrey
--
A right thing should be simple (tm)
From surs at cse.ohio-state.edu Mon Jul 24 17:55:54 2006
From: surs at cse.ohio-state.edu (Sayantan Sur)
Date: Mon Jul 24 17:56:13 2006
Subject: [mvapich-discuss] Latest mvapich problems with
PCI-Express InfiniPath
In-Reply-To: <44C522AF.8090806@t-platforms.ru>
References: <44C522AF.8090806@t-platforms.ru>
Message-ID: <44C541EA.2080801@cse.ohio-state.edu>
Hello Andrey,
Since the announcement of MVAPICH-0.9.8-rc0, we have made several
changes to our codebase. Today, we have made a MVAPICH-0.9.8-rc1 tarball
(and SVN tag) available. We'd be glad if you gave this new release
candidate a shot to see if your problems go away.
Thanks,
Sayantan.
Andrey Slepuhin wrote:
> Dear folks,
>
> I'm doing some testing on Woodcrest-based cluster with QLogic
> (PathScale) PCI-Express InfiniPath adapters. I got slightly better
> performance with latest MVAPICH (0.9.8-rc0) vs. native MPI on Linpack
> but unfortunately I see failed residual checks periodically. Has
> anybody seen such problems?
>
> The hardware is dual Woodcrest 3.0GHz, 8GB RAM, InfiniPath PE-880 HCA
> The software: RHEL 4 update 3, kernel 2.6.9-34.ELsmp, InfiniPath
> software stack 1.3 (including OpenFabrics software), MVAPICH-0.9.8-rc0
>
> Best regards,
> Andrey
>
--
http://www.cse.ohio-state.edu/~surs
From keenandr at msu.edu Tue Jul 25 17:37:16 2006
From: keenandr at msu.edu (Andrew Keen)
Date: Tue Jul 25 22:35:27 2006
Subject: [mvapich-discuss] Problem compiling 0.9.7 / 0.9.8-rc1 (pi3f90)
Message-ID: <44C68F0C.3090605@msu.edu>
Hi,
I'm trying to compile MVAPICH with VAPI support with the Pathscale
compilers (using make.mvapich.vapi from the 0.9.8-rc1 tarball). The
build process completes successfully, but when the install section of
the script is reached, the pi3f90 compilation fails with:
/opt/hpc/mvapich/0.9.8-rc1/bin/mpif90 -o pi3f90 pi3f90.o
/opt/hpc/mvapich/0.9.8-rc1/lib/libmpichf90.a(mpi2__ch_s.o)(.text+0x32):
In function `MPI_FILE_IREAD_T.in.MPI2__CHARACTER_S':
: undefined reference to `mpi_file_iread__'
/opt/hpc/mvapich/0.9.8-rc1/lib/libmpichf90.a(mpi2__ch_s.o)(.text+0x69):
In function `MPI_FILE_IREAD_AT_T.in.MPI2__CHARACTER_S':
: undefined reference to `mpi_file_iread_at__'
(etc.)
I've tried using 0.9.7 and 0.9.8-rc1 and pathscale 2.3 and 2.4. I've
attached the compile/make/install logs. Can someone point me in the
right direction?
Thanks,
-Andy
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 060725-buildlogs.tgz
Type: application/octet-stream
Size: 24924 bytes
Desc: not available
Url : http://mail.cse.ohio-state.edu/pipermail/mvapich-discuss/attachments/20060725/98bc7fa7/060725-buildlogs-0001.obj
From rowland at cse.ohio-state.edu Wed Jul 26 00:32:44 2006
From: rowland at cse.ohio-state.edu (Shaun Rowland)
Date: Wed Jul 26 00:33:03 2006
Subject: [mvapich-discuss] Problem compiling 0.9.7 / 0.9.8-rc1 (pi3f90)
In-Reply-To: <44C68F0C.3090605@msu.edu>
References: <44C68F0C.3090605@msu.edu>
Message-ID: <44C6F06C.6040500@cse.ohio-state.edu>
Andrew Keen wrote:
> I've tried using 0.9.7 and 0.9.8-rc1 and pathscale 2.3 and 2.4 . I've
> attached the compile/make/install logs. Can someone point me in the
> right direction?
Hi. Could you please try changing the configure option "--without-romio"
to "--with-romio" in the make script? The configure help output states:
If romio is not included, the Fortran 90 modules cannot be built.
It also might be helpful in this case if you set F77 and F90 to pathf90.
It appears as if configure was choosing pathf95 as F77 and pathf90 as F90.
--
Shaun Rowland rowland@cse.ohio-state.edu
http://www.cse.ohio-state.edu/~rowland/
From sram at profc.udec.cl Wed Jul 26 10:42:22 2006
From: sram at profc.udec.cl (Salvador Ramirez)
Date: Wed Jul 26 10:42:10 2006
Subject: [mvapich-discuss] Problem with mvapich2 on a cluster connected with
GigE and IB
Message-ID: <44C77F4E.9030803@profc.udec.cl>
Hello,
I recently downloaded and installed mvapich2 on a
cluster that has two connections among the nodes: gigabit
ethernet and infiniband. Each node then has two IP addresses
(one for each connection, of course) associated with obvious names
like n1 and n1-ib, n2 and n2-ib, et cetera.
For the compilation I selected VAPI and everything
compiled without problems, so the successful installation
was on /usr/local/mvapich2. Then I created the file hostfile
like this:
n1-ib
n2-ib
...
and then ran mpdboot -n 8 -f hostfile. Everything was
fine until here, but when I checked with mpdtrace -l I
saw that the nodes are n1, n2, n3... with the IP addresses of
the GigE network. So I wonder why mpd chooses this address
when the names in the hostfile are explicitly listed with
their corresponding IB addresses?
Of course this causes further problems: when I try to
run an MPI program with mpiexec I receive error messages from
the VAPI library, since the addresses are not on IB.
Any help is very appreciated. Thanks.
---sram
From keenandr at msu.edu Wed Jul 26 13:39:58 2006
From: keenandr at msu.edu (Andrew Keen)
Date: Wed Jul 26 13:40:16 2006
Subject: [mvapich-discuss] Problem compiling 0.9.7 / 0.9.8-rc1 (pi3f90)
In-Reply-To: <44C6F06C.6040500@cse.ohio-state.edu>
References: <44C68F0C.3090605@msu.edu> <44C6F06C.6040500@cse.ohio-state.edu>
Message-ID: <44C7A8EE.3040209@msu.edu>
Shaun,
> Hi. Could you please try changing the configure option "--without-romio"
> to "--with-romio" in the make script? The configure help output states:
This solved the problem, thank you. I don't know how I missed that.
-Andy
From koop at cse.ohio-state.edu Wed Jul 26 15:04:55 2006
From: koop at cse.ohio-state.edu (Matthew Koop)
Date: Wed Jul 26 15:05:10 2006
Subject: [mvapich-discuss] Problem with mvapich2 on a cluster connected
with GigE and IB
In-Reply-To: <44C77F4E.9030803@profc.udec.cl>
Message-ID:
Salvador,
When running mpdtrace, it will report the hostname of the machine, not the
hostname associated with the IP address MPD is listening on. On our
systems here, I need to run:
mpdboot -n 2 -f hosts --ifhn=d2ib
(where d2ib is a hostname that resolves to the IPoIB interface of the
machine I am running mpdboot on, which is also in the hosts file) You may
not need this parameter. You can verify that things are running over IPoIB
by using the following command on n2 before and after running mpdboot:
netstat -a | grep n1-ib
It is very important to note that changing the interface is not likely to
change performance (or solve your startup problem). MVAPICH2 only uses the
specified interface to exchange a few startup parameters over IP. After
exchanging enough information to start up native IB connections, all
further communication will go over the native IB layer -- not the IPoIB
interface.
What is the problem you are experiencing on startup? That will allow us to
better debug the problem.
Thanks,
Matthew Koop
-
Network-Based Computing Lab
Ohio-State University
On Wed, 26 Jul 2006, Salvador Ramirez wrote:
> Hello,
>
> I recently downloaded and installed mvapich2 on a
> cluster that has two connections among the nodes: gigabit
> ethernet and infiniband. Each node has then two ip addresses
> (one for each connection of course) related to obvious names
> like n1 and n1-ib, n2 and n2-ib, et-cetera.
>
> For the compilation I selected VAPI and everything
> compiled without problems, so the successful installation
> was on /usr/local/mvapich2. Then I created the file hostfile
> like this:
>
> n1-ib
> n2-ib
> ...
>
> and then ran the mpdboot -n 8 -f hostfile. Everything
> fine until here but then when I checked with mpdtrace -l I
> see that the nodes are n1, n2, n3... with the IP address of
> the gigE network. So I wonder why mpd choose this address
> when in the hostfile the names are explicitly listed as
> their corresponding IB address??
>
> Of course this has further problems since when I try to
> run a mpi program with mpiexec I received error message from
> the vapi library since the address are not over IB.
>
> Any help is very appreciated. Thanks.
>
> ---sram
>
> _______________________________________________
> mvapich-discuss mailing list
> mvapich-discuss@mail.cse.ohio-state.edu
> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>
From sram at profc.udec.cl Wed Jul 26 16:23:17 2006
From: sram at profc.udec.cl (Salvador Ramirez)
Date: Wed Jul 26 16:23:02 2006
Subject: [mvapich-discuss] Problem with mvapich2 on a cluster connected
with GigE and IB
In-Reply-To:
References:
Message-ID: <44C7CF35.9070701@profc.udec.cl>
Matthew,
Thanks for your answer. Here is the output when I run the
"cpi" program example that comes with the distribution:
---------------------
~/mvapich2-0.9.3/examples> mpdtrace -l
newen_2574 (192.168.0.250) <--- GigE IP addressess
n2_32945 (192.168.0.2)
n3_32932 (192.168.0.3)
n4_32923 (192.168.0.4)
n1_33014 (192.168.0.1)
n7_32909 (192.168.0.7)
n8_32909 (192.168.0.8)
n6_32911 (192.168.0.6)
~/mvapich2-0.9.3/examples> mpiexec -n 8 ./cpi
sched_setaffinity: Bad address
sched_setaffinity: Bad address
sched_setaffinity: Bad address
sched_setaffinity: Bad address
sched_setaffinity: Bad address
sched_setaffinity: Bad address
sched_setaffinity: Bad address
sched_setaffinity: Bad address
Process 0 of 8 is on newen
Process 4 of 8 is on n1
Process 1 of 8 is on n2
Process 3 of 8 is on n4
Process 2 of 8 is on n3
Process 6 of 8 is on n8
Process 5 of 8 is on n7
Process 7 of 8 is on n6
pi is approximately 3.1415926544231247, Error is
0.0000000008333316
wall clock time = 0.111762
----------------------
After what you said, I think I should just ignore the
first messages and realize that mvapich is actually working
on all of the nodes, right? Anyway, what do those error
messages mean? I've googled them but only found discussions
about kernel development.
Thanks again.
Best regards,
---sram
Matthew Koop wrote:
> Salvador,
>
> When running mpdtrace, it will report the hostname of the machine, not the
> hostname associated with the IP address MPD is listening on. On our
> systems here, I need to run:
>
> mpdboot -n 2 -f hosts --ifhn=d2ib
>
> (where d2ib is a hostname that resolves to the IPoIB interface of the
> machine I am running mpdboot on, which is also in the hosts file) You may
> not need this parameter. You can verify that things are running over IPoIB
> by using the following command on n2 before and after running mpdboot:
>
> netstat -a | grep n1-ib
>
> It is very important to note that changing the interface is not likely to
> change performance (or solve your startup problem). MVAPICH2 only uses the
> specified interface to exchange a few startup parameters over IP. After
> exchanging enough information to startup native IB connections, all
> further communication will go over the native IB layer -- not the IPoIB
> interface.
>
> What is the problem you are experiencing on startup? That will allow us to
> better debug the problem.
>
> Thanks,
>
> Matthew Koop
> -
> Network-Based Computing Lab
> Ohio-State University
>
>
>
> On Wed, 26 Jul 2006, Salvador Ramirez wrote:
>
>
>>Hello,
>>
>> I recently downloaded and installed mvapich2 on a
>>cluster that has two connections among the nodes: gigabit
>>ethernet and infiniband. Each node has then two ip addresses
>>(one for each connection of course) related to obvious names
>>like n1 and n1-ib, n2 and n2-ib, et-cetera.
>>
>> For the compilation I selected VAPI and everything
>>compiled without problems, so the successful installation
>>was on /usr/local/mvapich2. Then I created the file hostfile
>>like this:
>>
>>n1-ib
>>n2-ib
>>...
>>
>> and then ran the mpdboot -n 8 -f hostfile. Everything
>>fine until here but then when I checked with mpdtrace -l I
>>see that the nodes are n1, n2, n3... with the IP address of
>>the gigE network. So I wonder why mpd choose this address
>>when in the hostfile the names are explicitly listed as
>>their corresponding IB address??
>>
>> Of course this has further problems since when I try to
>>run a mpi program with mpiexec I received error message from
>>the vapi library since the address are not over IB.
>>
>>Any help is very appreciated. Thanks.
>>
>>---sram
>>
>>_______________________________________________
>>mvapich-discuss mailing list
>>mvapich-discuss@mail.cse.ohio-state.edu
>>http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>
>
>
>
From koop at cse.ohio-state.edu Wed Jul 26 17:03:04 2006
From: koop at cse.ohio-state.edu (Matthew Koop)
Date: Wed Jul 26 17:03:17 2006
Subject: [mvapich-discuss] Problem with mvapich2 on a cluster connected
with GigE and IB
In-Reply-To: <44C7CF35.9070701@profc.udec.cl>
Message-ID:
Salvador,
Sorry, I left out one other detail. In the hosts file, after the hostname,
place ifhn= followed by the IB hostname, e.g.
n1 ifhn=n1-ib
n2 ifhn=n2-ib
The communication is definitely running over IB after the program starts
up. If you want to convince yourself that it is using the IB fabric you
can compile and run the osu_bw or osu_latency test in the osu_benchmarks
directory of the distribution. You can compare your results with ones
posted on our webpage --
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/ (under "Performance")
As for the error you are seeing, "sched_setaffinity: Bad address", can you
give us a little more information about your setup -- such as the kernel
version, architecture, etc? It would be especially helpful if you
could send the config-mine.log and make-mine.log. They should be in the
main directory where you compiled MVAPICH2. [ You can send the logs directly
to my address to avoid having large files sent to the whole list. ]
Thanks,
Matthew Koop
-
Network-Based Computing Lab
Ohio State University
On Wed, 26 Jul 2006, Salvador Ramirez wrote:
> Matthew,
>
> Thanks for your answer. Here is the output when I run the
> "cpi" program example that comes with the distribution:
>
> ---------------------
> ~/mvapich2-0.9.3/examples> mpdtrace -l
> newen_2574 (192.168.0.250) <--- GigE IP addressess
> n2_32945 (192.168.0.2)
> n3_32932 (192.168.0.3)
> n4_32923 (192.168.0.4)
> n1_33014 (192.168.0.1)
> n7_32909 (192.168.0.7)
> n8_32909 (192.168.0.8)
> n6_32911 (192.168.0.6)
>
> ~/mvapich2-0.9.3/examples> mpiexec -n 8 ./cpi
> sched_setaffinity: Bad address
> sched_setaffinity: Bad address
> sched_setaffinity: Bad address
> sched_setaffinity: Bad address
> sched_setaffinity: Bad address
> sched_setaffinity: Bad address
> sched_setaffinity: Bad address
> sched_setaffinity: Bad address
> Process 0 of 8 is on newen
> Process 4 of 8 is on n1
> Process 1 of 8 is on n2
> Process 3 of 8 is on n4
> Process 2 of 8 is on n3
> Process 6 of 8 is on n8
> Process 5 of 8 is on n7
> Process 7 of 8 is on n6
> pi is approximately 3.1415926544231247, Error is
> 0.0000000008333316
> wall clock time = 0.111762
> ----------------------
>
> After what you said I think I should just ignore the
> first messages and realize that mvapich is actually working
> on all of the nodes, right? anyway what means that error
> messages? I've googled it but I just found only discussions
> about kernel development.
>
> Thanks again.
> Best regards,
>
> ---sram
>
> Matthew Koop wrote:
> > Salvador,
> >
> > When running mpdtrace, it will report the hostname of the machine, not the
> > hostname associated with the IP address MPD is listening on. On our
> > systems here, I need to run:
> >
> > mpdboot -n 2 -f hosts --ifhn=d2ib
> >
> > (where d2ib is a hostname that resolves to the IPoIB interface of the
> > machine I am running mpdboot on, which is also in the hosts file) You may
> > not need this parameter. You can verify that things are running over IPoIB
> > by using the following command on n2 before and after running mpdboot:
> >
> > netstat -a | grep n1-ib
> >
> > It is very important to note that changing the interface is not likely to
> > change performance (or solve your startup problem). MVAPICH2 only uses the
> > specified interface to exchange a few startup parameters over IP. After
> > exchanging enough information to startup native IB connections, all
> > further communication will go over the native IB layer -- not the IPoIB
> > interface.
> >
> > What is the problem you are experiencing on startup? That will allow us to
> > better debug the problem.
> >
> > Thanks,
> >
> > Matthew Koop
> > -
> > Network-Based Computing Lab
> > Ohio-State University
> >
> >
> >
> > On Wed, 26 Jul 2006, Salvador Ramirez wrote:
> >
> >
> >>Hello,
> >>
> >> I recently downloaded and installed mvapich2 on a
> >>cluster that has two connections among the nodes: gigabit
> >>ethernet and infiniband. Each node has then two ip addresses
> >>(one for each connection of course) related to obvious names
> >>like n1 and n1-ib, n2 and n2-ib, et-cetera.
> >>
> >> For the compilation I selected VAPI and everything
> >>compiled without problems, so the successful installation
> >>was on /usr/local/mvapich2. Then I created the file hostfile
> >>like this:
> >>
> >>n1-ib
> >>n2-ib
> >>...
> >>
> >> and then ran the mpdboot -n 8 -f hostfile. Everything
> >>fine until here but then when I checked with mpdtrace -l I
> >>see that the nodes are n1, n2, n3... with the IP address of
> >>the gigE network. So I wonder why mpd choose this address
> >>when in the hostfile the names are explicitly listed as
> >>their corresponding IB address??
> >>
> >> Of course this has further problems since when I try to
> >>run a mpi program with mpiexec I received error message from
> >>the vapi library since the address are not over IB.
> >>
> >>Any help is very appreciated. Thanks.
> >>
> >>---sram
> >>
> >>_______________________________________________
> >>mvapich-discuss mailing list
> >>mvapich-discuss@mail.cse.ohio-state.edu
> >>http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
> >>
> >
> >
> >
>
From panda at cse.ohio-state.edu Sun Jul 30 23:08:34 2006
From: panda at cse.ohio-state.edu (Dhabaleswar Panda)
Date: Sun Jul 30 23:08:50 2006
Subject: [mvapich-discuss] Announcing the release of MVAPICH 0.9.8 with
on-demand connection management,
fault-tolerance and advanced multi-rail scheduling support
Message-ID: <200607310308.k6V38YpW007337@xi.cse.ohio-state.edu>
The MVAPICH team is pleased to announce the availability of MVAPICH
0.9.8 with the following new features:
- On-demand connection management using native InfiniBand
Unreliable Datagram (UD) support. This feature enables InfiniBand
connections to be set up dynamically and has `near constant'
memory usage with an increasing number of processes.
This feature together with the Shared Receive Queue (SRQ) feature
(available since MVAPICH 0.9.7) enhances the scalability
of MVAPICH on multi-thousand node clusters.
Performance of applications and memory scalability using on-demand
connection management and SRQ support can be seen by visiting
the following URL:
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/perf-apps.html
- Support for Fault Tolerance: Mem-to-mem reliable data transfer
(detection of I/O bus error with 32bit CRC and retransmission in
case of error). This mode enables MVAPICH to deliver messages
reliably in the presence of I/O bus errors.
- Multi-rail communication support with flexible scheduling policies:
- Separate control of small and large message scheduling
- Three different scheduling policies for small messages:
- Using First Subchannel, Round Robin and Process Binding
- Six different scheduling policies for large messages:
- Round Robin, Weighted striping, Even striping,
Stripe Blocking, Adaptive Striping and Process Binding
- Shared library support for Solaris
- Integrated and easy-to-use build script which automatically
detects system architecture and InfiniBand adapter types and
optimizes MVAPICH for any particular installation
More details on all features and supported platforms can be obtained
by visiting the following URL:
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/mvapich_features.html
MVAPICH 0.9.8 continues to deliver excellent performance. Sample
performance numbers include:
- OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR:
- 2.93 microsec one-way latency (4 bytes)
- 1471 MB/sec unidirectional bandwidth
- 2678 MB/sec bidirectional bandwidth
- OpenIB/Gen2 on EM64T with PCI-Ex and IBA-DDR (dual-rail):
- 2534 MB/sec unidirectional bandwidth
- 3003 MB/sec bidirectional bandwidth
- OpenIB/Gen2 on Opteron with PCI-Ex and IBA-DDR:
- 2.65 microsec one-way latency (4 bytes)
- 1399 MB/sec unidirectional bandwidth
- 2253 MB/sec bidirectional bandwidth
- Solaris uDAPL/IBTL on Opteron with PCI-Ex and IBA-SDR:
- 3.86 microsec one-way latency (4 bytes)
- 981 MB/sec unidirectional bandwidth
- 1856 MB/sec bidirectional bandwidth
- OpenIB/Gen2 uDAPL on EM64T with PCI-Ex and IBA-SDR:
- 3.80 microsec one-way latency (4 bytes)
- 963 MB/sec unidirectional bandwidth
- 1851 MB/sec bidirectional bandwidth
- OpenIB/Gen2 uDAPL on Opteron with PCI-Ex and IBA-DDR:
- 2.81 microsec one-way latency (4 bytes)
- 1411 MB/sec unidirectional bandwidth
- 2252 MB/sec bidirectional bandwidth
Performance numbers for all other platforms, system configurations and
operations can be viewed by visiting `Performance' section of the
project's web page.
For downloading MVAPICH 0.9.8 package and accessing the anonymous SVN,
please visit the following URL:
http://nowlab.cse.ohio-state.edu/projects/mpi-iba/
A stripped down version of this release is also available at the
OpenIB SVN.
All feedback, including bug reports, hints for performance tuning,
patches and enhancements, is welcome. Please post it to the
mvapich-discuss mailing list.
Thanks,
MVAPICH Team at OSU/NBCL
======================================================================
MVAPICH/MVAPICH2 project is currently supported with funding from
U.S. National Science Foundation, U.S. DOE Office of Science,
Mellanox, Intel, Cisco Systems, Sun Microsystems and Linux Networx;
and with equipment support from Advanced Clustering, AMD, Apple,
Appro, Dell, IBM, Intel, Mellanox, Microway, PathScale, SilverStorm
and Sun Microsystems. Other technology partners include Etnus.
======================================================================
From panda at cse.ohio-state.edu Sun Jul 30 23:38:21 2006
From: panda at cse.ohio-state.edu (Dhabaleswar Panda)
Date: Sun Jul 30 23:38:37 2006
Subject: [mvapich-discuss] mvapich job startup unreliable with slurm and
--cpu_bind (patch)
Message-ID: <200607310338.k6V3cLMZ007574@xi.cse.ohio-state.edu>
Hi Greg and Mike,
Many thanks for sending us the patch related to Slurm and --cpu_bind
on July 26th.
You had sent this note to mvapich@cse. Since `mvapich@cse' is an
announcement list only, it got blocked and I just noticed your posting
now.
I am forwarding this note to mvapich-discuss@cse.ohio-state.edu.
As you might have noticed, we just made the release of mvapich 0.9.8.
We will review your patch and incorporate it to the trunk and
0.9.8-branch soon.
May I request that you post your future patches to
mvapich-discuss@cse.ohio-state.edu.

Best Regards,
DK
----------------------------------------------------------------
The following patch seems to fix a problem starting mvapich jobs with
slurm and the --cpu_bind option. Under these conditions, some of the
MPI processes do not make it out of MPI_Init() and the job hangs on
launch. We think that this is because with slurm and --cpu_bind the
startup is more synchronized.
Thanks,
Greg Johnson & Mike Lang
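In outline, the patch below replaces the loop in which each rank posts an
MPI_Sendrecv to every other peer at once with a ring-ordered schedule: at
step i, rank r sends to rank (r+1+i) mod np and receives from rank
(r-1-i) mod np, so every pair still exchanges exactly once. A standalone
sketch of the same pattern (a hypothetical example, not MVAPICH internals)
is:

/* Standalone sketch (hypothetical example): gather one integer from every
 * other rank using the same ring-ordered MPI_Sendrecv schedule that the
 * patch introduces for the collective address exchange. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
    int rank, np, i, right, left;
    int send_val, *recv_vals;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &np);

    send_val = rank * 100;                 /* stand-in for the address record */
    recv_vals = malloc(np * sizeof(int));
    recv_vals[rank] = send_val;

    right = (rank + 1) % np;
    left  = (rank + np - 1) % np;
    for (i = 0; i < np - 1; i++) {
        /* send to the rotating right neighbor, receive from the matching
         * left neighbor; no rank talks to all peers at the same time */
        MPI_Sendrecv(&send_val, 1, MPI_INT, right, 0,
                     &recv_vals[left], 1, MPI_INT, left, 0,
                     MPI_COMM_WORLD, &status);
        right = (right + 1) % np;
        left  = (left + np - 1) % np;
    }

    printf("rank %d collected %d entries\n", rank, np);
    free(recv_vals);
    MPI_Finalize();
    return 0;
}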
diff -ur mvapich-0.9.8-rc0.orig/src/context/comm_rdma_init.c mvapich-0.9.8-rc0/src/context/comm_rdma_init.c
--- mvapich-0.9.8-rc0.orig/src/context/comm_rdma_init.c 2006-07-11 16:49:44.000000000 -0600
+++ mvapich-0.9.8-rc0/src/context/comm_rdma_init.c 2006-07-11 15:35:46.000000000 -0600
@@ -162,6 +162,7 @@
{
#ifndef CH_GEN2_MRAIL
int i = 0;
+ int right, left;
struct Coll_Addr_Exch send_pkt;
struct Coll_Addr_Exch *recv_pkt;
@@ -188,19 +189,17 @@
#else
send_pkt.buf_hndl = comm->collbuf->l_coll->buf_hndl;
#endif
-
- for(i = 0; i < comm->np; i++) {
- /* Don't send to myself */
- if(i == comm->local_rank) continue;
-
+ right=(comm->local_rank + 1)%comm->np;
+ left=(comm->local_rank + comm->np - 1)%comm->np;
+ for(i=0; i < comm->np-1; i++) {
MPI_Sendrecv((void*)&send_pkt, sizeof(struct Coll_Addr_Exch),
- MPI_BYTE, comm->lrank_to_grank[i], ADDR_EXCHANGE_TAG,
- (void*)&(recv_pkt[i]),sizeof(struct Coll_Addr_Exch),
- MPI_BYTE, comm->lrank_to_grank[i], ADDR_EXCHANGE_TAG,
+ MPI_BYTE, comm->lrank_to_grank[right], ADDR_EXCHANGE_TAG,
+ (void*)&(recv_pkt[left]),sizeof(struct Coll_Addr_Exch),
+ MPI_BYTE, comm->lrank_to_grank[left], ADDR_EXCHANGE_TAG,
MPI_COMM_WORLD, &(statarray[i]));
- if (statarray[i].MPI_ERROR != MPI_SUCCESS) {
- fprintf(stderr, "blah! %d %d\n", comm->local_rank, statarray[i].MPI_ERROR);
- }
+
+ right = (right+1)%comm->np;
+ left = (left + comm->np - 1)%comm->np;
}
for(i = 0; i < comm->np; i++) {