[mvapich-discuss] mvapich2 1.7rc1 and 2GB transfers

Devendar Bureddy bureddy at cse.ohio-state.edu
Wed Sep 7 09:22:01 EDT 2011


Hi Tibor

I just tried it on a fresh MVAPICH2-1.7rc1 download and did not see
any issue here. Are you applying it to a fresh RC1 tarball? I will
follow up with you in a separate mail and, if required, send you a
patched tarball.


[bureddy@head Merge]$ tar xf mvapich2-1.7rc1.tgz
[bureddy@head Merge]$ cd mvapich2-1.7rc1
[bureddy@head mvapich2-1.7rc1]$ patch -p0 < diff.txt
patching file src/mpid/ch3/include/mpidpkt.h
patching file src/mpid/ch3/src/ch3u_rndv.c
patching file src/mpid/ch3/channels/mrail/src/rdma/ch3_rndvtransfer.c
patching file src/mpid/ch3/channels/mrail/src/gen2/ibv_recv.c
patching file src/mpid/ch3/channels/mrail/src/gen2/mpidi_ch3_rdma_post.h
patching file src/mpid/ch3/channels/mrail/src/gen2/ibv_send.c

-Devendar
On Wed, Sep 7, 2011 at 5:56 AM, Tibor Pausz
<pausz at th.physik.uni-frankfurt.de> wrote:
> Hi Devendar,
>
> this patch also fails. Maybe it would be easier if you sent me the three
> files instead?
>
> Best regards,
> Tibor
>
> patching file mpidpkt.h
> Hunk #1 FAILED at 234.
> 1 out of 1 hunk FAILED -- saving rejects to file mpidpkt.h.rej
> patching file ch3u_rndv.c
> Hunk #1 FAILED at 348.
> 1 out of 1 hunk FAILED -- saving rejects to file ch3u_rndv.c.rej
> patching file ch3_rndvtransfer.c
> Hunk #1 FAILED at 491.
> 1 out of 1 hunk FAILED -- saving rejects to file ch3_rndvtransfer.c.rej
> patching file ibv_recv.c
> Hunk #1 FAILED at 412.
> 1 out of 1 hunk FAILED -- saving rejects to file ibv_recv.c.rej
> patching file mpidi_ch3_rdma_post.h
> Hunk #1 FAILED at 201.
> 1 out of 1 hunk FAILED -- saving rejects to file mpidi_ch3_rdma_post.h.rej
> patching file ibv_send.c
> Hunk #1 FAILED at 256.
> Hunk #2 FAILED at 843.
> Hunk #3 FAILED at 854.
> 3 out of 3 hunks FAILED -- saving rejects to file ibv_send.c.rej
>
>
> On 01.09.2011 17:26, Devendar Bureddy wrote:
>> Hi Tibor
>>
>> Sorry about that. Unfortunately, it was generated against the latest
>> development code base. Can you please try the attached patch?
>>
>> Thanks
>> Devendar
>>
>> On Thu, Sep 1, 2011 at 11:07 AM, Tibor Pausz
>> <pausz at th.physik.uni-frankfurt.de> wrote:
>>> Hi Devendar,
>>>
>>> I can't apply the patch to 1.7rc1. The hunks are all rejected.
>>>
>>> Best regards,
>>> Tibor
>>>
>>>
>>> On 18.08.2011 22:59, Devendar Bureddy wrote:
>>>> Hi Tibor
>>>>
>>>> Thanks for your sample program. There is an integer overflow inside
>>>> the library. Can you please apply the attached patch and try again?
>>>> Please do a "make clean" and then "make && make install" after applying
>>>> the patch.
>>>>
>>>> BTW, you have declared a large buffer, buffer2, but never use it in
>>>> your sample program.
>>>>
>>>> Thanks
>>>> Devendar
>>>>
>>>> On Thu, Aug 18, 2011 at 10:26 AM, Tibor Pausz
>>>> <pausz at th.physik.uni-frankfurt.de> wrote:
>>>>> Hi Devendar,
>>>>>
>>>>> I have included the small program which I used.
>>>>>
>>>>> Best regards,
>>>>> Tibor
>>>>>
>>>>>
>>>>> On 16.08.2011 16:01, Devendar Bureddy wrote:
>>>>>> Hi Tibor
>>>>>>
>>>>>> Thanks for using MVAPICH2. In our testing we are able to transfer
>>>>>> messages larger than 2GB. The internal warning message that you got
>>>>>> indicates a mismatch between the send and recv buffer sizes.
>>>>>> Can you please share your reproducer program, so that it is
>>>>>> easier for us to debug further?
>>>>>>
>>>>>> Thanks in advance
>>>>>> Devendar
>>>>>>
>>>>>> On Tue, Aug 16, 2011 at 4:13 AM, Tibor Pausz
>>>>>> <pausz at th.physik.uni-frankfurt.de> wrote:
>>>>>>> Hi there
>>>>>>>
>>>>>>> I have installed mvapich2 1.7rc1 with these configure options:
>>>>>>> ./configure --enable-fc --enable-cxx --with-hwloc
>>>>>>> --with-slurm=.../slurm/2.2.7 --with-rdma=gen2 --enable-shared
>>>>>>> --enable-sharedlibs=gcc --with-xrc
>>>>>>> Compiler Intel 11.1, Scientific Linux 5.5
>>>>>>>
>>>>>>> Now I'm trying to transfer arrays larger than 2GB with MPI_Ssend,
>>>>>>> and I got the following message:
>>>>>>> Warning! Rndv Receiver is expecting 0 Bytes But, is receiving 0 Bytes
>>>>>>>
>>>>>>> After that the program just hangs. I have tried "int" and "long" as the type
>>>>>>> of the "count" argument, but I always get the same warning.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Tibor
>>>>>>> _______________________________________________
>>>>>>> mvapich-discuss mailing list
>>>>>>> mvapich-discuss at cse.ohio-state.edu
>>>>>>> http://mail.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>>>>>>
>


