[mvapich-discuss] how to build mvapich2 so to make it work on SR-IOV + KVM Environment

Xiaoyi Lu lu.932 at osu.edu
Mon Mar 12 13:48:09 EDT 2018


Thanks. Let’s take a look and get back to you later.

Xiaoyi

Sent from my iPhone

> On Mar 12, 2018, at 10:44 AM, Pharthiphan Asokan <pasokan at ddn.com> wrote:
> 
> Hi  Xiaoyi,
> 
> # /usr/libexec/qemu-kvm --version
> QEMU emulator version 1.5.3 (qemu-kvm-1.5.3-141.el7_4.6), Copyright (c) 2003-2008 Fabrice Bellard
> #
> 
> libvirt-3.2.0-14.el7_4.9.x86_64
> 
> 
> Regards,
> Pharthiphan
> From: Xiaoyi Lu [lu.932 at osu.edu]
> Sent: Monday, March 12, 2018 11:11 PM
> To: Pharthiphan Asokan
> Cc: mvapich-discuss at cse.ohio-state.edu
> Subject: Re: [mvapich-discuss] how to build mvapich2 so to make it work on SR-IOV + KVM Environment
> 
> Hi, Pharthiphan,
> 
> What are your KVM and libvirt version numbers?
> 
> Xiaoyi
> 
> Sent from my iPhone
> 
> On Mar 12, 2018, at 10:29 AM, Pharthiphan Asokan <pasokan at ddn.com> wrote:
> 
>> I tried to do this manually and got the following error:
>> 
>> # virsh qemu-monitor-command vcn01 --pretty '{"execute":"device_add","arguments":{"driver":"ivshmem","id":"hpivshmem","shm":"g290da24","size":"256m"}}'
>> {
>>   "id": "libvirt-44",
>>   "error": {
>>     "class": "GenericError",
>>     "desc": "'ivshmem' is not a valid device model name"
>>   }
>> }
>> 
>> From: Pharthiphan Asokan
>> Sent: Monday, March 12, 2018 10:28 PM
>> To: Xiaoyi Lu
>> Cc: mvapich-discuss at cse.ohio-state.edu
>> Subject: RE: [mvapich-discuss] how to build mvapich2 so to make it work on SR-IOV + KVM Environment
>> 
>> Hi Xiaoyi,
>> 
>> The virtual machines are not deployed through OpenStack.
>> 
>> /opt/mvapich2/virt/2.2/share/mvapich2-virt/ivshmem-tools.sh attach  vcn01 g290da24  256
>> Returning instance name of server vcn01
>> /opt/mvapich2/virt/2.2/share/mvapich2-virt/ivshmem-tools.sh: line 50: nova: command not found
>> Returning hostname where server vcn01 is running on
>> /opt/mvapich2/virt/2.2/share/mvapich2-virt/ivshmem-tools.sh: line 54: nova: command not found
>> INSTANCE: 
>> HOST:    
>> Error: Returning instance and/or host failed, exit.
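[Editor's note: since ivshmem-tools.sh shells out to OpenStack's nova CLI to resolve the instance and host, the attach step can be reproduced by hand with virsh on non-OpenStack deployments. The sketch below only assembles and prints the QMP device_add command; the id value is an assumption, and the command itself only succeeds if the host's QEMU build includes the ivshmem device model.]

```shell
# Hedged sketch of the manual equivalent of "ivshmem-tools.sh attach vcn01 g290da24 256".
# VM, SHM, and SIZE come from this thread's example; the device id is an assumption.
# This snippet only prints the virsh command to run on the KVM host.
VM=vcn01
SHM=g290da24
SIZE=256
ID=hpivshmem
QMP=$(printf '{"execute":"device_add","arguments":{"driver":"ivshmem","id":"%s","shm":"%s","size":"%sm"}}' \
    "$ID" "$SHM" "$SIZE")
echo "virsh qemu-monitor-command $VM --pretty '$QMP'"
```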
>> 
>> Regards,
>> Pharthiphan
>> From: Xiaoyi Lu [lu.932 at osu.edu]
>> Sent: Monday, March 12, 2018 10:07 PM
>> To: Pharthiphan Asokan
>> Cc: mvapich-discuss at cse.ohio-state.edu
>> Subject: Re: [mvapich-discuss] how to build mvapich2 so to make it work on SR-IOV + KVM Environment
>> 
>> Hi, Pharthiphan,
>> 
>> In the user guide I sent to you, can you search for the following paragraph?
>> 
>> Step 3. Execute ivshmem-tools.sh to set up IVSHMEM devices for virtual machines.
>> 
>> We have a tool to help you to hotplug the IVSHMEM device.
>> 
>> Xiaoyi 
>> 
>> Sent from my iPhone
>> 
>> On Mar 12, 2018, at 9:28 AM, Pharthiphan Asokan <pasokan at ddn.com> wrote:
>> 
>>> Hi Xiaoyi,
>>> 
>>> How do I configure IVSHMEM on existing KVM guests? If you could share a link with some tips, that would be useful.
>>> 
>>> Regards,
>>> Pharthiphan
>>> ________________________________________
>>> From: Pharthiphan Asokan
>>> Sent: Monday, March 12, 2018 9:48 PM
>>> To: Xiaoyi Lu
>>> Cc: mvapich-discuss at cse.ohio-state.edu
>>> Subject: RE: [mvapich-discuss] how to build mvapich2 so to make it work on SR-IOV + KVM Environment
>>> 
>>> Hi Xiaoyi
>>> 
>>> Can you please help me with the complete command line to configure IVSHMEM? I could not get past this step.
>>> 
>>> To configure IVSHMEM for virtual machines, administrators can add a device model option to the QEMU command line as follows.
>>> 
>>> -device ivshmem,shm=ivshmem-id,size=256m
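[Editor's note: a minimal sketch of how that option fits into a full QEMU invocation. The image path, memory/CPU settings, and shm id below are placeholders, not from the user guide; the snippet only prints the command rather than launching a guest.]

```shell
# Hedged sketch: a cold-boot QEMU command line carrying an IVSHMEM device.
# IMG, memory, and SHM are illustrative placeholders.
IMG=/var/lib/libvirt/images/vcn03.qcow2
SHM=ivshmem-vcn03     # backed by /dev/shm/$SHM on the host once the guest starts
SIZE=256m             # IVSHMEM sizes must be a power of two
echo "/usr/libexec/qemu-kvm -m 4096 -smp 4 -drive file=$IMG,format=qcow2 -device ivshmem,shm=$SHM,size=$SIZE"
```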
>>> 
>>> [root at vcn03 pasokan]# /opt/mvapich2/virt/2.2/bin/mpirun_rsh -np 4 vcn03 vcn03 vcn04 vcn04  MV2_VIRT_USE_IVSHMEM=1 /mnt/lustre_client/pasokan/a.out
>>> Failed to find IVShmem device, will fallback to SR-IOV
>>> Failed to find IVShmem device, will fallback to SR-IOV
>>> Failed to find IVShmem device, will fallback to SR-IOV
>>> Failed to find IVShmem device, will fallback to SR-IOV
>>> [src/mpid/ch3/channels/mrail/src/gen2/rdma_iba_priv.c:1487] Could not modify qp to RTR
>>> [src/mpid/ch3/channels/mrail/src/gen2/rdma_iba_priv.c:1487] Could not modify qp to RTR
>>> [src/mpid/ch3/channels/mrail/src/gen2/rdma_iba_priv.c:1487] Could not modify qp to RTR
>>> [src/mpid/ch3/channels/mrail/src/gen2/rdma_iba_priv.c:1487] Could not modify qp to RTR
>>> Hello world from processor vcn03, rank 1 out of 4 processors
>>> Hello world from processor vcn03, rank 0 out of 4 processors
>>> failed while avail wqe is 63, rail 0
>>> [vcn03:mpi_rank_1][post_srq_send] src/mpid/ch3/channels/mrail/src/gen2/ibv_send.c:878: ibv_post_sr (post_send_desc): Invalid argument (22)
>>> failed while avail wqe is 63, rail 0
>>> [vcn03:mpi_rank_0][post_srq_send] src/mpid/ch3/channels/mrail/src/gen2/ibv_send.c:878: ibv_post_sr (post_send_desc): Invalid argument (22)
>>> [vcn03:mpispawn_0][readline] Unexpected End-Of-File on file descriptor 7. MPI process died?
>>> [vcn03:mpispawn_0][mtpmi_processops] Error while reading PMI socket. MPI process died?
>>> [vcn03:mpispawn_0][child_handler] MPI process (rank: 1, pid: 28583) exited with status 255
>>> [vcn03:mpispawn_0][child_handler] MPI process (rank: 0, pid: 28582) exited with status 255
>>> Hello world from processor vcn04, rank 3 out of 4 processors
>>> Hello world from processor vcn04, rank 2 out of 4 processors
>>> failed while avail wqe is 63, rail 0
>>> [vcn04:mpi_rank_3][post_srq_send] src/mpid/ch3/channels/mrail/src/gen2/ibv_send.c:878: ibv_post_sr (post_send_desc): Invalid argument (22)
>>> failed while avail wqe is 63, rail 0
>>> [vcn04:mpi_rank_2][post_srq_send] src/mpid/ch3/channels/mrail/src/gen2/ibv_send.c:878: ibv_post_sr (post_send_desc): Invalid argument (22)
>>> [vcn04:mpispawn_1][readline] Unexpected End-Of-File on file descriptor 6. MPI process died?
>>> [vcn04:mpispawn_1][mtpmi_processops] Error while reading PMI socket. MPI process died?
>>> [vcn04:mpispawn_1][child_handler] MPI process (rank: 3, pid: 27658) exited with status 255
>>> [vcn04:mpispawn_1][child_handler] MPI process (rank: 2, pid: 27657) exited with status 255
>>> [root at vcn03 pasokan]# [vcn04:mpispawn_1][report_error] connect() failed: Connection refused (111)
>>> 
>>> Thanks
>>> Pharthiphan
>>> ________________________________________
>>> From: Xiaoyi Lu [lu.932 at osu.edu]
>>> Sent: Monday, March 12, 2018 9:27 PM
>>> To: Pharthiphan Asokan
>>> Cc: mvapich-discuss at cse.ohio-state.edu
>>> Subject: Re: [mvapich-discuss] how to build mvapich2 so to make it work on SR-IOV + KVM Environment
>>> 
>>> Hi, Pharthiphan,
>>> 
>>> Thanks for your interest in our project.
>>> 
>>> For virtualized environments (like SR-IOV + KVM), we strongly recommend that our users use the MVAPICH2-Virt package. You can download it from http://mvapich.cse.ohio-state.edu/downloads/.
>>> 
>>> You can find the user guide of MVAPICH2-Virt from http://mvapich.cse.ohio-state.edu/userguide/virt/.
>>> 
>>> Thanks,
>>> Xiaoyi
>>> 
>>>> On Mar 12, 2018, at 1:27 AM, Pharthiphan Asokan <pasokan at ddn.com> wrote:
>>>> 
>>>> Hi All,
>>>> 
>>>> How do I build and run mvapich2 so that it works in an SR-IOV + KVM environment?
>>>> 
>>>> I tried the following and ended up with "Could not modify boot qp to RTR":
>>>> 
>>>> [root at vcn01 ~]# /opt/ddn/mvapich/bin/mpirun -hosts vcn01,vcn02 -ppn 1 /opt/ddn/ior/bin/IOR-mvapich -a POSIX -w -r -t 1m -b 1m -o /tmp/pasokan_fuse/test1
>>>> [src/mpid/ch3/channels/mrail/src/gen2/ring_startup.c:292] error(22): Could not modify boot qp to RTR
>>>> [src/mpid/ch3/channels/mrail/src/gen2/ring_startup.c:292] error(22): Could not modify boot qp to RTR
>>>> ===================================================================================
>>>> =   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
>>>> =   PID 24752 RUNNING AT vcn02
>>>> =   EXIT CODE: 1
>>>> =   CLEANING UP REMAINING PROCESSES
>>>> =   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
>>>> ===================================================================================
>>>> [proxy:0:0 at vcn01] HYDU_sock_write (utils/sock/sock.c:286): write error (Broken pipe)
>>>> [proxy:0:0 at vcn01] main (pm/pmiserv/pmip.c:265): unable to send EXIT_STATUS command upstream
>>>> [mpiexec at vcn01] HYDT_bscu_wait_for_completion (tools/bootstrap/utils/bscu_wait.c:76): one of the processes terminated badly; aborting
>>>> [mpiexec at vcn01] HYDT_bsci_wait_for_completion (tools/bootstrap/src/bsci_wait.c:23): launcher returned error waiting for completion
>>>> [mpiexec at vcn01] HYD_pmci_wait_for_completion (pm/pmiserv/pmiserv_pmci.c:218): launcher returned error waiting for completion
>>>> [mpiexec at vcn01] main (ui/mpich/mpiexec.c:344): process manager error waiting for completion
>>>> [root at vcn01 ~]#
>>>> 
>>>> Regards,
>>>> Pharthiphan
>>>> _______________________________________________
>>>> mvapich-discuss mailing list
>>>> mvapich-discuss at cse.ohio-state.edu
>>>> http://mailman.cse.ohio-state.edu/mailman/listinfo/mvapich-discuss
>>> 

