opendevreview | OpenStack Proposal Bot proposed openstack/nova master: Imported Translations from Zanata https://review.opendev.org/c/openstack/nova/+/891648 | 04:17 |
---|---|---|
*** elodilles_pto is now known as elodilles | 06:41 | |
dvo-plv | sean-k-mooney, Hello. Are you here? | 07:36 |
gokhani | hello folks, I am trying to migrate from VMware to OpenStack and my VM is stuck at booting from hard disk. I converted the disk from vmdk to raw, following the https://superuser.openinfra.dev/articles/how-to-migrate-from-vmware-and-hyper-v-to-openstack/ guide. The fstab file on my VM is https://paste.openstack.org/show/bqO5mJ9gCpyHEEBgvMjZ/. What could be the reason it gets stuck at booting from hard disk? | 09:16 |
gokhani | the VM is Ubuntu 20.04 and the cloud-init package is installed | 09:16 |
bauzas | good morning Nova | 10:32 |
* bauzas is eventually back at work | 10:32 | |
sean-k-mooney | o/ | 10:32 |
sean-k-mooney | i would say a lot has changed since you were last here but until last week the ci has been mostly unable to merge things so... | 10:33 |
bauzas | argh | 10:34 |
sean-k-mooney | gmann, dansmith and others have fortunately managed to improve that of late but the reality is very little had merged in the early part of your pto | 10:34 |
bauzas | I haven't yet looked at gerrit | 10:34 |
bauzas | but I'll do that this afternoon | 10:35 |
bauzas | this morning, paperwork + email scrubbing :( | 10:35 |
sean-k-mooney | i feel like confining that to the morning is probably ambitious | 10:35 |
bauzas | tbh, I need way more coffee before looking at the gate :) | 10:39 |
sean-k-mooney | it's relatively fine at the moment | 10:41 |
sean-k-mooney | bar the normal volume detach issues which i think will be a ptg topic | 10:41 |
gibi | bauzas: o/ welcome back. we did not pass the bug triage baton around during your PTO, I was too lazy to find a candidate during the meeting. Otherwise the meetings were held and were as productive as the summer period allows. | 11:34 |
bauzas | ++ thanks for the offer | 11:34 |
dvo-plv | sean-k-mooney, what do you think about multi-NIC support in our patch, adding the additional configuration via a config file or, even better, via the binding profile? | 12:42 |
sean-k-mooney | hard no on config file | 12:43 |
sean-k-mooney | or user specifying data in the binding profile | 12:43 |
sean-k-mooney | well it depends on where the config file would be used, what's in it and how that is configured | 12:44 |
sean-k-mooney | but i don't think that is likely to work. | 12:44 |
dvo-plv | get the configuration here to generate the correct socket name https://review.opendev.org/c/openstack/neutron/+/869510/13/neutron/plugins/ml2/drivers/openvswitch/mech_driver/mech_openvswitch.py#221 | 12:44 |
dvo-plv | and here to create the correct vf number https://review.opendev.org/c/openstack/os-vif/+/859574/8/vif_plug_ovs/ovs.py#349 | 12:44 |
sean-k-mooney | note that binding_profile is explicitly for nova to pass data to the network backend | 12:44 |
sean-k-mooney | as in you would have something in the neutron l2 agent's config file | 12:45 |
sean-k-mooney | which gets passed as part of agent['configurations'] | 12:45 |
sean-k-mooney | that you need to use to calculate the vf number | 12:45 |
sean-k-mooney | i don't really think that is the correct approach honestly | 12:47 |
dvo-plv | but we need to somehow correlate the formulas for calculating the socket name and the vf number | 12:49 |
sean-k-mooney | how i think this should work is likely nova should gather the info required and store it in the binding_profile and then neutron should use that to calculate the path | 12:49 |
dvo-plv | we use a simple number for the vf, and the socket name is stdvio + vf number. Mellanox uses something like a pf0vf0 naming scheme | 12:50 |
dvo-plv | sorry, what info is required, and which entity should we parse to get it? | 12:50 |
sean-k-mooney | the vf number is only meaningful if you know the parent PF pci device | 12:51 |
sean-k-mooney | can you express what exactly is required to match? | 12:52 |
sean-k-mooney | are you using the vf number to identify the dpdk port or something like that? | 12:54 |
sean-k-mooney | i.e. stdvio10 maps to dpdk10 | 12:54 |
dvo-plv | in our case, to create the correct socket we need the socket name base ( stdvio ) and the vf number according to the formula "domain bus slot" * 8 + "func number" | 12:54 |
dvo-plv | I added a comment here https://review.opendev.org/c/openstack/os-vif/+/859574/8/vif_plug_ovs/ovs.py#352 | 12:55 |
sean-k-mooney | that does not explain why that is needed, and the algorithm you implemented does not match that | 12:55 |
sean-k-mooney | what the code does is take 0000:2b:01.2, reduce it to 01.2, then split that into 01 and 2 | 12:57 |
dvo-plv | we reserved the first 4 pci addresses for pf needs | 12:57 |
sean-k-mooney | and do 8*1+2 | 12:57 |
sean-k-mooney | so you are discarding the domain and bus numbers, then looking only at the slot and function | 12:58 |
sean-k-mooney | so 0000:2b:01.2 and 0000:3b:01.2 will compute the same socket | 12:59 |
dvo-plv | yes, we use this approach to work with single nic support | 12:59 |
sean-k-mooney | right, which i don't think is sufficient to move forward with this | 13:00 |
dvo-plv | multi nic support is in development, so I would like to provide a method which can calculate the correct vf number and socket for all vendors by passing the formula in the binding profile | 13:00 |
sean-k-mooney | that feels like a hack | 13:01 |
sean-k-mooney | i also don't like the idea of effectively embedding a dsl to express the formula which is then interpreted at run time | 13:01 |
sean-k-mooney | dvo-plv: looking at the ovs and dpdk docs | 13:04 |
sean-k-mooney | the way we add port representors is by passing the PCI address of the PF + the representor vf offset | 13:04 |
sean-k-mooney | https://docs.openvswitch.org/en/latest/topics/dpdk/phy/?highlight=dpdk#representors | 13:04 |
sean-k-mooney | i.e. ovs-vsctl add-port br0 dpdk-rep3 -- set Interface dpdk-rep3 type=dpdk \ | 13:05 |
sean-k-mooney | options:dpdk-devargs=0000:08:00.0,representor=vf3 | 13:05 |
dvo-plv | yes, we do the same, but the representor identifier is not fixed, it depends on the vendor dpdk driver | 13:06 |
sean-k-mooney | well that's kind of a problem | 13:06 |
sean-k-mooney | since the code you want to modify is meant to be vendor neutral | 13:06 |
sean-k-mooney | meaning you are not allowed to make vendor specific changes to it | 13:07 |
dvo-plv | this is why I would like to provide the config via the binding profile | 13:07 |
sean-k-mooney | no | 13:09 |
sean-k-mooney | thats not really an option | 13:09 |
sean-k-mooney | at least not without a lot of other changes | 13:10 |
sean-k-mooney | so here https://review.opendev.org/c/openstack/os-vif/+/859574/8/vif_plug_ovs/tests/unit/test_plugin.py#440 | 13:10 |
sean-k-mooney | you are creating the port, effectively doing | 13:10 |
sean-k-mooney | ovs-vsctl add-port br0 dpdk-rep3 -- set Interface dpdk-rep3 type=dpdk \ | 13:10 |
sean-k-mooney | options:dpdk-devargs=0000:08:00.0,representor=vf3 | 13:10 |
sean-k-mooney | when you say the representor identifier are you referring to vf3 | 13:14 |
sean-k-mooney | or the vhost-user socket file name | 13:15 |
sean-k-mooney | dvo-plv: or both? | 13:15 |
dvo-plv | both | 13:15 |
sean-k-mooney | so the convention in ovs is the socket name should match the port name | 13:15 |
sean-k-mooney | at least for interfaces of type vhost-user and vhost-user-client | 13:16 |
sean-k-mooney | or whatever the ovs constants for those are | 13:16 |
dvo-plv | I believe it can be correct for dpdk port type too | 13:17 |
dvo-plv | but since there is no naming convention, we have to hardcode the socket name at the dpdk initialization step | 13:19 |
dvo-plv | we create the vf and the socket at the same time and then pass them to some application to use | 13:19 |
sean-k-mooney | that's not how things will work in nova/openstack | 13:20 |
sean-k-mooney | the VFs would have to be statically allocated, effectively at boot time | 13:20 |
sean-k-mooney | what driver is the PF bound to? | 13:20 |
sean-k-mooney | it's bound to a kernel driver not DPDK, yes? | 13:21 |
dvo-plv | we use default vfio | 13:22 |
sean-k-mooney | for the VFs or the PF? | 13:22 |
sean-k-mooney | vfio-pci is ok for the vf but not the PF | 13:23 |
sean-k-mooney | based on https://doc.dpdk.org/guides-22.11/prog_guide/switch_representation.html#vf-representors i think the PFs are bound to a kernel driver and the VFs are bound to vfio-pci then managed by the dpdk userspace driver | 13:28 |
sean-k-mooney | although i'm not sure that is correct if i look at https://doc.dpdk.org/guides-22.11/prog_guide/switch_representation.html#basic-sr-iov | 13:31 |
sean-k-mooney | "A DPDK application running on the hypervisor owns the PF device, which is arbitrarily assigned port index 3" | 13:32 |
sean-k-mooney | honestly i think upstream ovs needs to be modified to take the vhost-user socket path as a parameter when we do the port add | 13:35 |
dvo-plv | At the beginning we discussed that here https://review.opendev.org/c/openstack/nova-specs/+/859290/4..18//COMMIT_MSG#b9 | 13:35 |
dvo-plv | so this is why we first created our own mech driver | 13:35 |
dvo-plv | so maybe the better way would be to go back to our own mech driver? | 13:36 |
sean-k-mooney | if it needs to be vendor specific it would need to be out of tree both on the ml2 side and the os-vif side | 13:37 |
sean-k-mooney | the reason i'm asking about the PF driver by the way is to ensure libvirt can enumerate the VFs | 13:38 |
sean-k-mooney | it can't do that as far as i am aware if the PF is added to dpdk | 13:38 |
dvo-plv | we use vfio-pci for both pf and vf functions | 13:40 |
dvo-plv | we can observe our interfaces only via the dpdk driver | 13:40 |
dvo-plv | dpdk ntnic-pmd | 13:40 |
sean-k-mooney | then this won't work with libvirt | 13:40 |
sean-k-mooney | since we will have no nodedevs for the VFs | 13:41 |
sean-k-mooney | so we will have no entries in the pci_devices table | 13:41 |
sean-k-mooney | unless your driver is implementing a sysfs interface for them? | 13:42 |
dvo-plv | yes, https://paste.opendev.org/show/b8UDrLC9e0iKrEJR6z6j/ | 13:45 |
sean-k-mooney | if you have that interface then we can use sysfs to lookup the vf number | 13:45 |
sean-k-mooney | which os-vif already has code for. | 13:46 |
sean-k-mooney | https://github.com/openstack/os-vif/blob/master/vif_plug_ovs/linux_net.py#L360-L378 | 13:48 |
dvo-plv | yes, I remember that I tried to use this algo, but as far as I remember it is not correct for us. for example we have this pci 000:2b:01.1/, so according to the formula the vf number should be 9, but sysfs reports it as virtfn5 | 13:49 |
sean-k-mooney | that sounds like a bug in your dpdk driver then | 13:50 |
sean-k-mooney | looking at that output it is the 6th VF so 5 is correct if 0 indexed | 13:52 |
sean-k-mooney | i think the best way to make this portable and upstreamable would be to add the socket name as a dpdk arg when adding the port and use the data from sysfs | 13:56 |
sean-k-mooney | that would require 2 changes to the dpdk driver but it would mean we would not have to calculate the socket path in a special way | 13:57 |
sean-k-mooney | so we can keep the neutron ml2 driver unmodified, at least in terms of the vhost-user path generation | 13:57 |
dvo-plv | this is not a bug, because we reserved the first 4 pci addresses for physical interfaces | 14:05 |
dvo-plv | So what about going back to a separate ml2 plugin, like Agilio? | 14:10 |
dvo-plv | sorry, but I did not understand your suggestion with the dpdk args | 14:15 |
dvo-plv | the socket is created by the dpdk pmd driver at init time | 14:15 |
dvo-plv | ...-a 0000:2b:00.0,representor=[4-6],portqueues=[4:1,5:1,6:1] -a 0000:2b:00.4 -a 0000:2b:00.5 -a 0000:2b:00.6" | 14:16 |
dvo-plv | we pass this config to the dpdk driver | 14:16 |
dvo-plv | after that it creates sockets with names stdvio + vf number | 14:16 |
sean-k-mooney | shouldn't the socket be created by qemu with dpdk as the client? | 14:18 |
sean-k-mooney | when dpdk is the server then if the vswitch is restarted it breaks network connectivity for all vms | 14:19 |
sean-k-mooney | this is why we moved to qemu server dpdk client mode many years ago | 14:19 |
dvo-plv | this is not an issue for us: after an ovs restart and socket recreation, connectivity comes back | 14:22 |
sean-k-mooney | qemu will not reconnect to the vhost-user socket if it's recreated by dpdk | 14:22 |
sean-k-mooney | the socket FD will change | 14:23 |
sean-k-mooney | there is a way to make qemu do that, i believe that was added significantly later, but we don't configure that | 14:23 |
sean-k-mooney | we do not support generating " <reconnect enabled='yes' timeout='10'/>" | 14:24 |
sean-k-mooney | https://libvirt.org/formatdomain.html#vhost-user-interface | 14:24 |
sean-k-mooney | so if qemu is running in client mode and you restart ovs to do a package upgrade it will break network connectivity for all guests until you hard reboot them | 14:26 |
sean-k-mooney | have you tested this in an openstack environment? | 14:26 |
dvo-plv | this is the qemu command: -chardev socket,id=char0,path=/usr/local/var/run/stdvio6,server so we set the server on the qemu side | 14:26 |
sean-k-mooney | ok then qemu is running in server mode which means qemu is creating the socket not dpdk | 14:27 |
sean-k-mooney | dpdk is running in client mode and connecting to the socket created by qemu | 14:27 |
sean-k-mooney | that is the way we recommend you use vhost user | 14:27 |
sean-k-mooney | what i'm suggesting is we add the path to the dpdk args when os-vif creates the interface in ovs | 14:28 |
sean-k-mooney | and the driver should use that instead of trying to calculate it | 14:28 |
dvo-plv | yes, we formulate the socket name from the pci address according to the formula, because we reserve the first 4 pci addresses for physical ports | 14:29 |
sean-k-mooney | yep but you're not actually passing the vf number to dpdk in the representor arg | 14:30 |
sean-k-mooney | what you are passing is the integer offset to add to the pf address | 14:30 |
sean-k-mooney | that is why you expect virtfn5 -> ../0000:2b:01.1/ to be 9 | 14:31 |
dvo-plv | yes, and with this offset we cannot work with sysfs | 14:32 |
dvo-plv | + socket name like stdvio + vf_num | 14:32 |
sean-k-mooney | that's not the vf_num | 14:33 |
sean-k-mooney | its the pci endpoint number | 14:33 |
sean-k-mooney | it's a different thing | 14:33 |
dvo-plv | sure | 14:33 |
sean-k-mooney | really the dpdk driver should just accept representor=vf5 + the pf address | 14:34 |
sean-k-mooney | and compute that internally | 14:34 |
dvo-plv | we can get this offset from here in some way root@server23:~# cat /sys/bus/pci/devices/0000\:2b\:00.0/sriov_offset | 14:34 |
dvo-plv | 4 | 14:34 |
sean-k-mooney | if we can read that from sysfs | 14:36 |
sean-k-mooney | then we can drop the algorithm in os-vif and use the existing function and add the offset | 14:36 |
sean-k-mooney | that still does not really help for the ml2 driver | 14:36 |
sean-k-mooney | do you have a document that describes the step by step process of adding the representor netdevs to ovs-dpdk manually? | 14:37 |
dvo-plv | one moment | 14:38 |
dvo-plv | https://docs.napatech.com/r/Getting-Started-with-Napatech-Link-VirtualizationTM-Software/Create-the-OVS-Provider-Bridge-and-Start-2-VMs | 14:39 |
sean-k-mooney | that is incomplete. it does not have the port add commands for the representor ports | 14:41 |
dvo-plv | item 4 Add the dpdkvp0 virtual port to the br-int bridge: | 14:42 |
*** JasonF is now known as JaqyF | 14:54 | |
*** JaqyF is now known as JayF | 14:54 | |
dvo-plv | maybe we can create logic like with vhostuser_socket_dir, if vhostuser_socket_name is set, get this name from config + pci from sysfs? | 15:12 |
noonedeadpunk | hey folks. I wanna double-check one thing with you, that I'm not missing anything. So for volume to be attached to a VM with scsi bus - volume *must* be created from the image? | 16:59 |
noonedeadpunk | I've found a spec that would allow a volume to define that as well, but it was abandoned | 16:59 |
sean-k-mooney | the volume, no | 17:11 |
sean-k-mooney | but the vm root disk must have hw_disk_bus=scsi | 17:12 |
sean-k-mooney | we only look at the metadata on the root disk be that a local disk or cinder volume root disk | 17:12 |
sean-k-mooney | and all other cinder volumes must use the same disk bus | 17:12 |
sean-k-mooney | noonedeadpunk: so right now we do not support attaching block devices with different buses | 17:13 |
sean-k-mooney | there is a way to occasionally make that work in a specific edge case but it's not supported upstream | 17:13 |
noonedeadpunk | aha, ok, I see then | 17:14 |
sean-k-mooney | noonedeadpunk: we discussed in the last ptg adding a way to support per volume disk buses | 17:14 |
sean-k-mooney | but no one actually worked on it | 17:14 |
noonedeadpunk | So if you find yourself in a situation where 25 volumes are not enough - you should start from scratch kinda? | 17:14 |
sean-k-mooney | well. it depends | 17:15 |
sean-k-mooney | the vm i assume is using virtio-blk now | 17:15 |
noonedeadpunk | yup | 17:15 |
sean-k-mooney | is the vm bfv or local storage? | 17:15 |
sean-k-mooney | and are you looking for an admin solution or an end user one? | 17:15 |
noonedeadpunk | I'm not sure if it's bfv or not as instance seem to be gone.... | 17:18 |
noonedeadpunk | but admin solution is fine | 17:18 |
sean-k-mooney | then you can use this https://docs.openstack.org/nova/latest/cli/nova-manage.html#image-property-set | 17:18 |
noonedeadpunk | (or I just can't find it somehow) | 17:18 |
noonedeadpunk | ugh... We're running xena (going to upgrade to 2023.1 in a month) | 17:18 |
noonedeadpunk | but that is really handy command I didn't know about | 17:19 |
noonedeadpunk | so thanks for pointing me to it! | 17:19 |
sean-k-mooney | it's really meant for helping people upgrade | 17:19 |
sean-k-mooney | but it will update the embedded image metadata | 17:19 |
sean-k-mooney | if it's boot from volume | 17:19 |
sean-k-mooney | there is another hack | 17:20 |
sean-k-mooney | tl;dr is using an old microversion of the rebuild api | 17:20 |
sean-k-mooney | allows bfv guests to just update the image metadata | 17:20 |
sean-k-mooney | i.e. on microversions where rebuild was not supported for boot from volume | 17:20 |
sean-k-mooney | if you used the same image uuid we allowed the metadata to be updated | 17:21 |
noonedeadpunk | iirc rebuild is supported quite recently? like zed or 2023.1? | 17:21 |
sean-k-mooney | this is generally not that safe as it would break the vm in some cases, and in the new microversion it will actually rebuild the volume | 17:21 |
sean-k-mooney | yes | 17:21 |
noonedeadpunk | ok, awesome, thanks a lot! | 17:22 |
sean-k-mooney | so prior to actually supporting it properly there was a poorly documented feature where rebuild to the same image "only for bfv guests" would not destroy data and just update the metadata | 17:22 |
sean-k-mooney | but honestly that should never have been a thing | 17:22 |
sean-k-mooney | we effectivly forgot it existed until we added real rebuild support | 17:23 |
noonedeadpunk | I can recall writing smth to the ML regarding that | 17:23 |
noonedeadpunk | but already forgot about this feature :D | 17:25 |
sean-k-mooney | having a normally destructive api be non-destructive only when it's BFV is too easy to forget | 17:27 |
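
The PCI address arithmetic discussed between 12:54 and 14:36 can be sketched in a few lines of Python. This is only an illustration of the numbers quoted in the log, not the code under review in os-vif: the function names are hypothetical, and the VF index calculation assumes a VF stride of 1, which the log does not state.

```python
import re

# Full PCI address: domain:bus:slot.function, all fields hexadecimal.
PCI_RE = re.compile(
    r"^(?P<domain>[0-9a-fA-F]{4}):(?P<bus>[0-9a-fA-F]{2}):"
    r"(?P<slot>[0-9a-fA-F]{2})\.(?P<func>[0-7])$"
)


def parse_pci_address(address):
    """Split a PCI address into (domain, bus, slot, function) integers."""
    match = PCI_RE.match(address)
    if match is None:
        raise ValueError("not a full PCI address: %r" % address)
    return tuple(
        int(match.group(name), 16)
        for name in ("domain", "bus", "slot", "func")
    )


def endpoint_number(address):
    """Hypothetical helper: slot * 8 + function.

    Domain and bus are ignored, which is the limitation raised at 12:58.
    """
    _, _, slot, func = parse_pci_address(address)
    return slot * 8 + func


def vf_index(pf_address, vf_address, sriov_offset, stride=1):
    """Hypothetical helper: recover the 0-based VF index.

    Assumes PF and VF share domain and bus, and that stride=1 (assumption).
    """
    delta = endpoint_number(vf_address) - endpoint_number(pf_address)
    return (delta - sriov_offset) // stride


# Two VFs on different buses collapse to the same number.
print(endpoint_number("0000:2b:01.2"))  # 8 * 1 + 2 = 10
print(endpoint_number("0000:3b:01.2"))  # also 10

# 0000:2b:01.1 is endpoint 9, but with sriov_offset = 4 it is VF index 5.
print(endpoint_number("0000:2b:01.1"))
print(vf_index("0000:2b:00.0", "0000:2b:01.1", sriov_offset=4))
```

Under those assumptions the "9" from the formula and the "virtfn5" from sysfs describe the same device: 9 is the endpoint number, and subtracting the `sriov_offset` of 4 gives the VF index 5.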
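
The sysfs lookup mentioned at 13:45 relies on the `virtfnN` symlinks that the kernel exposes under a PF's device directory. A minimal sketch of that lookup follows, assuming the PF directory is passed in explicitly; `find_vf_index` is a hypothetical name and this is not the os-vif implementation linked in the log.

```python
import os
import re


def find_vf_index(pf_sysfs_dir, vf_address):
    """Return N such that <pf_sysfs_dir>/virtfnN points at vf_address.

    Returns None when no virtfn symlink resolves to that address.
    """
    for entry in sorted(os.listdir(pf_sysfs_dir)):
        match = re.fullmatch(r"virtfn(\d+)", entry)
        if match is None:
            continue
        # Each virtfnN is a symlink such as ../0000:2b:01.1
        target = os.path.realpath(os.path.join(pf_sysfs_dir, entry))
        if os.path.basename(target) == vf_address:
            return int(match.group(1))
    return None
```

On a real host the call would look like `find_vf_index("/sys/bus/pci/devices/0000:2b:00.0", "0000:2b:01.1")`, which for the layout described in the log would be expected to give 5. The result is the VF index, so the `sriov_offset` read at 14:34 would still need to be added to get the endpoint number the driver uses.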