*** iurygregory is now known as iurygregory|holiday | 02:26 |
gibi | sean-k-mooney: +2 | 07:49 |
dvo-plv_ | sean-k-mooney: Hello | 08:41 |
dvo-plv_ | I would like to clarify our blueprint status. https://review.opendev.org/c/openstack/nova-specs/+/868377 | 08:42 |
dvo-plv_ | Maybe the community needs something from us to start further activities | 08:44 |
sean-k-mooney | no, just ping me and others to review. my time is currently quite limited for upstream work, so my review bandwidth has been reduced a lot due to downstream work. | 09:48 |
sean-k-mooney | i think i was more or less ok with it the last time i looked, so i'll take a look again | 09:49 |
sean-k-mooney | dvo-plv_: i'm happy with the spec as is. others might ask for more info but i think you have the main points. | 09:53 |
sean-k-mooney | i changed the topic in gerrit to bp/virtio-packedring-configuration-support | 09:54 |
sean-k-mooney | can you use the same topic for the code | 09:54 |
dvo-plv_ | okay, I will update topic for code. Currently it has https://review.opendev.org/q/topic:VirtIO_PackedRing | 10:19 |
dvo-plv_ | What does this topic mean? | 10:19 |
sean-k-mooney | we can use that on the spec | 10:29 |
sean-k-mooney | the topic is just a way to group related patches together | 10:29 |
sean-k-mooney | so the code and spec should use the same one | 10:29 |
sean-k-mooney | by convention we use bp/<blueprint name> for features that are tracked by a blueprint or spec | 10:30 |
sean-k-mooney | and bug/<bug number> for bugs | 10:30 |
sean-k-mooney | otherwise it's freeform | 10:30 |
sean-k-mooney | for features that require changes in multiple projects we try to use the same topic across all of them, to be able to see all related patches quickly | 10:31 |
sean-k-mooney | so now that you updated it https://review.opendev.org/q/topic:bp%252Fvirtio-packedring-configuration-support | 10:31 |
sean-k-mooney | we can see the spec, nova and os-traits patches all in one view | 10:32 |
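For reference, one way to set the topic is at push time via git-review's `-t` flag, e.g. `git review -t bp/virtio-packedring-configuration-support`; the topic can also be edited on the change page in the Gerrit web UI.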
sean-k-mooney | that tells me you are missing a patch to glance to update the metadefs https://github.com/openstack/glance/blob/master/etc/metadefs/compute-libvirt.json | 10:33 |
sean-k-mooney | dvo-plv_: ^ the metadefs are what horizon and heat use to generate the drop down for adding traits automatically | 10:33 |
sean-k-mooney | sorry, not traits: extra_specs and image properties | 10:34 |
dvo-plv_ | okay, I will investigate and fix | 10:35 |
sean-k-mooney | you basically just need to copy this https://github.com/openstack/glance/blob/master/etc/metadefs/compute-libvirt.json#L30-L35 | 10:36 |
sean-k-mooney | updating mem_encryption to packed_ring | 10:36 |
sean-k-mooney | the namespace/prefix is automatically handled by https://github.com/openstack/glance/blob/master/etc/metadefs/compute-libvirt.json#L7-L16 | 10:37 |
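A sketch of what the copied entry might look like, shown as a Python dict mirroring the JSON structure (the property name and description wording are guesses; the `hw_`/`hw:` prefixes come from the namespace's resource type associations, so only the bare name is listed):

```python
# Hypothetical addition to etc/metadefs/compute-libvirt.json, modelled on the
# existing mem_encryption entry. Only the unprefixed property name appears
# here; glance's resource type associations add the hw_ (image property) and
# hw: (flavor extra_spec) prefixes automatically.
packed_ring_property = {
    "virtio_packed_ring": {                      # property name is a guess
        "title": "Virtio Packed Ring",
        "description": "If true, virtio devices will use the packed "
                       "ring format.",           # illustrative wording
        "type": "boolean",
    },
}
```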
dvo-plv_ | okay, thanks. I would like to check it on the horizon ui too | 10:37 |
sean-k-mooney | it's also good to update the useful image properties doc; https://github.com/openstack/glance/commit/3a281b9bc62a1b8b0f1468bc641105a5662f8ecd is an example | 10:37 |
dvo-plv_ | sure | 10:38 |
dvo-plv_ | Could you also clarify your schedule for upstream work? We would like to start work on multiqueue support for hardware offloading | 10:38 |
sean-k-mooney | others can review your code, not just me. but currently i have around 20% of my time for upstream work, less for the last 3-4 weeks, because i was busy with downstream planning activities due to the ptg and other issues | 10:39 |
sean-k-mooney | i have been trying to capture all the outcomes of the ptg in our downstream jira and planning the work for our team there | 10:40 |
sean-k-mooney | normally i have more time for upstream review, so that should go back to normal soon | 10:41 |
sean-k-mooney | regarding multi queue | 10:41 |
sean-k-mooney | do you mean https://review.opendev.org/c/openstack/nova-specs/+/855514 | 10:42 |
dvo-plv_ | Whom should I ping regarding the review process? I saw gibi on some reviews | 10:42 |
dvo-plv_ | Yes, we would like to analyze it better and provide some vision of how we can improve this in OpenStack for hardware offloading, especially for our nic | 10:43 |
sean-k-mooney | so multi queue is only partly implemented for hardware offload | 10:43 |
sean-k-mooney | i don't think it actually works; we cannot enable it for nics that use sriov | 10:44 |
sean-k-mooney | i.e. nics that present the vf directly to the guest | 10:44 |
sean-k-mooney | but it might be possible for macvtap or vdpa | 10:44 |
sean-k-mooney | dvo-plv_: in terms of pings, ping the active members of the nova core team | 10:45 |
sean-k-mooney | dvo-plv_: so gibi, bauzas, melwitt, gmann and dansmith are your best bet in addition to me. stephenfin is also around sometimes but they mainly work on non-nova related things day to day | 10:46 |
dvo-plv_ | okay, thanks | 10:47 |
sean-k-mooney | so looping back to multi queue | 10:49 |
sean-k-mooney | implementing this in the way you wanted is not going to be easy, or really desirable, from a nova point of view | 10:50 |
sean-k-mooney | there are a few parts to this problem | 10:51 |
sean-k-mooney | first we need to detect and record the number of queues available in the vf | 10:51 |
sean-k-mooney | second we need to be able to schedule based on that (either updating the pci filter or recording this in placement) | 10:52 |
sean-k-mooney | third we need a way to request a device with a min number of queues / multiqueue outside the flavor/image (likely on the neutron port) | 10:53 |
sean-k-mooney | finally we need to take the resource request from the port and include that in our scheduling request, and once we find a host/device that meets that need we need to ensure that the qemu device is configured correctly | 10:54 |
dvo-plv_ | regarding the first question, we investigated it and the best option in our opinion is to parse the other_config. we configure queues like this: -a 0000:65:00.0,representor=[4-6],portqueues=[4:2,5:2,6:2]. Yes, it will work only for our nic | 10:54 |
sean-k-mooney | which config | 10:54 |
sean-k-mooney | the pci config space | 10:55 |
dvo-plv_ | ovs. other_config | 10:55 |
sean-k-mooney | you can usually get this from sysfs i thought | 10:55 |
sean-k-mooney | we can't do that | 10:55 |
sean-k-mooney | the ovs port wont exist at that point | 10:55 |
dvo-plv_ | no we can not, this is a non-trivial task for our dpdk driver | 10:55 |
sean-k-mooney | oh right, so honestly you can't start this work | 10:56 |
sean-k-mooney | until the basic work of supporting the dpdk representors is done | 10:56 |
dvo-plv_ | the ovs port no, but the vf yes. We would like to parse this config and fill it into the device_spec | 10:56 |
sean-k-mooney | the ovs port will be created and added by os-vif | 10:56 |
sean-k-mooney | only after we have selected a vf | 10:57 |
sean-k-mooney | so i think we need https://review.opendev.org/c/openstack/nova-specs/+/859290 to be done before we can talk about multiqueue | 10:57 |
dvo-plv_ | I see, we thought we could start finding solutions for all the comments on the blueprint in parallel in the meantime | 10:59 |
sean-k-mooney | well we could, but it's going to be difficult to complete even just one of the 3-4 specs you have proposed this cycle | 10:59 |
sean-k-mooney | maybe 2 | 10:59 |
sean-k-mooney | it's very unlikely that all of them will land | 11:00 |
dvo-plv_ | You mentioned that multiqueue functionality may be possible for vdpa. So maybe it would be better to move from virtio-forwarder to the vdpa vnic type for future purposes | 11:00 |
sean-k-mooney | well there is work in dpdk to support vdpa | 11:01 |
sean-k-mooney | i was expecting to have a vdpa-user type at some point for that | 11:01 |
sean-k-mooney | https://doc.dpdk.org/guides/vdpadevs/features_overview.html | 11:02 |
sean-k-mooney | i have not looked into it much | 11:02 |
sean-k-mooney | i don't think that is supported by ovs-dpdk currently, but i have not really been following it closely | 11:03 |
sean-k-mooney | dvo-plv_: so for the basic enablement we are going to be tracking napatech VFs which we will add to ovs as dpdk ports, correct? | 11:04 |
sean-k-mooney | and then those will be exposed to the guest as vhost-user ports | 11:04 |
sean-k-mooney | so for multi queue we would need to read the number of queues on the vf ideally | 11:05 |
dvo-plv_ | This multiqueue functionality with the queue mq and vector options is in our ovs fork at the moment | 11:05 |
sean-k-mooney | because we need that info for scheduling | 11:05 |
sean-k-mooney | ok, so that's kind of a problem | 11:05 |
sean-k-mooney | we do not really allow enablement of forked functionality in nova | 11:05 |
dvo-plv_ | Yes, I remember that it requires placement to handle scheduling by queue number | 11:06 |
sean-k-mooney | if we can do it generically we enable it, so i was hoping we could do something like read /sys/bus/pci/devices/<address>/num_queues or something like that | 11:06 |
sean-k-mooney | ideally via the libvirt nodedev api | 11:07 |
sean-k-mooney | not reading sysfs directly | 11:07 |
sean-k-mooney | so you can get the queue like this https://paste.opendev.org/show/bVSM5IDtJTRwhcIcuTFs/ | 11:09 |
sean-k-mooney | that is a pf | 11:09 |
sean-k-mooney | but i believe the same is true for VFs | 11:09 |
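As an illustration of the netdev-based approach (a sketch only; as noted just below, it cannot work once a VF is bound to vfio-pci, since the device then has no netdev), the queue count can be read by listing the queue directories sysfs exposes for an interface:

```python
# Count the rx/tx queue directories a netdev exposes under sysfs. This only
# works while the device is bound to a kernel network driver, i.e. while it
# actually has a netdev.
import os

def netdev_queue_counts(ifname: str) -> dict[str, int]:
    entries = os.listdir(f"/sys/class/net/{ifname}/queues")
    return {
        "rx": sum(1 for e in entries if e.startswith("rx-")),
        "tx": sum(1 for e in entries if e.startswith("tx-")),
    }

print(netdev_queue_counts("eth0"))  # e.g. {'rx': 4, 'tx': 4}
```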
sean-k-mooney | i don't see the queues in libvirt https://paste.opendev.org/show/bJHNfZR8JNpxJVvLggCI/ | 11:11 |
sean-k-mooney | so the first step would really be to add the ability to get the queues to libvirt itself | 11:12 |
sean-k-mooney | dvo-plv_: do you need the VFs to be bound to vfio-pci | 11:14 |
sean-k-mooney | i assume so, so i guess this info will not be available via the vf since it won't have a netdev | 11:15 |
dvo-plv_ | we probe the vfio-pci driver: modprobe vfio-pci enable_sriov=1 | 11:16 |
dvo-plv_ | and then allocate vfs: echo "$NUMVFS" > /sys/bus/pci/devices/0000:$BUS:00.0/sriov_numvfs | 11:16 |
sean-k-mooney | yep, that's pretty standard for dpdk, although the enable_sriov bit is relatively recent | 11:17 |
dvo-plv_ | we do not have netdev devices for that, so it is hard to get the queues at the linux layer | 11:17 |
sean-k-mooney | ya | 11:17 |
sean-k-mooney | so the problem is the device spec is not intended for configuration | 11:17 |
sean-k-mooney | it was originally just for filtering | 11:17 |
sean-k-mooney | we have since added some metadata to it | 11:18 |
sean-k-mooney | i'm not sure how people would feel about adding the number of queues | 11:18 |
dvo-plv_ | yes, so this is why we initially decided that the user could fill the device_spec with an additional parameter for filtering | 11:18 |
dvo-plv_ | queue_number | 11:18 |
sean-k-mooney | dvo-plv_: right, so that approach has been rejected in the past | 11:19 |
dvo-plv_ | yes | 11:19 |
sean-k-mooney | i mean before you proposed it | 11:19 |
sean-k-mooney | there have been attempts to do this in the past and it was rejected | 11:19 |
sean-k-mooney | that said we now have enough things like this that we might be ok with it | 11:20 |
sean-k-mooney | we now have things like remote_managed and resource_class | 11:20 |
dvo-plv_ | I see. now you would like the queue parameter to be fetched automatically, like metadata, and placement to filter nodes according to the required queues. But you would not like to add more resource providers | 11:20 |
sean-k-mooney | so adding queue_pairs=<count> might be ok | 11:20 |
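What such an entry could look like (purely hypothetical: `queue_pairs` is not an existing device_spec field today, the other keys are standard, and the IDs are placeholders):

```python
# Hypothetical [pci]device_spec value for nova.conf, shown as a dict.
# vendor_id, product_id and physical_network are real device_spec keys;
# queue_pairs is the proposed addition and does not exist today.
device_spec = {
    "vendor_id": "<vid>",            # placeholder
    "product_id": "<pid>",           # placeholder
    "physical_network": "physnet1",
    "queue_pairs": "4",              # proposed new key
}
```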
sean-k-mooney | hmm, no, we can model this in placement, but it would require us to have one RP per vf | 11:21 |
sean-k-mooney | which is not something we wanted to do if we could avoid it | 11:22 |
sean-k-mooney | we would need to address the placement scaling bug first | 11:22 |
sean-k-mooney | dvo-plv_: https://review.opendev.org/c/openstack/nova/+/855885 | 11:23 |
sean-k-mooney | so we could not track this in placement initially | 11:23 |
sean-k-mooney | we would have to track this in nova and use the pci filter to filter based on the queues | 11:23 |
sean-k-mooney | eventually it could be done in placement, but we also need to start tracking neutron-consumable pci devices in placement before that | 11:24 |
sean-k-mooney | the only workable solution i see in the next 6-12 months is to do this in nova | 11:25 |
sean-k-mooney | if we require all VFs in the same pool to have the same queue count | 11:26 |
sean-k-mooney | then we can add the queue_pair count to the extra_info on the pci_device in the nova db | 11:26 |
sean-k-mooney | and the pci_passthrough filter can use that | 11:26 |
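A rough sketch of what that filter check might look like (illustrative only, not nova's actual code; the extra_info layout follows the network_caps.queue_pairs naming suggested a little later in this discussion):

```python
# Illustrative only: how the PCI passthrough filter could compare a requested
# queue count against queue info recorded on the device. pci_device.extra_info
# is a real nova field; the network_caps/queue_pairs keys are the proposed
# (hypothetical) layout.
def vf_satisfies_queue_request(pci_device, requested_queue_pairs: int) -> bool:
    caps = pci_device.extra_info.get("network_caps", {})
    return int(caps.get("queue_pairs", 0)) >= requested_queue_pairs
```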
dvo-plv_ | let's assume that we have already dealt with getting the queues automatically. let's use traits. if this node has vfs with 2 and 3 queues, placement will add traits 2_queues and 3_queues so the scheduler can filter nodes by queue, and then, when a node is chosen, nova can choose an appropriate vf by queue number | 11:28 |
sean-k-mooney | no | 11:28 |
sean-k-mooney | this is not a correct use of traits | 11:28 |
sean-k-mooney | traits cannot be used for quantitative aspects of a resource, i.e. the number of queues or the frequency of a cpu | 11:29 |
sean-k-mooney | HW_NIC_MULTIQUEUE is an acceptable trait, which we already have https://github.com/openstack/os-traits/blob/master/os_traits/hw/nic/__init__.py#L18 | 11:30 |
sean-k-mooney | but 2_queues is not | 11:30 |
sean-k-mooney | quantitative aspects must be tracked as inventories on a resource provider | 11:31 |
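The distinction, sketched (HW_NIC_MULTIQUEUE is the real os-traits trait linked above; the resource class name is invented for illustration):

```python
# Sketch of how placement separates the two: a yes/no capability is a trait
# on the resource provider, while a countable quantity must be an inventory.
# CUSTOM_VF_QUEUE_PAIRS is an invented resource class name.
resource_provider = {
    "traits": ["HW_NIC_MULTIQUEUE"],                         # qualitative
    "inventories": {"CUSTOM_VF_QUEUE_PAIRS": {"total": 8}},  # quantitative
}
```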
sean-k-mooney | dvo-plv_: we cannot currently track neutron-consumable pci devices in placement, by the way | 11:32 |
sean-k-mooney | vmaccel has said they want to work on that this cycle | 11:32 |
sean-k-mooney | so for bobcat it would be risky to assume that work would be complete in time for multi queue to be implemented | 11:33 |
dvo-plv_ | will it be part of this spec? https://specs.openstack.org/openstack/nova-specs/specs/2023.1/implemented/pci-device-tracking-in-placement.html | 11:33 |
sean-k-mooney | dvo-plv_: no, that spec explicitly does not support any pci device that can be used via neutron | 11:33 |
dvo-plv_ | so, the main problem is that we can not filter a specific node from the pool by the required vf queue number, right? | 11:37 |
sean-k-mooney | we can solve that today with the pci_passthrough filter | 11:37 |
sean-k-mooney | as i said there are 3-4 pieces that need to be done | 11:38 |
sean-k-mooney | 1. record the number of queues (add queue_pairs to the devspec and store it in pci_device.extra_info.network_caps.queue_pairs) | 11:39 |
sean-k-mooney | 2. add a new extension to neutron to request queue pairs | 11:39 |
sean-k-mooney | 3. modify the pci passthrough filter to use the neutron request to find a vf that fulfils the need | 11:40 |
dvo-plv_ | i will be positive and believe that we can solve all of it) | 11:40 |
sean-k-mooney | 4. update the libvirt generation to use queues=min(vf.queues, flavor.vcpus) | 11:40 |
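Step 4 in miniature (a sketch, assuming the VF's queue-pair count has been recorded as in step 1; `<driver queues='N'/>` is how libvirt already expresses multiqueue on an interface, and capping at the vCPU count mirrors nova's existing hw_vif_multiqueue behaviour):

```python
# Clamp the guest queue count to what the VF exposes and to the flavor's
# vCPU count, then emit the libvirt <interface> sub-element.
def interface_driver_xml(vf_queue_pairs: int, flavor_vcpus: int) -> str:
    queues = min(vf_queue_pairs, flavor_vcpus)
    return f"<driver queues='{queues}'/>"

print(interface_driver_xml(vf_queue_pairs=4, flavor_vcpus=8))  # <driver queues='4'/>
```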
sean-k-mooney | we can, but you need to file a spec for the neutron api extension and implement that before we can do the nova part | 11:41 |
dvo-plv_ | I have a question regarding neutron extension | 11:42 |
sean-k-mooney | sure | 11:42 |
sean-k-mooney | we might be able to use the neutron port tag extension, by the way | 11:43 |
sean-k-mooney | but we need a way to say either that this port needs multi queue, or that it needs multi queue with at least X queues | 11:43 |
sean-k-mooney | we could initially skip that i guess | 11:45 |
sean-k-mooney | and just use hw_vif_multiqueue | 11:45 |
sean-k-mooney | but i fear that would break things, so i think that would be a bad idea | 11:45 |
dvo-plv_ | maybe it would be more logical to extend the neutron port with some additional parameter: openstack port create --network net10 < --queues=2 or --binding-profile queues=2 > ... port10 | 11:45 |
sean-k-mooney | no, the binding profile is not a user-facing field | 11:45 |
sean-k-mooney | it is for nova to pass info to the network backend | 11:46 |
sean-k-mooney | it should never be written by a human or by neutron | 11:46 |
dvo-plv_ | using the flavor or image is not a solution, because we could have a vm with 2 ports and different queue numbers. or we leave it as a limitation | 11:46 |
sean-k-mooney | ya, so this is why i think we need a per-port solution, which means a neutron api extension | 11:47 |
sean-k-mooney | or reusing an existing one | 11:47 |
sean-k-mooney | --binding-profile can't be used, as making that writable is a security risk | 11:47 |
dvo-plv_ | sorry, I'm not familiar with it. Do you mean this: https://docs.openstack.org/neutron/latest/contributor/internals/api_extensions.html | 11:48 |
sean-k-mooney | kind of, so https://github.com/openstack/neutron-lib/tree/master/neutron_lib/api/definitions are all the api extensions that neutron supports | 11:49 |
sean-k-mooney | the binding profile, for example, is part of the portbindings api extension https://github.com/openstack/neutron-lib/blob/master/neutron_lib/api/definitions/portbindings.py#L31-L34 | 11:50 |
dvo-plv_ | okay, I will investigate it. regarding this extension, I should create an rfe and talk with ralonsoh about that, right? | 11:50 |
sean-k-mooney | yes | 11:51 |
sean-k-mooney | i see two possible approaches | 11:51 |
sean-k-mooney | either model this as part of the QoS extensions | 11:51 |
sean-k-mooney | or as a separate multiqueue extension | 11:51 |
sean-k-mooney | if we add a new one | 11:51 |
dvo-plv_ | Thank you, I have a lot of work now | 11:55 |
sean-k-mooney | one existing api we might be able to use is https://docs.openstack.org/neutron/latest/contributor/internals/tag.html | 11:55 |
sean-k-mooney | we could use that for hw_vif_multiqueue=true|false, for example, on a per-port basis | 11:56 |
sean-k-mooney | the issue is it's currently a string field and we would really prefer it to be a key-value field | 11:56 |
sean-k-mooney | well, a dict of key values | 11:57 |
sean-k-mooney | we have 3 or 4 use cases that would benefit from a v2 of this feature that was key-value | 11:57 |
sean-k-mooney | we spoke to ralonsoh about that during the ptg | 11:58 |
sean-k-mooney | so that might just be the best thing to do | 11:58 |
sean-k-mooney | then you could do {nova_min_queues: 2, nova_multiqueue: true} on a per-port basis | 11:59 |
sean-k-mooney | we could also use it for nova_delete_on_detach: true|false | 12:00 |
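Since today's tag extension only stores plain strings, one interim convention (a sketch; the nova_* names are the hypothetical keys from the messages above, not real nova or neutron keys) would be "key=value" tag strings parsed back into a dict:

```python
# Emulate key-value port tags on top of the string-only tag extension by
# packing "key=value" into each tag string and parsing them back out.
def parse_kv_tags(tags: list[str]) -> dict[str, str]:
    return dict(tag.split("=", 1) for tag in tags if "=" in tag)

tags = ["nova_min_queues=2", "nova_multiqueue=true"]
assert parse_kv_tags(tags) == {"nova_min_queues": "2", "nova_multiqueue": "true"}
```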
dvo-plv_ | but this parameter is related to the flavor and image | 12:01 |
ralonsoh | sorry, I'm having lunch now | 12:09 |
ralonsoh | I'll read this channel later | 12:09 |
opendevreview | Artom Lifshitz proposed openstack/nova master: Fix pep8 errors with new hacking https://review.opendev.org/c/openstack/nova/+/874517 | 13:56 |
opendevreview | Artom Lifshitz proposed openstack/nova master: Fix pep8 errors with new hacking https://review.opendev.org/c/openstack/nova/+/874517 | 14:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: docs: Correct a typo, grammar in AZ doc https://review.opendev.org/c/openstack/nova/+/881235 | 15:20 |