15:00:07 <ralonsoh> #startmeeting neutron_qos
15:00:08 <openstack> Meeting started Tue Mar 10 15:00:07 2020 UTC and is due to finish in 60 minutes.  The chair is ralonsoh. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:09 <njohnston> o/
15:00:10 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:12 <openstack> The meeting name has been set to 'neutron_qos'
15:00:15 <ralonsoh> you can stay here if you want
15:00:24 <bcafarel> :)
15:00:45 <ralonsoh> the agenda is very light today
15:01:18 <ralonsoh> so let's start
15:01:20 <ralonsoh> #topic RFEs
15:01:28 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1476527
15:01:32 <openstack> Launchpad bug 1476527 in neutron "[RFE] Add common classifier resource" [Wishlist,Triaged] - Assigned to Igor D.C. (igordcard)
15:01:46 <ralonsoh> I'll send a mail to the ML to ask for developers
15:02:16 <ralonsoh> for now, this RFE is frozen
15:02:42 <ralonsoh> next one
15:02:43 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1858610
15:02:44 <openstack> Launchpad bug 1858610 in neutron "[RFE] Qos policy not supporting sharing bandwidth between several nics of the same vm." [Undecided,New]
15:03:01 <ralonsoh> we agreed in the last drivers meeting to wait for a SPEC or POC
15:03:41 <ralonsoh> this feature is quite complex so a PoC could help to decide if the architecture change is correct
15:04:09 <ralonsoh> there are no more RFEs in the list
15:04:19 <ralonsoh> do you have something else to add here?
15:04:46 <davidsha> me?
15:04:50 <ralonsoh> hi!
15:04:53 <davidsha> Hey!
15:04:56 <ralonsoh> please, go on
15:06:03 <ralonsoh> davidsha, do you have something to add here?
15:06:14 <davidsha> I think the only way this could make sense to me is if it's applied at the instance, so it would need a lot of collaboration with Nova
15:06:48 <ralonsoh> davidsha, you mean to implement the QoS on the instance?
15:07:01 <davidsha> for this particular use case yes
15:07:13 <ralonsoh> but this is beyond the orchestrator
15:07:18 <davidsha> Otherwise you need a QoS rule that's an aggregate of specific ports
15:07:23 <ralonsoh> the orchestrator should not interfere with the instance
15:07:31 <ralonsoh> yes, that's the point
15:07:31 <davidsha> Ya
15:07:46 <ralonsoh> how could this QoS be implemented in the backend?
15:07:58 <ralonsoh> in LB they are proposing to use the IFB block
15:08:16 <ralonsoh> the same as I used (with no success) in the min BW implementation for LB
15:08:44 <ralonsoh> but as commented in the last drivers meeting, we'll wait for a spec/poc
15:09:03 <davidsha> So you need to change the typical relationship between rule -> policy -> port to rule wrapping multiple ports?
15:09:05 <davidsha> kk
15:09:17 <ralonsoh> yes, that's another topic
15:09:21 <davidsha> kk
15:09:36 <ralonsoh> more related to the server: how are you going to handle this new rule
15:09:49 <ralonsoh> just for the VM ports? for all the ports related to this rule?
15:09:57 <ralonsoh> nothing has been decided yet
15:09:58 <davidsha> ya, I'm thinking API at the moment
15:10:00 <davidsha> kk
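For context on the rule -> policy -> port discussion above, a purely hypothetical sketch (the dataclass names and fields are illustrative, not a proposed Neutron API) of why a shared per-VM bandwidth budget does not fit the current model: today a policy, with its rules, is enforced on each attached port independently, so an aggregate limit would need a rule that spans a set of ports.

```python
# Hypothetical sketch of the modelling question only, not a proposed API.
from dataclasses import dataclass, field
from typing import List


@dataclass
class BandwidthLimitRule:
    max_kbps: int
    direction: str = 'egress'


@dataclass
class QosPolicy:
    # Current model: the policy (and each of its rules) is enforced
    # separately on every port the policy is attached to.
    rules: List[BandwidthLimitRule] = field(default_factory=list)


@dataclass
class AggregateBandwidthLimitRule(BandwidthLimitRule):
    # Hypothetical aggregate variant: one shared bandwidth budget for a
    # set of ports belonging to the same instance.
    port_ids: List[str] = field(default_factory=list)
```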
15:11:05 <ralonsoh> ok, let's move to the next topic
15:11:10 <ralonsoh> #topic Bugs
15:11:15 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1866039
15:11:16 <openstack> Launchpad bug 1866039 in neutron "[OVN] QoS gives different bandwidth limit measures than ml2/ovs" [High,In progress] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)
15:11:44 <ralonsoh> I'm looking for the patch...
15:11:59 <ralonsoh> #link https://review.opendev.org/#/c/711048/
15:12:13 <ralonsoh> there is a small transient when measuring the BW
15:12:22 <ralonsoh> the BW at the beginning is higher
15:12:43 <ralonsoh> the goal is not to avoid this transient but to reduce its impact on the average BW
15:12:59 <davidsha> kk
15:12:59 <ralonsoh> the file size to transfer will be proportional to the BW speed
15:13:20 <ralonsoh> this way, the transmission time will always be the same, approx 10 secs
15:13:31 <ralonsoh> this will improve all backends
15:13:44 <ralonsoh> (will improve all backends tests)
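As a rough illustration of the sizing idea discussed above (the function and constant names below are assumptions, not the actual test code in the patch under review): the payload grows with the configured bandwidth limit, so every run lasts roughly the same ~10 seconds and the initial burst weighs less in the averaged measurement.

```python
# Illustrative sketch only: size the transferred payload from the configured
# bandwidth limit so every backend test runs for ~10 seconds.

TARGET_TRANSFER_TIME = 10  # seconds (approximate)


def payload_size_bytes(bw_limit_kbps):
    """Return a payload size proportional to the bandwidth limit.

    bw_limit_kbps: QoS bandwidth limit in kbit/s (as stored in the rule).
    """
    bytes_per_second = bw_limit_kbps * 1000 // 8
    return bytes_per_second * TARGET_TRANSFER_TIME


# e.g. a 3000 kbit/s limit -> ~3.75 MB payload, ~10 s of transfer time
print(payload_size_bytes(3000))
```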
15:13:58 <ralonsoh> please, review the patch
15:14:04 <davidsha> ack
15:14:09 <ralonsoh> next one
15:14:10 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1864630
15:14:11 <openstack> Launchpad bug 1864630 in neutron "Hard Reboot VM with multiple port lost QoS " [Undecided,In progress] - Assigned to Nguyen Thanh Cong (congnt95)
15:14:29 <ralonsoh> I've +W the patch 1 hour ago
15:14:32 <ralonsoh> #link https://review.opendev.org/#/c/709687/
15:14:48 <ralonsoh> good catch, thanks for reporting this bug
15:15:24 <ralonsoh> and now we have a series of bugs, all of them related
15:15:44 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1863987
15:15:45 <openstack> Launchpad bug 1863987 in neutron "[OVN] Remove dependency on port_object" [Medium,In progress] - Assigned to zhanghao (zhanghao2)
15:15:49 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1863852
15:15:50 <openstack> Launchpad bug 1863852 in neutron "[OVN]Could not support more than one qos rule in one policy" [Medium,In progress] - Assigned to Taoyunxiang (taoyunxiang)
15:15:55 <ralonsoh> #link https://bugs.launchpad.net/neutron/+bug/1862893
15:15:56 <openstack> Launchpad bug 1862893 in neutron "[OVN]Updating a QoS policy for a port will cause a KeyEerror" [Low,In progress] - Assigned to zhanghao (zhanghao2)
15:16:30 <ralonsoh> the problem with those patches is that they are disconnected from each other
15:16:40 <ralonsoh> we need to refactor the OVN QoS driver
15:17:07 <ralonsoh> to handle the QoS as a driver extension
15:17:23 <ralonsoh> we can't use the ML2 agent extension model because this is NOT an agent
15:17:39 <ralonsoh> I'm still working on this: https://review.opendev.org/#/c/711317/
15:18:35 <ralonsoh> this patch will move the QoS handling to this new extension
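A minimal, purely illustrative skeleton of the "driver extension" shape being discussed (class and method names here are assumptions, not the code in the patch under review): the OVN mechanism driver would delegate QoS handling to a server-side extension object that translates Neutron QoS policies into OVN Northbound entries.

```python
# Purely illustrative skeleton; names are assumptions, not the real patch.

class OVNQosDriverExtension(object):
    """Server-side QoS handling for the OVN mechanism driver.

    Unlike ML2 *agent* extensions, this runs inside neutron-server, so it
    cannot rely on per-worker in-memory state being shared across API
    workers or HA instances.
    """

    def __init__(self, nb_api):
        self._nb_api = nb_api  # OVN Northbound DB interface

    def update_policy(self, context, policy):
        """Called by the QoS plugin once the policy is updated in the DB."""
        for port_id in self._ports_using_policy(context, policy['id']):
            self._apply_rules(port_id, policy.get('rules', []))

    def _ports_using_policy(self, context, policy_id):
        # Placeholder: look up the bound ports from the Neutron DB.
        return []

    def _apply_rules(self, port_id, rules):
        # Placeholder: write the corresponding OVN NB QoS entries.
        pass
```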
15:18:49 <ralonsoh> the main problem I'm facing now is how to handle the QoS policy update
15:19:18 <ralonsoh> in the agents, the policies and the rules were stored in an in-memory mapping
15:19:33 <ralonsoh> but I'm very reluctant to do the same in the server
15:19:52 <davidsha> This is the Neutron server you're talking about?
15:19:52 <ralonsoh> in HA we can't guarantee this mapping will be shared between instances
15:19:56 <ralonsoh> yes
15:20:18 <davidsha> Neutron server has access to the DB, could you not pull from that?
15:20:21 <ralonsoh> Neutron server, sorry, as opposed to a Neutron agent (SR-IOV, LB, OVS)
15:20:26 <ralonsoh> that's the point
15:20:41 <davidsha> kk
15:20:49 <ralonsoh> we receive this call from the qos_plugin
15:20:55 <ralonsoh> self.driver_manager.call(qos_consts.UPDATE_POLICY, context, policy)
15:21:20 <ralonsoh> at this point, the policy (and the rules, this method is also called when a policy rule is updated)
15:21:30 <ralonsoh> stored in the DB is the new one
15:21:41 <ralonsoh> and we have lost any track of the previous one...
15:22:03 <davidsha> Ok
15:22:09 <ralonsoh> --> how can we "delete" the OVN QoS rows if we don't have the information about the old policy and rules?
15:22:12 <davidsha> It's the delta you need to have
15:22:17 <ralonsoh> yes
15:22:41 <ralonsoh> and I'm squeezing my brain to find a solution
15:23:11 <davidsha> Is there some kind of history for modifications to DB objects that we can reference?
15:23:16 <ralonsoh> nope
15:23:21 <davidsha> :/
15:24:23 <ralonsoh> anyway, I'll find something
15:24:51 <davidsha> Ya, OVN may need a mirror of the QoS tables.
15:25:12 <ralonsoh> I don't know if I can store this info in the OVN DB
15:25:39 <ralonsoh> no problem, I'll find something there
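One possible direction for the missing-delta problem, sketched with hypothetical helper names (none of this is the ovsdbapp or Neutron API): instead of computing a diff against the old policy, treat the new policy as the desired state and reconcile the OVN NB QoS entries for each bound port against it.

```python
# Sketch only: hypothetical helpers, not the ovsdbapp or Neutron API.

def rule_key(rule):
    # Identify an entry by direction + rule type, e.g. ('egress', 'bw_limit').
    return (rule.get('direction', 'egress'), rule.get('type'))


def reconcile_port_qos(nb_api, port_id, desired_rules):
    """Make the OVN QoS entries for `port_id` match `desired_rules`.

    nb_api: an object exposing list/add/delete operations on the OVN NB
            QoS table (names here are assumptions).
    desired_rules: list of dicts built from the *new* Neutron policy.
    """
    current = {rule_key(r): r for r in nb_api.list_qos_rows(port_id)}
    desired = {rule_key(r): r for r in desired_rules}

    # Delete entries that the new policy no longer defines.
    for key in set(current) - set(desired):
        nb_api.delete_qos_row(current[key])

    # Add entries for the new or changed rules.
    for key in set(desired) - set(current):
        nb_api.add_qos_row(port_id, desired[key])
```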
15:25:51 <ralonsoh> we still have another bug in the list
15:25:58 <ralonsoh> #link https://bugs.launchpad.net/networking-sfc/+bug/1853171
15:25:59 <openstack> Launchpad bug 1853171 in neutron "Deprecate and remove any "ofctl" code in Neutron and related projects " [Medium,In progress]
15:26:06 <ralonsoh> davidsha, do you have any update?
15:26:40 <davidsha> Yes, I've had a go at the refactor, but I've run out of time: https://review.opendev.org/#/c/711949/
15:27:07 <ralonsoh> no problem
15:27:20 <ralonsoh> once you stop working on this one, ping me
15:27:29 <davidsha> There seems to be some issue with ct_state and ct_mark; I'm not sure if I'm using them wrong, but the documentation implies it should work.
15:27:57 <ralonsoh> I'll try to reproduce it locally
15:28:44 <davidsha> The dynamic flow creation code in rules.py is only half migrated as well.
15:29:24 <ralonsoh> maybe this is going to be more complex than expected...
15:29:30 <davidsha> NXActions should be able to do conjunctions from what I read, I just haven't gotten far enough to really know.
15:30:08 <davidsha> It shouldn't be too bad from here; my only concern is the ct_mark and ct_state issue, there could be a bug in Ryu and os-ken
15:30:42 <davidsha> It expects to receive ints for those values, but the documentation says strings
15:31:48 <ralonsoh> do you know what part of the osken code is handling this?
15:32:54 <davidsha> I was looking at the code, I think it was in the 1.3 parser. The ct_state and ct_mark are part of nx_actions, I believe?
15:33:40 <davidsha> https://github.com/openstack/os-ken/blob/9f1f1726d0b86a43df61bc22f6a8dec0f5c5b918/os_ken/ofproto/nicira_ext.py#L585
15:33:49 <davidsha> Ya it's listed as an int here
15:34:53 <ralonsoh> but you said before strings
15:35:03 <davidsha> https://github.com/openstack/os-ken/blob/9f1f1726d0b86a43df61bc22f6a8dec0f5c5b918/os_ken/tests/packet_data_generator3/gen.py#L111
15:35:04 <davidsha> Ya
15:35:25 <ralonsoh> CT_MARK_* are string
15:35:27 <ralonsoh> strings
15:35:42 <ralonsoh> hmmm maybe there you have your problem!
15:35:43 <davidsha> Ya
15:35:50 <davidsha> :P
15:36:07 <davidsha> So the constants need to be changed from strings to ints
15:36:19 <davidsha> to the corresponding ints*
15:36:29 <ralonsoh> maybe heheheh
15:36:49 <davidsha> I had to make a similar change for the ethertypes
15:37:14 <davidsha> They were being passed in as strings but needed to be ints @.@
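A hedged illustration of the ints-vs-strings point (the constant names below are local to the example, and the bit values are the standard OVS ct_state bits rather than anything taken from os-ken): os-ken's OFPMatch takes ct_state as an integer bitmask, optionally with a mask, not the "+trk+est" style strings used by ovs-ofctl.

```python
# Standard OVS ct_state bits (not constants from os-ken itself).
from os_ken.ofproto import ofproto_v1_3_parser as parser

CT_NEW = 0x01
CT_EST = 0x02
CT_TRK = 0x20

# Match packets that are tracked and established ("+trk+est" in ofctl syntax),
# passing (value, mask) as integers.
match = parser.OFPMatch(ct_state=(CT_TRK | CT_EST, CT_TRK | CT_EST))
print(match)

# Passing the ofctl-style string instead would not serialize correctly:
# parser.OFPMatch(ct_state='+trk+est')  # expects int flags, not a string
```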
15:38:15 <ralonsoh> ping me once you have next version
15:39:01 <davidsha> That's my last version I'm afraid :/
15:39:16 <ralonsoh> ahhhh ok
15:39:27 <ralonsoh> I understand
15:39:29 <ralonsoh> sorry...
15:39:46 <davidsha> np, sorry I didn't get it all the way :P
15:40:40 <ralonsoh> ok, thanks a lot davidsha
15:41:05 <ralonsoh> do we have something else in the bug section?
15:41:22 <ralonsoh> #topic Open Discussion
15:41:30 <ralonsoh> I have nothing in the agenda
15:41:44 <ralonsoh> something to add here?
15:42:08 <ralonsoh> thank you all and see you in two weeks
15:42:10 <ralonsoh> #endmeeting