14:01:51 #startmeeting neutron_qos
14:01:52 Meeting started Wed Oct 5 14:01:51 2016 UTC and is due to finish in 60 minutes. The chair is ajo_. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:54 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:56 The meeting name has been set to 'neutron_qos'
14:02:06 let's leave some time for people to join
14:02:24 Hi! I am lurking today.
14:02:36 #chair njohnston
14:02:37 Current chairs: ajo_ njohnston
14:02:43 ack njohnston :)
14:03:19 I see a few missing people, please say hi, whoever is around for this meeting :-)
14:03:28 hi
14:03:33 hi rnoriega ;)
14:03:45 hi
14:03:50 hi ltomasbo !! :)
14:04:52 I heard that some people from Ericsson R&D would be joining because they're interested in contributing ;)
14:05:59 I miss slaweq, ralonso, and others hmmm
14:06:20 I wonder if it's worth having the meeting or if an email status update will be enough
14:06:36 rnoriega, ltomasbo thoughts? shall we proceed, email, or wait?
14:07:00 * ajo_ got everyone bored of QoS :P
14:07:07 * njohnston is still here!
14:07:17 hi njohnston :)
14:07:27 this was more or less the agenda I had ready:
14:07:28 #link https://etherpad.openstack.org/p/qos-meeting
14:09:52 ok, 10 minutes is enough, I'm going to summarize the status,
14:09:57 and wrap this up quickly
14:10:01 ajo_, good!
14:10:09 so
14:10:11 hey, sorry for being late!
14:10:15 #topic RFEs-approved
14:10:26 hi davidsha :)
14:10:45 #link https://bugs.launchpad.net/neutron/+bugs?field.tag=qos+rfe-approved+&field.tags_combinator=ALL
14:10:50 one is missing on that link, let me fix it
14:11:20 so we have three RFEs that have been approved, right?
14:12:33 ajo_, it's fixed... thanks
14:12:44 Yeah, I've toggled strict min bw to rfe-approved (from rfe-postponed)
14:13:03 but I'm unsure I have the right to do that; I have commented on it, and said that we should start with a spec there.
14:13:29 #link https://bugs.launchpad.net/neutron/+bug/1586056 (extended validation)
14:13:31 Launchpad bug 1586056 in neutron "[RFE] Improved validation mechanism for QoS rules with port types" [Wishlist,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:13:40 that one is about fixing some technical debt we have,
14:13:56 to make sure any changes to policies are validated with the plugin or mechanism drivers,
14:14:10 or changes to ports or networks (attached policy changes)
14:14:31 since dataplane capabilities for QoS are quite heterogeneous
14:14:46 that depends on more tech debt
14:15:04 #link https://review.openstack.org/#/c/351858/ (qos notification-driver to "driver")
14:15:12 ok
14:15:31 Some refactoring of the current "qos driver" to make it more consistent and able to do the improved validation properly
14:16:13 #link https://bugs.launchpad.net/neutron/+bug/1560961 (instance ingress bw limit)
14:16:15 Launchpad bug 1560961 in neutron "[RFE] Allow instance-ingress bandwidth limiting" [Wishlist,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:16:25 this one depends on the two above ;)
14:16:30 this is like a train, sorry ;)
14:17:17 it does not physically depend on the others, but it's a constraint neutron-drivers has imposed on it, to make sure we fix the technical debt first,
14:17:21 which makes sense IMHO
14:17:52 #link https://bugs.launchpad.net/neutron/+bug/1560963 (minimum bw egress -non-strict-)
14:17:54 Launchpad bug 1560963 in neutron "[RFE] Minimum bandwidth support (egress)" [Wishlist,Fix released] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
14:18:09 that one is being worked on by Rodolfo Alonso
14:18:23 partly merged for newton (SR-IOV support)
14:18:42 and OVS & LB efforts are being developed to be merged in Ocata
14:18:47 let me look for the links,
14:18:52 I know ltomasbo is interested in this :)
14:18:58 ajo_, this last one is not strict bw support
14:19:01 yep, I was about to ask..
14:19:06 right?
14:19:11 rnoriega, non-strict, right
14:19:34 ok
14:20:04 non-strict in the sense that it is not "nova-aware"?
14:20:18 correct
14:20:22 not scheduling-aware
14:20:34 ok
14:20:36 #link https://review.openstack.org/318531 QoS minimum egress for ovs-agent
14:20:54 #link https://review.openstack.org/357865 QoS minimum egress for LinuxBridge
14:21:17 buy the great Rodolfo Alonso Hernandez!, we should get him on some hall of fame (/me winks at njohnston )
14:21:37 buy -> by :)
14:21:55 ajo_, lol! I wanted to buy a Rodolfo for me as well
14:22:06 he's awesome
14:22:16 :-)
14:22:25 and now, the hot topic
14:22:29 +1
14:22:49 #link https://bugs.launchpad.net/neutron/+bug/1578989 minimum bandwidth egress strict (scheduling aware)
14:22:50 Launchpad bug 1578989 in neutron "[RFE] Strict minimum bandwidth support (egress)" [Wishlist,Confirmed]
14:22:56 this feature builds on the other
14:23:22 and it's necessary to make sure no hypervisor is scheduled more "minimum bandwidth" than its NICs can handle in total
14:23:48 so, it will be enforced at host level, right?
14:24:11 so, effectively enforcing SUM(port[i].min_bw) for port in hypervisor <= hypervisor.bw
14:24:13 more or less
14:24:18 if the bottleneck is at another point in the network, it will not be enforced, right
14:24:34 that's a simplified equation, as we need to consider the network over which the packets travel, and the direction too
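For reference: spelled out with the physnet and direction that ajo_ mentions, the simplified constraint above could be written as follows. This is an illustrative reconstruction, not a formula from any approved spec:

    \sum_{p \,\in\, \mathrm{ports}(h,\, n,\, d)} \mathrm{min\_bw}(p) \;\le\; \mathrm{capacity}(h, n, d)
    \qquad \forall\ \text{hypervisor } h,\ \text{physnet } n,\ \text{direction } d \in \{\mathrm{ingress}, \mathrm{egress}\}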
14:24:49 ajo_, sorry for my ignorant question. But is there anything that could be done in parallel to accelerate the development of the strict min bw support?
14:24:58 like writing a spec
14:25:12 helping with some patches on the previous work
14:25:14 rnoriega, yes, let me get into that :)
14:25:21 ajo_, ok, thanks!
14:25:27 I will now summarize the steps to get there
14:25:33 ltomasbo, you're right
14:25:53 ltomasbo, we have no capability now to see what the physical network architecture looks like
14:26:03 I wonder if we could model that in a later step
14:26:13 and let the scheduler consume from there
14:26:21 for example, we have the switches' dataplane capacity, etc...
14:26:25 interconnections between switches
14:26:31 that's a harder stone to chew
14:26:47 ajo_, any plans to tackle that? Perhaps in a VNF chain (as a first step)
14:26:54 IP connectivity & routes for tunnels
14:27:22 ltomasbo, thinking of that problem literally causes me heartburn ;D
14:27:28 :D
14:27:44 ltomasbo, I'd be very happy if anybody wants to look at it
14:27:54 but we should build the foundations first :)
14:28:09 could be related to the EU project, but it will depend on the previous patches too
14:28:47 ltomasbo, is on a fancy NFV related project ;)
14:28:53 sorry, no comma :)
14:29:01 so
14:29:24 steps to get this RFE done (eventually)
14:29:44 I'd say it's an Ocata & beyond effort, especially since the Ocata cycle is shorter
14:29:55 and since we have dependencies on some extra work from nova
14:30:02 so
14:30:07 0) Writing a neutron spec including all the next steps in detail
14:30:39 I plan on tackling that, but I'm super happy if anybody wants to step in to write the barebones, and have me as Co-Author
14:31:01 since I must first kill the technical debt we have or drivers will kill me :)
14:31:35 such a spec could contain some of the next steps:
14:31:39 1) Neutron collecting physnet/tunneling available bandwidth on every compute, with the option to override via config.
14:31:52 2) Neutron reporting such available bandwidth to the new nova placement API inventories in the form of NIC_BW_<physnet>_{ingress,egress}
14:32:28 that'd mean that neutron tells nova "this hypervisor has 10Gb total of NIC_BW_tenant_egress, 10Gb total of NIC_BW_tenant_ingress" as an example
14:32:35 for every hypervisor
14:32:47 also for external networks attached to compute nodes, etc..
14:33:19 We'd make use of the new nova placement API: #link http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/generic-resource-pools.html
14:33:28 but we'd be missing a key feature from nova
14:33:47 #link https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:master+topic:bp/custom-resource-classes
14:33:51 which is being worked on
14:34:01 jaypipes++
14:34:36 that is
14:34:37 3) Nova accepting custom resource types (NIC_BW_.....); right now compute nodes report CPU, DISK and RAM via this HTTP API.
14:34:57 4) Changes in how Nova handles ports, creating/fetching them before doing any scheduling (this seems to be planned for CellsV2).
14:35:15 that way, nova can know the requirements of the port (in the form of bandwidth) before trying to schedule the instance
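As a concrete illustration of step 2 above, here is a minimal sketch of how an agent could publish such an inventory to the placement API. The endpoint layout follows the generic-resource-pools spec linked above, but the NIC_BW_* class name, the URL, and the token handling are assumptions for illustration, not merged code:

    import requests

    PLACEMENT = 'http://placement.example:8778'  # hypothetical endpoint

    def report_nic_bw(provider_uuid, resource_class, total_kbps, token):
        """Publish the total available NIC bandwidth for one hypervisor,
        physnet and direction as a placement inventory (step 2 above)."""
        inventory = {
            'resource_provider_generation': 0,  # generation for optimistic locking
            'total': total_kbps,
            'reserved': 0,
            'min_unit': 1,
            'max_unit': total_kbps,
            'step_size': 1,
            'allocation_ratio': 1.0,  # guarantees must not be oversubscribed
        }
        resp = requests.put(
            '%s/resource_providers/%s/inventories/%s'
            % (PLACEMENT, provider_uuid, resource_class),
            json=inventory,
            headers={'X-Auth-Token': token})
        resp.raise_for_status()

    # e.g. the 10Gb example from above, with kbps as the assumed unit:
    # report_nic_bw(host_uuid, 'NIC_BW_tenant_egress', 10 * 1000 * 1000, token)

This reporting is one-shot and static, matching the discussion below: placement only learns the totals, and nova does the counting as guarantees are allocated.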
14:35:30 ajo_, why does neutron need to tell nova about static information (total NIC_BW)?
14:35:40 or by total did you mean total_in_use?
14:35:41 ltomasbo, the alternative would be
14:35:49 nova finds the qos_policy_id on the port, as it does now
14:36:00 but then it has to fetch the policy, and rules, and make an interpretation of those rules
14:36:12 which could get complex as they grow
14:36:53 so after discussing with the nova team, it seems that the more reasonable option was to give a chewed output to nova (when nova creates, or fetches, a port)
14:37:10 so, if there is no QoS policy, there will be no information about the bandwidth of the hosts, right?
14:37:45 ltomasbo, I'm talking about the ports
14:37:56 when nova does a GET or POST of a port to neutron
14:38:14 we'd provide the breakdowns of NIC_BW per net/direction
14:38:24 and that'd be empty if there is no qos policy attached to the port
14:38:57 ahh, ok, I thought you were talking about host_bw
14:39:32 ah, no no :), that's in step 2, when neutron reports to the nova placement API,
14:39:54 such reporting is static (not changing as we add or remove ports; it's the total available)
14:40:10 nova will be responsible for counting the available bandwidth down/up for this guarantee
14:40:26 and that's all
14:40:26 ok, now it is clear!
14:41:07 rnoriega, any questions about all this? :)
14:41:32 ajo_, not really! it was a very clear explanation of the current state...
14:41:35 ajo_, thanks!!!
14:41:49 ajo_++
14:41:51 ltomasbo, rnoriega, anyone, you're welcome to throw in an initial spec for this based on the steps :)
14:42:23 ajo_, cool! let's see what we can do!
14:42:32 ok
14:43:23 #topic other RFEs
14:43:24 https://bugs.launchpad.net/neutron/+bugs?field.tag=qos+rfe+&field.tags_combinator=ALL
14:43:36 we had this one too:
14:43:37 https://bugs.launchpad.net/neutron/+bug/1614728
14:43:39 Launchpad bug 1614728 in neutron "RFE: qos: rule list in policy is too difficult to use" [Undecided,Won't fix]
14:44:10 which I have moved to Won't Fix for now, since we can't make changes to the API, unless we get microversioning (one day)
14:44:41 I'm moving this other one to Won't Fix for the same reason:
14:44:41 https://bugs.launchpad.net/neutron/+bug/1580149
14:44:43 Launchpad bug 1580149 in neutron "[RFE] Rename API options related to QoS bandwidth limit rule" [Wishlist,Incomplete] - Assigned to Slawek Kaplonski (slaweq)
14:46:02 and there's ECN, which needs more maturing
14:46:23 I've heard informal requests to provide pps limits, right rnoriega ?
14:46:44 ajo_, yep
14:47:07 and also requests about providing warnings on bandwidth usage
14:47:16 ajo_, right too
14:47:21 but that's more likely to fit in https://bugs.launchpad.net/neutron/+bug/1592918
14:47:22 Launchpad bug 1592918 in neutron "[RFE] Adding Port Statistics to Neutron Metering Agent" [Wishlist,In progress] - Assigned to Sana Khan (sana.khan)
14:47:36 we can look at it when we don't have so much on our shoulders
14:47:59 ajo_, agreed. Thanks
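Going back to the per-port "chewed output" discussed under the previous topic (what nova would read on a port GET/POST), the breakdown could look roughly like the dict below. Every key here is invented for illustration; no such field is agreed on or merged:

    # Hypothetical shape of the per-port bandwidth breakdown. The
    # breakdown would be empty when the port has no QoS policy (or no
    # minimum-bandwidth rule) attached.
    port = {
        'id': 'PORT_UUID',
        'qos_policy_id': 'POLICY_UUID',
        'nic_bw': {
            'physnet': 'tenant',     # selects the NIC_BW_tenant_* inventories
            'egress_kbps': 1000000,  # guaranteed minimum, egress direction
            'ingress_kbps': 0,       # no ingress guarantee requested
        },
    }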
14:48:08 #topic bugs
14:48:16 #link http://bit.ly/1WhXlzm BUGS
14:48:23 let's see if we have anything there to be tackled
14:48:48 #link https://bugs.launchpad.net/neutron/+bug/1627749 qos driver api can have better error handling
14:48:48 Launchpad bug 1627749 in neutron "qos driver api can have better error handling" [Medium,Confirmed]
14:48:55 I agree; this one was filed by yamamoto
14:49:09 basically, we could have other backends, like midonet in this case
14:49:26 which could fail when we ask them to modify a policy
14:49:34 and we need to handle that properly
14:49:39 right now the operation is just stopped
14:51:23 #link https://bugs.launchpad.net/python-neutronclient/+bug/1587291
14:51:26 Launchpad bug 1587291 in python-neutronclient "Specifying '-F' or '--field' parameter in the qos related commands, returns abnormal result" [Low,In progress] - Assigned to Yan Songming (songmingyan)
14:51:28 this one is in progress, probably needs reviews
14:51:45 #link https://review.openstack.org/#/c/326902/
14:51:48 ohh, it's merged
14:53:08 #link https://bugs.launchpad.net/neutron/+bug/1625570 testing DSCP in fullstack via packet inspection
14:53:09 Launchpad bug 1625570 in neutron "fullstack : should add test of ensure traffic is using DSCP marks outbound " [Wishlist,New]
14:53:27 that one is wishlist, but important too, because right now we only make sure rules are set properly, etc.,
14:53:36 but we never check the outgoing traffic for DSCP marks
14:53:40 so in that case we're really testing ovs flows
14:53:42 njohnston, ^ :)
14:53:56 not testing the neutron code
14:53:58 yes, we have some proposals to use pcap / tcpdump to check the real packets coming out of the VM
14:54:12 yes, but since those are our flows, we may want to test them
14:54:16 they can become broken
14:54:32 (incompatibilities with ovs-fw, or the new L2 flow pipeline (eventually))
14:54:40 the intent is... to avoid the feature being silently broken
14:54:50 Understood. Is there any other packet capture code in fullstack?
14:55:04 njohnston, we have some experiments with tcpdump
14:55:15 and I discussed some options with jlibosva but I'm not sure what the status is
14:55:19 Sounds fun! And it frightens me to my core.
14:55:29 nah, it shall just be fun
14:55:33 1) setup rules
14:55:49 2) ssh to the VM to send packets to the tempest controller
14:55:52 sorry
14:55:56 not even that, no VM :)
14:56:06 just send packets from the first port to the 2nd
14:56:10 (this is fullstack, no tempest)
14:56:20 3) capture packets on the other side with a filter for the dscp flag
14:56:24 no packets -> rules don't work
14:56:28 packets -> rule works
14:57:47 and
14:58:04 I guess we can wrap up the meeting for today... wasn't it going to be short? ':]
14:58:11 :-D
14:58:21 :D
14:58:38 njohnston, ltomasbo, davidsha, see u around
14:58:44 thanks ajo_
14:58:46 Thanks! see ya!
14:58:54 thank you sirs!
14:58:56 Thanks! See you!
14:59:00 #endmeeting
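A minimal sketch of the capture step (3) described in the DSCP discussion above, assuming a fullstack-style test where two fake ports share a network; the device name, the DSCP value 26 and the use of tcpdump via subprocess are all illustrative, not the actual fullstack helpers:

    import shlex
    import subprocess

    def assert_dscp_marked(dst_device, dscp=26, timeout=10):
        """Capture one packet on the receiving side and check its DSCP mark.

        The BPF filter compares the DSCP bits (the upper six bits of the
        IPv4 ToS byte, i.e. ip[1] & 0xfc) with the expected value shifted
        left by two.
        """
        cmd = ('timeout %d tcpdump -ni %s -c 1 "ip and ip[1] & 0xfc == %d"'
               % (timeout, dst_device, dscp << 2))
        # exit code 0 -> a marked packet arrived; anything else -> broken rule
        assert subprocess.call(shlex.split(cmd)) == 0, (
            'no packet with DSCP %d seen on %s' % (dscp, dst_device))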