14:03:57 #startmeeting neutron_qos
14:03:57 Meeting started Wed Sep 21 14:03:57 2016 UTC and is due to finish in 60 minutes. The chair is njohnston. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:03:57 #chair ajo
14:03:58 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:04:00 The meeting name has been set to 'neutron_qos'
14:04:01 Current chairs: ajo njohnston
14:04:06 Since we are all probably spending a lot of time testing Newton, I'd just like to focus on things that have velocity at the moment, and probably keep this meeting short.
14:04:07 you were not late ralonsoh , see? :D
14:04:21 ajo: Thanks!
14:04:26 thanks njohnston ! :
14:04:28 :)
14:04:32 sorry about that
14:05:10 sorry njohnston go ahead with the plan, leave me a slot for the nova-placement-api details & ihar comments about prioritizing validation :)
14:05:10 #topic RFEs
14:05:10 #link https://review.openstack.org/#/c/318531/
14:05:38 #link https://review.openstack.org/#/c/318531/
14:05:38 [WIP] Add QoS minimum egress bandwidth rule into ovs-agent:
14:05:38 - Use ralonsoh implementation: use QoS on all br-int ports and external ports.
14:05:40 - Use Ichihara implementation: reduced scope, only apply QoS on external ports.
14:05:42 I'm in the middle of a discussion with Hirofumi
14:05:57 If you can review that, that could be helpful
14:06:20 do we have two implementations ?
14:06:26 Yes
14:06:30 oh ':D
14:06:40 patch 10: my implementation
14:06:48 with full qos applied to all ports
14:06:58 patch 13: qos only applied to external ports
14:07:25 It's a matter of defining the scope
14:07:54 No more comments
14:07:57 Would anything block us from starting on external
14:08:04 and then adding internal too if we find it reasonable?
14:08:12 Perfect!
14:08:34 I mean, if we hear of a use case for internal too that requires it, then we add it
14:08:35 I'll talk to Hirofumi to take care of this patch
14:08:45 and otherwise, we assume internal traffic is not constrained
14:08:58 ajo: I agree with this scope
14:09:01 as it generally doesn't have the limitation of the external uplink, but CPU is limited
14:09:06 ralonsoh++
14:10:10 #topic Bugs
14:10:10 #link https://review.openstack.org/#/c/367369/
14:10:10 Fix SR-IOV qos extension calls to clear_rate functions:
14:10:41 Yes. That's a small bug
14:10:50 I'm asking for reviews
14:10:57 only this
14:11:17 ralonsoh, is it a bug, or an enhancement to the design?
14:11:32 I mean (to consider if we need to push it to the next RC)
14:11:52 enhancement
14:12:14 ack :)
14:12:14 But we need to change the name of the function calls
14:12:18 Thanks
14:12:29 if this waits until Ocata, will we need to drop a deprecation warning for the old function call names?
14:13:13 We will need to check that ip-link has these features
14:13:20 No need for deprecation
14:13:27 But I'll review that
14:13:41 very good
14:13:46 #link https://bugs.launchpad.net/bgpvpn/+bug/1622644
14:13:47 Launchpad bug 1622644 in networking-bgpvpn "OVS agent ryu/native implementation breaks non-OF1.3 uses" [Critical,Confirmed]
14:13:47 The change related to this is: "fullstack: execute qos tests for all ovsdb/of interface permutations"
14:13:47 #link https://review.openstack.org/372553
14:13:47 it's an internal API
14:13:53 so I guess no need for deprecation
14:14:08 I thought people should be aware of this, because it's good stuff
14:14:31 yeah :D
14:14:53 tiny change, full test coverage
14:14:56 ihar++ :)
14:15:27 #topic Other Changes
14:15:33 ralonsoh: How goes the OSC patches?
14:15:47 I saw a new PS on at least one recently
14:15:51 waiting for qos policy
14:15:56 the first one
14:16:10 only one +2
14:16:24 and a comment: "waiting for dtroyer comments"
14:16:43 ralonsoh, do we have support to attach a policy to a port or network?
14:16:46 perhaps it would help to ping in #openstack-sdks
14:17:05 +ping :)
14:17:09 ajo: what do you mean
14:17:14 ?
14:17:19 like the neutron port-update --qos-policy my-policy
14:17:26 or neutron port-update --no-policy
14:17:29 same for net
14:17:39 ajo: hmmmm I need to check this
14:17:48 ok, new bug for OSC
14:17:49 neutron client provides that capability, but not sure if we already had patches for that, ok
14:17:53 X')
14:18:03 I just realized we were missing that tiny bit :D
14:18:05 I'll take care of this
14:18:18 you're awesome ralonsoh
14:18:28 thanks
14:18:36 ralonsoh += awesome
14:18:54 ajo: "leave me a slot for the nova-placement-api details & ihar comments about prioritizing validation". You have the floor!
14:19:01 oooook :)
14:19:18 #topic minimum bandwidth egress (strict) RFE
14:19:32 #link https://bugs.launchpad.net/neutron/+bug/1578989
14:19:33 Launchpad bug 1578989 in neutron "[RFE] Strict minimum bandwidth support (egress)" [Wishlist,Confirmed]
14:19:55 this is something we expect to span from Ocata to beyond; it requires no more API changes,
14:20:11 unless we want to flag the min bw rules for strict or not
14:20:20 that's something we could perhaps discuss
14:20:23 but
14:20:30 this is the feature that requires integration with "nova scheduler"
14:20:46 I got a great update today from sylvainb (Sylvain Bauzas)
14:20:49 from the nova team
14:20:57 I've been working with nova scheduler
14:21:08 The generic resource pools spec materialized as a new API endpoint
14:21:12 the nova placement API
14:21:13 I know how to do this and I made a POC in my company a year ago
14:21:35 ralonsoh, we can't hook up a plugin on the scheduler for this, but we have a plan
14:21:36 :)
14:21:42 so
14:21:48 nova came up with this spec:
14:21:50 #link http://specs.openstack.org/openstack/nova-specs/specs/newton/approved/generic-resource-pools.html
14:21:58 it's implemented for newton, the API is available
14:22:06 perfect
14:22:22 it will let you declare (via API) resource pools (DISK_GB, RAM, etc...) and eventually
14:22:22 ajo: very nice!
14:22:30 NIC_BW__
14:22:48 and then we would expose those requirements on the neutron ports when retrieved by nova
14:23:02 so before scheduling the instance, nova would have to create (or get) the port, check the requirements
14:23:14 and find a hypervisor satisfying the needed bandwidth
14:23:39 this is the devstack support to enable that API:
14:23:41 #link devstack support: https://review.openstack.org/#/c/342362/
14:24:08 so far, the API is not ready to declare custom resource classes
14:24:22 (that means different types of NIC_BW (per physnet and direction))
14:24:39 wip spec on nova for custom resource classes:
14:24:42 #link https://review.openstack.org/#/c/312696/
14:24:50 and
14:24:58 example of resource reporting from nova to that API:
14:25:00 #link https://github.com/openstack/nova/blob/master/nova/scheduler/client/report.py
14:25:03 they do it via http
14:25:14 so
14:25:17 we would need to
14:25:29 1) Collect available bandwidth on each agent, per physnet
14:25:46 2) Report it via this API, or to a central service in neutron, that will in turn report to the nova api
14:25:57 that's the total amount of bw
14:26:16 3) we would need the custom resource classes to be there
14:26:38 or we could tweak it in the meantime by having a "non-merged" patch in nova to have a generic NIC_BW or a bunch of them, just for testing
14:26:41 until they merge it
14:26:48 Depends-On, etc...
14:26:52 yep
14:27:01 4) exposing those constraints on a port-get from nova
14:27:25 5) making nova create/get the port before trying to schedule the instance (apparently they will implement that in Ocata for Cells v2)
14:27:51 6) and make sure they use/count those details
14:27:57 probably that's it :)
14:28:14 ah
14:28:17 one point
14:28:22 good analysis!
14:28:25 if you see their scheduler/client/report.py
14:28:33 you will see that there is no integration of such api in the nova client
14:28:38 there's still no client for the nova placement api
14:29:05 A possible plan could be using their code (copying it to neutron), and once there's a common client, making use of it
14:29:16 a common client, likely to be "support in the openstack sdk" :-9
14:29:18 :-)
14:29:36 So, this is an effort that will probably span Ocata -> beyond
14:30:04 this is some awesome news
14:30:13 but we should consider starting to make progress on it and not waiting for the whole thing (custom resource types in nova) when we see fit
14:30:36 I will start myself by exploring the nova API, and writing a more detailed plan
14:30:55 maybe a full spec? since this seems to be a complicated chunk
14:31:17 #action ajo starts writing a spec
14:31:20 Yes, I think this is probably good for a spec
14:31:21 #undo
14:31:22 Removing item from minutes:
14:31:41 #action ajo starts writing a spec for strict bandwidth guarantees and nova placement api integration
14:31:55 any comments about this topic?
14:32:17 I'm very interested in helping with that
14:32:50 super cool
14:32:50 ralonsoh, great, let's talk during the summit (probably, with more details on hand), I'll loop you in on the spec
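For reference, a rough sketch of what registering and reporting per-physnet bandwidth (steps 1 and 2 of the plan above) could look like against the new placement API. The endpoint URL, the custom resource class name NIC_BW_KB, and the exact payload fields are assumptions based on the generic-resource-pools spec; custom resource classes are not available yet, as noted above, so the authoritative shape of these calls is nova's scheduler/client/report.py.

    TOKEN=$(openstack token issue -f value -c id)
    PLACEMENT=http://controller/placement          # placeholder endpoint URL
    RP_UUID=$(uuidgen)

    # register a resource provider per (agent host, physnet) pair
    curl -s -X POST $PLACEMENT/resource_providers \
        -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" \
        -d "{\"name\": \"compute-1:physnet0\", \"uuid\": \"$RP_UUID\"}"

    # report the total available bandwidth as inventory on that provider (step 2 above)
    curl -s -X PUT $PLACEMENT/resource_providers/$RP_UUID/inventories \
        -H "X-Auth-Token: $TOKEN" -H "Content-Type: application/json" \
        -d '{"resource_provider_generation": 0,
             "inventories": {"NIC_BW_KB": {"total": 10000000}}}'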
14:33:04 so, last topic then?
14:33:15 yep
14:33:16 #topic Open Discussion
14:33:21 anything else?
14:33:24 so, we have
14:33:31 hi guys
14:33:34 the enhanced rule validation that slaweq was working on
14:33:54 this has to go in before any new kind of policy rule we introduce
14:34:01 to avoid things getting out of hand :)
14:34:21 and, also I need to refactor the notification driver thing into a "qos driver" thing, to be more consistent
14:34:26 yep; I don't recall seeing anything new on that recently, but I think it is an early-Ocata must-have
14:34:47 so I should resume my chunk of work ':)
14:34:49 sorry to interrupt, : (
14:34:51 that's all on my side
14:34:58 liuyulong__: go ahead
14:34:59 liuyulong_, go ahead, sorry :)
14:35:05 I have a draft RFE spec, https://github.com/gotostack/gotostack.github.io/blob/master/pages/CloudComputing/layer-3-rate-limit.rst. It's something about L3 qos. Hope to get some help, if you guys have time. I'm always in the openstack-neutron channel. Thank you. : )
14:35:27 #link https://bugs.launchpad.net/neutron/+bug/1596611
14:35:30 Launchpad bug 1596611 in neutron "[RFE] Create floating-ips with qos" [Wishlist,Triaged] - Assigned to LiuYong (liu-yong8)
14:36:20 liuyulong__, question
14:36:29 isn't that equivalent to limiting the external leg of the router ?
14:36:37 I only skimmed through it, so probably not
14:36:57 ah, per floating ip?
14:36:59 ajo, yep, east/west will not
14:37:18 ajo, only for public IPs (v4)
14:37:37 yes, but east/west doesn't go through the external leg of the router
14:38:03 in the current implementation you could accomplish that (for egress only) if you put a bandwidth limit policy on the external leg of the router
14:38:08 ajo, yes, east/west will be handled by the current QoS implementation.
14:38:09 but not by IP, only globally
14:38:46 I think it's better if we don't spread the qos details across objects/and stuff (like floating ips)
14:38:52 several alternatives could be:
14:39:05 1) Attaching policies to floating IPs ?
14:39:29 2) Having traffic classification (you can filter on the IP for the limit) and attaching a policy to the external router leg
14:39:35 and I'm not sure how this would vary based on whether there is fast-exit routing versus return path routing through the network node
14:40:08 ajo, actually we've already implemented this using linux tc in the router namespace.
14:40:26 liuyulong__, yes, it's completely possible
14:40:46 ajo, locally in our mitaka neutron, : )
14:40:48 liuyulong___ but the complicated part is not implementing an API and the code (it has its merit of course)
14:41:00 the complicated part is agreeing on an API with the community
14:41:13 then what, when somebody wants to apply DSCP rules per floating IP ?
14:41:18 or min bw rules per floating IP ?
14:41:21 ajo, yep, so I'm here to find some help/advice and so on.
14:41:36 maybe the model is to let floating IPs be associated with QoS policies
14:41:39 but that'd mean
14:41:49 that we have some basic means of doing traffic classification
14:41:55 because from an ovs agent point of view
14:42:15 we'd have to apply it at the port level
14:42:17 ajo, people may want to know how large the bandwidth of their floating IP is.
14:42:21 it's worth thinking about it
14:42:27 seems like a reasonable use case
14:42:57 liuyulong__, they could see the policy_id on the floating ip
14:43:00 and then check the policy
14:43:29 ajo, and cloud admins also want to limit the users' floating IP bandwidth, because of the limitation of the NIC or DC export.
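A rough sketch of the tc-in-the-router-namespace approach liuyulong__ mentions above, which is also roughly what alternative 2) (traffic classification on the external router leg) would boil down to. The router namespace name, qg- device name, floating IP, and rate are placeholders; this is not anything neutron configures today.

    ROUTER_NS=qrouter-<router-uuid>      # placeholder router namespace
    DEV=qg-xxxxxxxx-xx                   # placeholder external leg of the router

    # root HTB qdisc on the external device, one class per rate-limited floating IP
    ip netns exec $ROUTER_NS tc qdisc add dev $DEV root handle 1: htb
    ip netns exec $ROUTER_NS tc class add dev $DEV parent 1: classid 1:10 htb rate 10mbit
    # classify egress traffic sourced from a given floating IP (post-SNAT) into that class
    ip netns exec $ROUTER_NS tc filter add dev $DEV parent 1: protocol ip prio 1 \
        u32 match ip src 203.0.113.10/32 flowid 1:10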
14:43:48 yes yes, liuyulong__ I'm not discussing the use case
14:43:52 I believe it's valuable
14:44:01 should we continue this discussion on the bug?
14:44:20 we should continue the discussion,
14:44:34 on the bug, I mean
14:44:43 ajo, So, should I submit the draft spec for review? We can talk about it there?
14:45:00 liuyulong__ yeah, maybe a draft spec makes sense, but please don't open blueprints,
14:45:08 that's done by the drivers team once an RFE has been approved
14:45:26 liuyulong__ you may also want to discuss it with slaweq
14:45:31 ajo, OK, I have not.
14:45:35 he works at OVH, has been working on the current APIs,
14:45:44 and he's probably interested in your use case too
14:46:08 ajo, great, thanks.
14:46:44 super
14:46:52 does anyone have anything else?
14:47:41 ok, hearing nothing else I'll give 12 minutes back. Thanks!
14:47:50 njohnston++ thank you very much
14:47:55 #endmeeting