14:01:07 #startmeeting neutron_qos
14:01:08 Meeting started Wed Apr 20 14:01:07 2016 UTC and is due to finish in 60 minutes. The chair is ajo. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:09 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:13 The meeting name has been set to 'neutron_qos'
14:02:19 Participating from my car today
14:02:44 njohnston: and You are driving now? :)
14:02:51 hi njohnston
14:02:51 nope
14:02:52 :-)
14:03:16 I was leaving some extra time for people to join.
14:03:21 hi
14:03:25 but it seems that the attendance is low :-),
14:03:49 hi
14:03:51 njohnston, prepared the agenda for today, as he's stepped in to help me co-chair this meeting
14:03:59 and today, he's travelling.
14:04:07 thanks njohnston
14:04:23 hi RP_ , slaweq_ , irenab , njohnston , so let's begin
14:04:42 #topic Summit Planning
14:04:58 We have two QoS talks on the schedule, for those interested:
14:05:09 #link https://www.openstack.org/summit/austin-2016/summit-schedule/events/7495
14:05:48 a general one about Quality of Service, what we accomplished in Mitaka, and the future roadmap, ( slaweq_ , vhoward and me)
14:05:50 #link https://www.openstack.org/summit/austin-2016/summit-schedule/events/7441?goback=1
14:06:10 and exactly thanks davidsha
14:06:23 a specific one about DSCP
14:06:52 njohnston, and davidsha will be on stage :)
14:07:26 We'd also like to announce, if that sounds right, that we'd be cancelling the meeting after the summit, on May 4th.
14:07:29 Yay!
14:07:34 hehe njohnston
14:07:44 +1
14:07:46 +1
14:07:49 any objections to cancelling the next meeting? :)
14:07:52 +1 from me too
14:07:52 davidsha: haha
14:08:19 :-)
14:09:08 Also, we should use the friday time to meet, discuss and plan about QoS in person.
14:09:25 some of the friday time, :)
14:09:52 ajo: there is also the flow management/ flow classifier meeting on Tuesday with Cathy I believe.
14:10:00 +1
14:10:13 oh, great davidsha , what time's that on Tuesday exactly?
14:10:16 davidsha: any specific time?
14:10:26 ajo: can it be friday morning, because I have a plane in the afternoon?
14:10:30 1 sec
14:10:38 It's an interesting morning.
14:10:53 slaweq_ +1 for the morning, most people will be flying in the afternoon I suspect
14:11:00 ok, great :)
14:11:11 No time "I am thinking about around lunch time on Tuesday or Wednesday since some of us will fly back on Friday morning/noon."
14:11:19 I booked my flight back on saturday, because the latest flight was taking off at 18:05 for me
14:11:24 i'm in for friday
14:11:32 -_- lunch time.
14:11:33 I will go this at 18:05 :)
14:12:01 davidsha, maybe we should bring a big sign for lunch, it's hard to meet anyone on purpose during summit on lunch
14:12:11 I'll be around on Friday.
14:12:39 davidsha, or make the meeting of Tuesday more specific as we see the place on Monday, otherwise I'm afraid the meeting won't really happen
14:13:18 ok, let's follow up with Cathy on that
14:13:31 ajo: can you please share some details regarding scheduler and neutron discussion you emailed a few mins ago?
14:13:35 and let's talk QoS on friday morning.
14:13:44 irenab, sure, there's a point for it.
14:13:57 let's follow njohnston's prepared agenda. :)
14:14:18 https://etherpad.openstack.org/p/neutron_qos_meeting_chair if you want to sneak peek
14:14:33 #link https://bugs.launchpad.net/neutron/+bug/1563720
14:14:33 Launchpad bug 1563720 in neutron "Bandwidth limiting for linuxbridge agent works for ingress traffic instead of egress" [High,Fix released] - Assigned to Slawek Kaplonski (slaweq)
14:14:58 slaweq_, how's that looking?
14:15:05 it's fixed
14:15:24 we changed tc to use ingress policing instead of shaping egress traffic
14:15:26 oh, true, it was merged, /me feels dumb :D
14:15:35 good work slaweq_ ,
14:15:41 in fact it's now done in exactly the same way as ovs is doing it
14:15:51 there is also a cherry-pick to mitaka of this patch
14:15:55 waiting for review
14:16:00 in the context of that, slaweq_ found that ovs ingress policing burst parameter sets the wrong thing
14:16:17 if you ask openvswitch to set 1Mbit burst on a port for ingress policing
14:16:22 you get 8Mbit burst
14:16:28 (bits mixed with bytes)
14:16:38 yep, and ajo fixed it in ovs :)
14:16:45 I've proposed a fix for ovs,
14:17:01 but the issue with that, is that next versions of ovs will behave differently to current ones,
14:17:08 we may need to add a flag to the agent, and a sanity check
14:17:19 to do the /8 division
14:17:30 ok
14:17:31 #
14:17:32 #link https://launchpad.net/bugs/1486607
14:17:33 Launchpad bug 1486607 in neutron "tenants seem like they were able to detach admin enforced QoS policies from ports or networks" [Medium,In progress] - Assigned to Nate Johnston (nate-johnston)
14:17:36 maybe we should do in ovs something like in LB agent?
14:17:50 to add burst value as 80% of bw_limit if it is not set?
14:17:59 slaweq_, I agree, that's right
14:18:23 AFAIR ovs always sets 1000kb if burst is not given
14:18:30 slaweq_, could you open a bug for it so we don't forget?
14:18:35 but we could make it more consistent for both
14:18:41 ok, I will
14:18:57 yes, I'd set 80% of max_rate by default, if nothing provided
14:19:14 thanks slaweq_
14:19:22 about the above bug ^
14:19:33 njohnston, is on it, he's sorting out the testing details
14:19:37 but other than that it should be good.
14:19:57 Thanks njohnston!, anything to be noted about it?
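
The burst handling discussed between 14:16:00 and 14:18:57 can be sketched roughly as follows. This is only an illustration under stated assumptions, not the agent's actual code: the function names, the `ovs_multiplies_burst_by_8` flag, and the use of kbps as the unit are all hypothetical. It shows the two ideas from the discussion: defaulting the burst to 80% of the rate limit when none is given, and dividing by 8 when the installed OVS is known to mix bits with bytes.

```python
# Hypothetical sketch of the burst logic discussed in the meeting.
# None of these names come from Neutron or Open vSwitch.

DEFAULT_BURST_PERCENT = 80  # "80% of max_rate by default, if nothing provided"


def effective_burst_kbps(max_kbps, max_burst_kbps=None):
    """Return the configured burst, or 80% of the rate limit if unset."""
    if max_burst_kbps is None:
        # Integer arithmetic avoids float rounding surprises.
        return max_kbps * DEFAULT_BURST_PERCENT // 100
    return max_burst_kbps


def burst_value_for_ovs(burst_kbps, ovs_multiplies_burst_by_8):
    """Compensate for an OVS that applies 8x the requested burst.

    The flag is an assumed agent-side setting (or the result of a sanity
    check); when it is set, we request 1/8 of the intended burst.
    """
    if ovs_multiplies_burst_by_8:
        return burst_kbps // 8
    return burst_kbps


if __name__ == "__main__":
    burst = effective_burst_kbps(1000)       # no burst given
    print(burst)                             # 800
    print(burst_value_for_ovs(burst, True))  # 100
```

Keeping the compensation behind a flag reflects the concern raised at 14:17:01: once the proposed OVS fix ships, newer versions would no longer need the division, so the agent would have to know which behaviour it is talking to.
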
14:20:42 ok, next one :)
14:20:44 #link https://bugs.launchpad.net/neutron/+bug/1568400
14:20:46 Launchpad bug 1568400 in neutron "Pecan does not route to QoS extension" [Medium,In progress] - Assigned to Brandon Logan (brandon-logan)
14:20:54 Just that I need to sort out how to take admin and non-admin actions in the same test
14:21:25 njohnston, ack, exactly, the testing details, he needs to be able to do operations in the api-tempest tests as admin, and normal tenant
14:21:41 if anybody has experience with that ping njohnston, please :)
14:22:01 about 1568400 blogan fixed that, it's an issue in the context of pecan, so thanks blogan! :)
14:22:32 I will skip "https://bugs.launchpad.net/neutron/+bug/1507654" since ihar doesn't seem to be around,
14:22:34 Launchpad bug 1507654 in neutron "Use VersionedObjectSerializer for RPC push/pull interfaces" [Low,In progress] - Assigned to Artur Korzeniewski (artur-korzeniewski)
14:23:13 that seems to be "In progress" but there's no link to the actual patch
14:23:34 #action ihrachys link https://bugs.launchpad.net/neutron/+bug/1507654 to the gerrit review
14:23:35 Launchpad bug 1507654 in neutron "Use VersionedObjectSerializer for RPC push/pull interfaces" [Low,In progress] - Assigned to Artur Korzeniewski (artur-korzeniewski)
14:23:49 #link https://bugs.launchpad.net/neutron/+bug/1507761
14:23:50 Launchpad bug 1507761 in neutron "qos wrong units in max-burst-kbps option (per-second is wrong)" [Low,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:23:53 slaweq_, this one is yours ^
14:23:56 yep
14:24:07 as far as I understood, we can't proceed until we have something like api microversioning
14:24:10 and I wanted to ask about what to do with it?
14:24:16 exactly
14:24:20 slaweq_, probably document and wait
14:24:24 as You and ihrachys said in review
14:24:34 document on launchpad? or where?
14:24:55 slaweq_, the api guide, probably, and the adv networking guide
14:25:00 ah, ok
14:25:12 no problem, I have some experience with that already :)
14:25:17 slaweq_++
14:25:21 awesome, thanks :)
14:25:26 +1
14:25:46 #link https://bugs.launchpad.net/neutron/+bug/1515564
14:25:47 Launchpad bug 1515564 in neutron "Internal server error when running qos-bandwidth-limit-rule-update as a tenant" [Low,In progress] - Assigned to Liyingjun (liyingjun)
14:26:38 #link https://review.openstack.org/#/c/244680/ needs reviews
14:27:01 I totally missed it, adding it to my queue
14:27:59 I suspect that patch will conflict with the qos_plugin refactor mfrances was working on
14:28:20 mfranc213, if you can ask the author to follow up on your patch, that would be the best I think
14:28:52 which is: https://review.openstack.org/#/c/294463/
14:29:40 next is
14:29:41 #link https://bugs.launchpad.net/neutron/+bug/1558614
14:29:42 Launchpad bug 1558614 in neutron "The QoS notification_driver is just a service_provider, and we should look into moving to that" [Low,In progress] - Assigned to Ching Sun (ching-sun)
14:30:20 ajo: i'll get in touch with the author. thanks for this.
14:30:20 I see we have a question to answer, I still don't have the exact answer
14:30:28 it's not a simple change of config name,
14:30:52 we may need to look into the service_providers thing, see how it works, and if it fits what we do with the notification_driver
14:30:56 but it seems very similar.
14:31:08 Maybe it does not 100% fit
14:31:24 because with service providers, you can specify different providers for the same service
14:31:29 and that won't work in our case,
14:31:56 I will link that bug to this log, and discuss in there as soon as we have a clear answer
14:32:12 #link https://bugs.launchpad.net/neutron/+bug/1507656
14:32:13 Launchpad bug 1507656 in neutron "RPC callbacks push/pull mechanism should be remotable" [Wishlist,Confirmed] - Assigned to Ihar Hrachyshka (ihar-hrachyshka)
14:32:56 I wasn't sure if that was large enough to warrant an RFE
14:33:05 that one doesn't really add any new feature
14:33:11 it's more like a refactor of the rpc callbacks code
14:33:15 Ok
14:33:40 I think push can't use what ihrachys proposes, but I'm sure pull could
14:34:05 we may eventually look at it when we have time. or move to Won't fix if that doesn't happen.
14:34:23 #link https://bugs.launchpad.net/neutron/+bug/1556836
14:34:24 Launchpad bug 1556836 in neutron "QOS: TestQosPlugin should receive plugin name as an input" [Wishlist,In progress] - Assigned to Adit Sarfaty (asarfaty)
14:34:59 ah
14:35:03 that doesn't apply anymore
14:35:15 because they did it the right way and stopped subclassing the QoS plugin
14:35:53 bugs--
14:35:56 that feels good :)
14:36:07 #link https://bugs.launchpad.net/neutron/+bug/1560963
14:36:08 Launchpad bug 1560963 in neutron "[RFE] Minimum bandwidth support (egress)" [Wishlist,Confirmed]
14:36:11 irenab, you were asking about this one
14:36:26 ajo: I was patiently waiting
14:36:33 you were very patient :)
14:36:38 There's an ongoing design in the nova scheduler about this
14:36:59 The idea is that they made some generic entities that we can fill in, to let the scheduler account for resources
14:37:00 in our case
14:37:14 we could create pools (of traffic) attached to the hosts,
14:37:27 (1 compute) --- (1 bandwidth pool)
14:37:54 and then, when nova had to schedule an instance with a port that requires "NIC_BW" of 1Gbit
14:38:07 it would look for the right pool (and therefore the right compute) to host it
14:38:25 and it would also count down the used traffic from the bandwidth pool attached to such compute
14:38:38 on that discussion
14:38:57 we noticed that, for that to happen, nova may need to look at the qos_policy_id and make an interpretation of a policy
14:39:07 looking for guaranteed traffic rules in the policy
14:39:26 which is too coupled, since nova would have to understand the policies, the rules, which are supposed to be evolving
14:39:27 so..
14:39:57 there was a suggestion from jaypipes to add an API to translate port uuids (or other objects), to scheduler constraints/scheduler resource usages
14:40:22 that's a good idea IMHO,
14:40:37 nova API?
14:40:43 no, neutron api
14:41:06 you would give that API a port id, and you would get a set of constraints/or resource usages for the scheduler
14:41:23 nica
14:41:25 nise
14:41:29 ./get-scheduling-details/port/ ->
14:41:29 nice :-)
14:41:41 {"NIC_BW": 10000000, blah, blah: ..}
14:42:00 the only issue we found about that, is that it would add another call from nova to neutron
14:42:01 * njohnston has to go now... I will catch up with the transcript later. Thanks ajo!
14:42:06 irenab: an API that Nova (or the broken out scheduler) can call to Neutron to get information about the networking-related resource providers (like IPv4 address, etc)
14:42:07 thanks njohnston !
14:42:24 hi jaypipes :)
14:42:29 ajo: can't it be given in the port-show call?
14:42:36 irenab, jaypipes ,
14:42:37 like policy-id is now
14:42:43 exactly, what slaweq_ is saying is what I was about to say
14:42:46 we could do better
14:42:54 sorry :)
14:42:57 if we provide in the resource creation or show...
14:42:59 that dictionary
14:43:06 as a read only attribute of the port
14:43:14 there is actually already a port binding extension that has 2 dicts
14:43:15 that would avoid the extra API call from nova to neutron
14:43:50 irenab, aha, that's interesting, what are those dicts?
14:43:53 so maybe the vif_details or profile can be used
14:44:02 irenab, that's ml2 specific
14:44:11 we should come up with something wide for all plugins,
14:44:24 no, this is about supporting port binding extension
14:44:25 and maybe move those details from ml2 back into the generalized one
14:45:01 irenab, ok, but what I mean, is that if we put those scheduling details in the port binding extension dictionary, that's ml2 specific
14:45:10 and won't be available to other non-ml2 plugins
14:45:26 so you prefer a separate extension? Fair enough
14:45:46 yes, and I guess we will depend on the "core extension manager" that ihrachys and I were thinking of
14:45:49 so two RFEs:
14:45:52 * core extension manager
14:46:00 but I think the port binding is supported by others and actually maybe required by new vif lib
14:46:01 * scheduling details for nova
14:46:34 irenab, if it becomes required by vif_lib it will be an error, we can't stop supporting non-ml2 plugins
14:46:48 as an example: OVN on the open side of things
14:46:52 why do you think it's ml2 specific?
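
The idea floated between 14:39:57 and 14:43:15, a read-only dictionary of scheduling details derived from a port's QoS policy, might look something like the sketch below. Everything here is an assumption made for illustration: the port and policy are plain dicts, the `minimum_bandwidth` rule type and `min_kbps` field are hypothetical, the `NIC_BW` key is taken from the example in the log, and summing multiple rules is an arbitrary choice. No such Neutron API existed at the time of the meeting.

```python
# Hypothetical sketch: derive scheduler-facing resource usages from a
# port's QoS policy, so nova never has to interpret the policy itself.


def scheduling_details(port):
    """Return a dict of resource usages for the scheduler.

    `port` is assumed to be a dict that may carry a "qos_policy" dict
    with a "rules" list. Only minimum-bandwidth rules contribute.
    """
    policy = port.get("qos_policy")
    if not policy:
        return {}

    min_kbps = sum(
        rule["min_kbps"]
        for rule in policy.get("rules", [])
        if rule.get("type") == "minimum_bandwidth"
    )
    # "NIC_BW" mirrors the key used in the meeting; the unit (kbps here)
    # is an assumption.
    return {"NIC_BW": min_kbps} if min_kbps else {}


if __name__ == "__main__":
    port = {
        "id": "example-port",
        "qos_policy": {
            "rules": [
                {"type": "bandwidth_limit", "max_kbps": 2000000},
                {"type": "minimum_bandwidth", "min_kbps": 1000000},
            ]
        },
    }
    print(scheduling_details(port))  # {'NIC_BW': 1000000}
```

Exposing the result as a port attribute, rather than behind a separate endpoint, is what would avoid the extra nova-to-neutron call mentioned at 14:42:00.
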
14:47:13 irenab, maybe it's not ml2 specific, and I'm mistaken :)
14:47:13 https://github.com/openstack/neutron/blob/master/neutron/extensions/portbindings.py
14:47:42 ajo: I agree with irenab, every plugin can support such an extension
14:47:48 but it's implementation details anyway :-)
14:47:59 yeah, maybe it makes sense
14:48:09 let's look at it, thanks irenab :)
14:49:07 we would have to come up with something to look up the qos_policy details of a port and synthesize the thing into another dict there
14:49:13 or another detail in the dict
14:49:36 * ajo is a little bit stubborn sometimes ;)
14:49:49 ah
14:49:51 and about this RFE
14:49:56 I should probably split it in two
14:50:02 1) best effort
14:50:07 2) strict (with nova scheduling)
14:50:20 1 would take care of the low level settings on each compute host to make it happen
14:50:27 (sriov settings, ovs settings, etc..)
14:50:50 another question about guarantee, is it for edge only or should it consider the end to end fabric?
14:50:50 I'm afraid we have 10 minutes and lots of RFEs to look at yet
14:51:18 irenab, initially only for edge, we could think of a wider thing later
14:51:29 fine
14:51:40 irenab, but that would require modelling the whole underlying network,
14:51:54 and not sure if we can keep control over all the traffic flows :)
14:52:04 sounds like an interesting and very hard problem
14:52:20 I would say, it would require neutron to be responsible
14:52:44 or maybe a new project will come to deal with it :-)
14:52:51 lol :)
14:53:46 I will skip some RFEs, (lack of time)
14:53:47 #
14:53:47 #link https://bugs.launchpad.net/neutron/+bug/1527671
14:53:48 Launchpad bug 1527671 in neutron "[RFE] traffic classification in Neutron QoS" [Wishlist,Incomplete]
14:54:04 davidsha, we talked about this one ^, and that's where Cathy's meeting comes in,
14:54:08 luckily on Tuesday lunch
14:55:20 I'll be there
14:55:34 about https://bugs.launchpad.net/neutron/+bug/1557457 , jun wei updated his description, I need to re-read it
14:55:35 Launchpad bug 1557457 in neutron "[RFE] bandwidth rate-limit" [Wishlist,Incomplete]
14:55:40 #action ajo have another look at https://bugs.launchpad.net/neutron/+bug/1557457
14:56:18 #link https://bugs.launchpad.net/neutron/+bug/1531485
14:56:19 Launchpad bug 1531485 in neutron "Available device bandwidth report by L2 agent" [Wishlist,Expired]
14:56:53 this one is expired, but it may happen in the context of https://bugs.launchpad.net/neutron/+bug/1560963 (when reporting to the nova generic resource pools)
14:56:54 Launchpad bug 1560963 in neutron "[RFE] Minimum bandwidth support (egress)" [Wishlist,Confirmed]
14:57:22 and
14:57:30 finally slaweq_ is also working on these doc changes: https://review.openstack.org/#/c/307193
14:57:46 explaining about burst values, and how those should be configured
14:57:52 #topic Open discussion
14:58:07 anything to talk about in our remaining 2 minutes? :)
14:58:19 I'm also working on https://bugs.launchpad.net/neutron/+bug/1560961
14:58:20 Launchpad bug 1560961 in neutron "[RFE] Allow instance-ingress bandwidth limiting" [Wishlist,In progress] - Assigned to Slawek Kaplonski (slaweq)
14:58:24 but it's just beginning now
14:58:56 oh, right, slaweq_ thanks for that too, you're doing an amazing job
14:59:05 thx :)
14:59:25 ok so
14:59:28 let's endmeeting
14:59:30 o/
14:59:41 thanks everybody
14:59:41 thanks
14:59:45 thanks
14:59:47 thanks
14:59:50 see You in Austin :)
14:59:56 see you :)
14:59:58 #endmeeting