14:01:01 #startmeeting neutron_drivers
14:01:02 Meeting started Fri Oct 18 14:01:01 2019 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
14:01:03 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
14:01:04 hi
14:01:05 The meeting name has been set to 'neutron_drivers'
14:01:09 o/
14:01:11 o/
14:02:14 let's wait a few more minutes for haleyb and yamamoto to have a quorum
14:02:17 hi
14:02:42 hi
14:04:27 amotoki will not be here today, we are still missing yamamoto but I think we can start as we already have quorum
14:04:37 #topic RFEs
14:04:54 first on the list for today is:
14:04:56 https://bugs.launchpad.net/neutron/+bug/1843924
14:04:56 Launchpad bug 1843924 in neutron "[RFE] Create optional bulk resource_extend" [Wishlist,Triaged]
14:05:02 hi
14:05:15 proposed by njohnston
14:06:22 Hi! This spec is oriented towards solving one of the two areas of bulk operations that are hard to optimize (the other being IPAM).
14:07:11 There is significant time being spent in resource_extend, and at present there is no way to pass resources to extenders in bulk form.
14:08:23 My main concern that led me to submit this as an RFE is that this be graceful to use for items that leverage resource_extend, especially because I don't know if any out-of-tree extensions register with resource_extend. I would not be surprised if they do.
14:12:20 I don't have a clear idea of what the performance gains will be, but I do think they will be non-trivial. And this benefits more than just the bulk port optimization work - if any other bulk operations are optimized in the future, this can be leveraged there as well.
14:14:23 Yeah... This is an area we can explore with the work I've been doing with code profiling
14:14:56 This weekend I will push the final version of that tooling, per the last meeting of the performance team
14:15:44 so we can start exploring how much improvement comes out of this RFE
14:16:32 njohnston: so with this profiling tool from mlavalle You should be able to measure how much time is actually spent now on resource_extend operations, and then we can know better how much (if anything) we can win there
14:16:58 slaweq: Correct, which I think is a natural prerequisite for this.
14:18:28 so njohnston, do You think we should get back to this RFE when You have done some profiling?
14:19:11 why not approve the RFE and require a PoC
14:19:27 I am fine with either option
14:19:32 where njohnston writes the PoC and I help with the profiling
14:19:57 mlavalle: that would work for me too
14:20:09 haleyb: yamamoto: any opinions?
14:20:34 fine with either way
14:20:52 i think it would be great to get this done, as we know more users are using bulk ops
14:21:15 ok, so let's approve the RFE and I will add a comment about the PoC and profiling it
14:22:35 profiling before and after
14:23:14 mlavalle: correct
14:23:18 The PoC will start disabled, so it will be functionally the same as master... then it should be a one-liner to enable it for profiling
14:23:26 to make it easy
14:23:30 yeap
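For context on the mechanism being discussed: this is roughly how a plugin registers a per-resource extender with neutron-lib's resource_extend today, alongside a purely hypothetical bulk-aware extender of the kind the RFE asks about. The bulk method and the 'example:attr' field are illustrative assumptions, not existing neutron-lib API.

```python
# Rough sketch only. The per-item registration mirrors how in-tree plugins
# use neutron_lib.db.resource_extend today; the "bulk" part is a hypothetical
# illustration of what the RFE discusses, not existing API.
from neutron_lib.api.definitions import port as port_def
from neutron_lib.db import resource_extend


@resource_extend.has_resource_extenders
class ExamplePlugin(object):

    @staticmethod
    @resource_extend.extends([port_def.COLLECTION_NAME])
    def _extend_port_dict(port_res, port_db):
        # Today the server calls resource_extend.apply_funcs('ports', ...)
        # once per port, so N bulk-created ports mean N calls (and possibly
        # N extra DB lookups) per registered extender.
        port_res['example:attr'] = getattr(port_db, 'example_attr', None)
        return port_res

    @staticmethod
    def _extend_port_dicts_bulk(ports):
        # Hypothetical opt-in bulk extender: receives all
        # (port_dict, port_db) pairs from one bulk request, so shared data
        # could be fetched once instead of once per port.
        for port_res, port_db in ports:
            port_res['example:attr'] = getattr(port_db, 'example_attr', None)
```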
14:24:09 ok, let's move on then
14:24:13 next RFE
14:24:15 https://bugs.launchpad.net/neutron/+bug/1825345
14:24:15 Launchpad bug 1825345 in neutron "[RFE] admin-state-down doesn't evacuate bindings in the dhcp_agent_id column" [Wishlist,Confirmed]
14:24:30 this one was discussed some time ago already
14:24:37 recently I talked with zigo about it on IRC
14:24:57 and I wrote a summary of the pros and cons of the 2 possible solutions in my last comment
14:25:44 so I wanted to discuss here which option is in Your opinion better (or maybe there is some 3rd one?) and what we will finally do with this RFE
14:29:29 with the client-side implementation, we have the drawback of a slower evacuation, but this is also a pro (see server-side cons: possible congestion)
14:29:33 Am I correct?
14:29:53 ralonsoh: yes, that is correct IMO
14:30:13 so +1 to client (easier, no API, no server change)
14:31:18 it reminds me a little of the StarlingX proposal for automatic re-balancing, but not exactly
14:31:39 I had the same thought, haleyb
14:32:06 yeah, but in this case the rebalancing is manual
14:32:20 it's under the control of the admin
14:32:21 that was a little different, I think it might be OK to do this in the server, maybe just in two steps - --disable, then --evacuate
14:32:45 mlavalle: agreed, automatic is not preferred by me
14:33:53 slaweq: in some cases, for example when dhcp_agents_per_network (or whatever that is) == number of controllers, evacuating an agent might do nothing, since the network is already scheduled on the others
14:33:56 haleyb: I agree, this should be done only on the admin's request
14:34:29 we still have the issue of being unbalanced when the agent is brought back
14:35:05 haleyb: but that's IMO a different problem than this RFE is trying to address
14:35:27 slaweq: right, just an obvious comment for another RFE :)
14:35:32 :)
14:36:17 how would the --evacuate option look at the REST API level?
14:37:22 yamamoto: I guess it should be a new API call, something like /agents/<agent-id>/evacuate
14:37:31 but that's only my assumption now
14:39:05 but maybe yamamoto is asking more about the mechanics of how it would work
14:40:17 mlavalle: that I don't know, it's zigo's proposal
14:40:47 I wonder if it's like get-me-a-network, a REST call that just orchestrates a few other operations under the hood
14:41:04 most likely
14:41:41 njohnston: probably it would be like that
14:42:19 is there any precedent where we have this kind of automation in the client?
14:44:29 IIUC there are some plans to add project deletion (like ospurge) into the OpenStack SDK
14:44:44 but it's for sure not implemented there yet
14:45:27 and that is better as a client-based thing because it spans multiple services
14:49:18 njohnston: I agree
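As a rough illustration of the client-side option discussed above, an "evacuate" command could be little more than a loop over the existing agent-scheduler calls. The sketch below uses openstacksdk network-proxy method names as understood at the time (get_agent, update_agent, dhcp_agent_hosting_networks, remove_dhcp_agent_from_network, network_hosting_dhcp_agents, add_dhcp_agent_to_network); they should be verified against the SDK release in use, the cloud name is a placeholder, and the target-agent choice is deliberately naive.

```python
# Illustrative client-side evacuation sketch, not an agreed design.
import openstack

conn = openstack.connect(cloud='mycloud')  # placeholder clouds.yaml entry

agent_id = 'e865d619-b122-4234-aebb-3f5c24df1c8e'  # DHCP agent to evacuate
agent = conn.network.get_agent(agent_id)

# Take the agent out of scheduling first (the admin-state-down step).
conn.network.update_agent(agent, admin_state_up=False)

# Candidate agents that could take over (liveness/admin-state checks omitted).
others = [a for a in conn.network.agents()
          if a.agent_type == 'DHCP agent' and a.id != agent_id]

for net in conn.network.dhcp_agent_hosting_networks(agent):
    conn.network.remove_dhcp_agent_from_network(agent, net)
    # If dhcp_agents_per_network already places the network on enough other
    # agents, nothing more is needed; otherwise rebind it explicitly.
    if not list(conn.network.network_hosting_dhcp_agents(net)) and others:
        conn.network.add_dhcp_agent_to_network(others[0], net)
```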
14:50:36 nova has evacuate of VMs from a host, right?
14:50:44 yes
14:50:46 and it's done on the server side, is that correct?
14:50:52 yes
14:52:06 do You know about other projects who provide such an option for some resources maybe?
14:52:07 https://docs.openstack.org/api-ref/compute/?expanded=evacuate-server-evacuate-action-detail
14:52:33 it is an action you perform on a server
14:53:05 so maybe we should do it also on the server side to be kind of "consistent" with e.g. nova?
14:53:57 and the 'host' parameter that goes in the request body is optional
14:54:11 if not specified, the scheduler picks a destination host
14:54:53 ahh, but it's to evacuate a "VM"
14:55:10 so in our case it would be "evacuate network X from agent Y"
14:55:50 yeah
14:56:02 but it's a good parallel
14:56:15 but for that we already have an API
14:56:18 most likely that's where zigo got the idea from
14:56:43 https://docs.openstack.org/api-ref/network/v2/?expanded=schedule-a-network-to-a-dhcp-agent-detail#agents
14:57:06 what we are potentially missing is something like "evacuate all networks from agent Y"
14:57:32 why not clarify that in the RFE?
14:57:34 so it would be something a bit different from what "nova's evacuate" is
14:57:55 ok, I will ask to clarify that in the RFE
14:58:03 and we will get back to this one once again
14:58:07 ok for You?
14:58:54 +1
14:58:59 if you read the initial description in the RFE, he proposes: openstack network agent evacuate e865d619-b122-4234-aebb-3f5c24df1c8e
14:59:19 which means network by network
14:59:58 mlavalle: do You think that "e865d619-b122-4234-aebb-3f5c24df1c8e" is a network id in this example?
15:00:00 in other words, it is a parallel to the Nova evacuate
15:00:01 or an agent id?
15:00:12 IMO it's an agent id
15:00:13 I think it's an agent id
15:00:23 you are right, it is an agent id
15:00:26 (time is up)
15:00:30 ok, have to finish now
15:00:34 thx for attending
15:00:38 o/
15:00:38 and have a great weekend
15:00:40 o/
15:00:43 #endmeeting
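For reference, the per-network agent-scheduler calls from the api-ref linked in the discussion already exist; the final call below is only the hypothetical "evacuate all networks from agent Y" shape raised in the meeting, loosely modelled on Nova's server evacuate action, and does not exist in the Networking API. Endpoint URL and token are placeholders.

```python
# Sketch of the REST calls discussed above; only the last one is hypothetical.
import requests

NEUTRON = 'http://controller:9696/v2.0'   # assumed neutron endpoint
HEADERS = {'X-Auth-Token': 'TOKEN'}       # assumed auth token
AGENT = 'e865d619-b122-4234-aebb-3f5c24df1c8e'
NETWORK = 'NET_ID'                        # placeholder network id

# Existing: list the networks hosted by a DHCP agent.
requests.get(f'{NEUTRON}/agents/{AGENT}/dhcp-networks', headers=HEADERS)

# Existing: remove one network from the agent / schedule it to an agent.
requests.delete(f'{NEUTRON}/agents/{AGENT}/dhcp-networks/{NETWORK}',
                headers=HEADERS)
requests.post(f'{NEUTRON}/agents/{AGENT}/dhcp-networks',
              json={'network_id': NETWORK}, headers=HEADERS)

# Hypothetical: evacuate everything from the agent in one call.
requests.post(f'{NEUTRON}/agents/{AGENT}/evacuate', json={}, headers=HEADERS)
```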