15:01:24 <ihrachys> #startmeeting neutron_upgrades
15:01:25 <openstack> Meeting started Mon Mar 6 15:01:24 2017 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:27 <electrocucaracha> o/
15:01:29 <openstack> The meeting name has been set to 'neutron_upgrades'
15:01:36 <dasanind> o/
15:01:45 <ihrachys> hi everyone
15:01:58 <ihrachys> #link https://wiki.openstack.org/wiki/Meetings/Neutron-Upgrades-Subteam Agenda
15:02:13 <ihrachys> I updated the page ^ somewhat to reflect things we track more closely
15:02:31 <ihrachys> #topic Announcements
15:02:48 <ihrachys> Atlanta PTG happened!
15:03:17 <ihrachys> you can find a recap for upgrades topics touched at:
15:03:25 <ihrachys> #link http://lists.openstack.org/pipermail/openstack-dev/2017-March/113371.html PTG upgrades recap
15:03:49 <ihrachys> thanks to dasanind and other participants who helped to prepare the report
15:04:42 <ihrachys> it's a long read I admit, so I guess we can move on and then discuss specifics on a next meeting if needed
15:05:59 <electrocucaracha> +1
15:06:06 <ihrachys> this meeting is the first after PTG and in the new Pike cycle, so I would like to review action items so far tracked by our team, and see if anything is no longer valid and hence should not be tracked
15:06:55 <ihrachys> #topic Linuxbridge multinode grenade job
15:07:22 <manjeets> o/
15:08:18 <ihrachys> the job is not progressing anywhere for quite some time, as can be seen at grafana dashboard
15:08:19 <ihrachys> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=8&fullscreen
15:08:34 <ihrachys> (it's gate-grenade-dsvm-neutron-linuxbridge-multinode-ubuntu-xenial-nv)
15:09:11 <ihrachys> I was initially tracking it, and once we were even quite close to pass it and make it voting, but then events happened
15:09:26 <ihrachys> and I haven't found time to get back to it (neither motivation)
15:09:44 <ihrachys> which makes me wonder - is anyone in the group interested in taking it over?
15:10:19 <ihrachys> I talked to kevinbenton during ptg and he said he will take it over
15:10:26 <ihrachys> or will try to
15:10:51 <ihrachys> the question is, whether anyone is interested from our side to make an effort and update the group about its progress
15:11:06 <ihrachys> if not, I will need to drop it from the agenda (and move into backlog)
15:11:36 <manjeets> ihrachys, I can help with that
15:11:49 <manjeets> but may need some guidance some times
15:12:43 <ihrachys> manjeets: ok I assume you take it over
15:13:03 <ihrachys> #action ihrachys to follow up with manjeets on next steps for linuxbridge multinode grenade
15:13:22 <ihrachys> note that ovs grenade multinode is doing well, it's just linuxbridge backend that would need some love
15:13:31 <manjeets> okay
15:13:44 <ihrachys> manjeets: thanks for taking over
15:13:56 <ihrachys> #topic Mixed server versions
15:14:41 <ihrachys> to recap for those new to the subteam, it's to support what's defined by the following tag: https://governance.openstack.org/tc/reference/tags/assert_supports-zero-downtime-upgrade.html
15:15:19 <ihrachys> korzen did some simplified local testing before and validated that Newton -> Ocata upgrade can indeed be executed in mixed mode
15:15:28 <ihrachys> which is a nice achievement already
15:15:43 <ihrachys> there are missing bits that we should close though
15:16:01 <ihrachys> one is online data migration framework
15:16:07 <manjeets> so are we close to minimal needed
15:16:35 <ihrachys> manjeets: yeah, though we will need some more bits, and gating, to get the tag
15:17:10 <ihrachys> one missing bit is a new neutron-db-manage command to enforce data migration between tables. electrocucaracha proposed one before as WIP: https://review.openstack.org/#/c/432494/
15:17:39 <ihrachys> electrocucaracha: what's the status of it, and are you blocked on making progress there? do you expect some more reviews before proceeding with it?
15:18:02 <electrocucaracha> ihrachys: well I was expecting some comments or anything about it during the PTG
15:18:19 <electrocucaracha> ihrachys: if that's a good idea or if we can use a different approach
15:18:37 <electrocucaracha> ihrachys: our major difference against nova is that we use alembic
15:18:50 <electrocucaracha> ihrachys: which makes the things a little bit different
15:19:50 * manjeets reviewlist.enqueue( electrocucaracha 's patch)
15:19:54 <ihrachys> electrocucaracha: I think there is agreement it's the right direction, but I get you want some feedback on direction before spending more time on it
15:19:59 <electrocucaracha> ihrachys: on the other hand, I took a pointless example maybe I can work in a more realistic one
15:20:27 <ihrachys> #action everyone to review online data migration neutron-db-manage command: https://review.openstack.org/#/c/432494/
15:20:51 <ihrachys> electrocucaracha: if you have one, sure
15:22:16 <ihrachys> another missing bit that we discussed quite extensively during PTG is to tackle differences in API behaviour between different major versions of server in a cluster
15:22:48 <ihrachys> basically, the concern is that sometimes it may be unsafe to expose new extensions via /extensions/ API before every node is on the new code
15:23:18 <ihrachys> this is especially problematic in cases where new API is used programmatically (like for new port bindings used by nova)
15:23:37 <ihrachys> in which case you can't just tell your users to avoid using new API before upgrade is complete
15:24:53 <ihrachys> so we were thinking about how to tackle that, and came up with some idea where servers will 1) report their supported extensions to others (probably in db); and 2) use that knowledge to enable/disable api extensions as needed.
15:25:18 <ihrachys> basically, falling back to the common set of extensions supported by all nodes
15:25:27 <manjeets> minimal subset b/w mixed versions
15:25:42 <ihrachys> actually maximum, though common
15:25:49 <manjeets> ok
15:26:12 <ihrachys> which in case of new port bindings would mean that nova will not see new extension supported till we are done with full upgrade, at which point it will switch to using it
15:26:29 <ihrachys> there are details to shuffle and code to write, and it's on me to write it up in specs format
15:27:16 <ihrachys> #action ihrachys to spec a mechanism to tackle differences in the list of extensions exposed by multiple mixed server nodes
15:27:45 <ihrachys> that was merely an update on that happening, and we will discuss it in detail once I have something on gerrit
15:28:25 <ihrachys> before we switch to the next topic, let's also look at how we tackle gating for the mixed servers feature
15:28:28 <manjeets> ihrachys, one question
15:28:33 <ihrachys> manjeets: sure
15:28:46 <manjeets> you said supported extensions
15:29:02 <manjeets> do servers have to register supported or active ones ?
15:29:24 <ihrachys> servers register supported but load common only
15:29:36 <ihrachys> meaning, active <= supported for each specific node
15:29:53 <ihrachys> and active lists are identical on all nodes (except some caveat that I will need to describe in the spec)
15:30:44 <ihrachys> manjeets: does it cover your question?
15:31:12 <manjeets> yes thanks,
15:33:01 <ihrachys> ok cool
15:33:08 <ihrachys> so, gating matters for mixed versions
15:33:11 <manjeets> spec will cover more details so good atm
15:33:58 <ihrachys> I think before korzen was tracking gate progress but considering that I hear Artur may leave us for some time, we may need to have another owner
15:34:24 <ihrachys> dolphm: where are we with mixed api grenade gate coverage for nova?
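[Note on the extension idea discussed at 15:24-15:30: the "maximum, though common" set is the intersection of what each server reports as supported. The sketch below only illustrates that set arithmetic. The function name, the node names, and the "hypothetical-new-binding" alias are assumptions made up for this example; the log does not say how the spec will store or compute the lists.]

```python
from typing import Dict, Iterable, Set


def common_extensions(supported_by_node: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the largest set of extension aliases supported by every node.

    `supported_by_node` maps a (hypothetical) server identifier to the
    aliases that server reports as supported.
    """
    alias_sets = [set(aliases) for aliases in supported_by_node.values()]
    if not alias_sets:
        # No server has reported anything yet, so nothing is common.
        return set()
    return set.intersection(*alias_sets)


# Illustrative data only: one old server and one upgraded server.
reported = {
    "server-old": {"router", "quotas", "qos"},
    "server-new": {"router", "quotas", "qos", "hypothetical-new-binding"},
}

# Every node would load this same active list; the new alias stays
# hidden until all servers report it.
active = common_extensions(reported)
```

[With this data, active is a subset of each node's supported set, which matches "active <= supported for each specific node" at 15:29:36.]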
15:35:09 * electrocucaracha doesn't see dolphm in his desk, maybe is wfh
15:36:48 <ihrachys> ok let me follow up with korzen and dolphm after the meeting
15:37:02 <ihrachys> #action ihrachys to follow up on mixed server version gating
15:37:30 <ihrachys> any questions on the topic before we move to OVO?
15:38:03 <electrocucaracha> nope
15:38:18 <dasanind> ihrachys: will there be a new gate hi for mixed versions?
15:39:19 <dasanind> hi==job
15:39:29 <ihrachys> dasanind: yes, definitely
15:39:38 <ihrachys> dasanind: that's the promise of the governance tag
15:39:47 <ihrachys> we can't claim it until we prove in CI it works
15:39:58 <ihrachys> atm no project does claim that
15:40:08 <manjeets> ihrachys, would it be one job demonstrating rolling upgrades and mixed version stuff
15:40:12 <manjeets> or we need 2 ?
15:40:48 <ihrachys> manjeets: that's a very good question actually
15:41:07 <ihrachys> we can probably run two-node, one is all old, and second is just new neutron-server
15:41:21 <ihrachys> then we will cover both rpc compatibility as well as database rolling
15:41:32 <ihrachys> amirite?
15:41:54 <manjeets> right until they're actually tested
15:41:55 <manjeets> lol
15:43:43 <ihrachys> we will still need to have two jobs for some time
15:43:56 <ihrachys> because we can't safely roll in drastic changes into existing voting jobs
15:44:18 <ihrachys> but once we are sure it works we will be able to kill the existing one
15:44:28 <ihrachys> ok let's move to OVO
15:44:31 <ihrachys> #topic Object implementation
15:44:44 <ndahiwade> ihrachys, https://review.openstack.org/#/c/370452/
15:44:57 <ndahiwade> Is ready for review
15:45:00 <ihrachys> ok
15:45:07 <ihrachys> during ptg and right after we finally made some progress on OVO devref
15:45:17 <electrocucaracha> actually I have other three that are also ready for review
15:45:33 <ihrachys> thanks to korzen and dasm and others, the patch got first +2: https://review.openstack.org/#/c/336518/
15:45:58 <ihrachys> electrocucaracha: understood
15:46:09 <manjeets> https://review.openstack.org/#/c/336518/
15:46:19 <manjeets> https://review.openstack.org/#/c/353088
15:46:37 <ihrachys> ok, ok :)
15:46:44 <ihrachys> I wanted to discuss something more general
15:46:55 <ihrachys> several patches were struggling with lock_for_update
15:47:07 * electrocucaracha was tempted to put links also here
15:47:14 <ihrachys> segmentation allocations (vlan and tunnels), also quotas
15:48:17 <ihrachys> we discussed the matter with kevinbenton during ptg and tend to agree that instead of exposing lock_for_update semantics through OVO layer, we better kill those remaining places that still use the lock, rewriting it using compare-and-swap technique and such
15:48:36 <ihrachys> for that matter, kevinbenton already proposed a patch for allocations: https://review.openstack.org/#/c/438144/
15:49:00 <ihrachys> we will need another one to make progress on quotas
15:49:58 * manjeets will try to handle it today
15:50:22 <ihrachys> I also took another look at tonytan4ever's LIKE support for get_objects: https://review.openstack.org/#/c/419152/ and it seems rather fine though we will need the same for delete_objects and count, so a respin will be needed.
15:50:32 <ihrachys> manjeets: ok cool
15:50:35 <tonytan4ever> thanks.
15:50:49 <ihrachys> #action manjeets to remove lock_for_update for quotas db code
15:51:11 <tonytan4ever> I am looking forward for comments and adding stuff for delete_object.
15:52:01 <ihrachys> ok as for other patches, we will eventually get to them
15:52:16 <ihrachys> I encourage everyone to cross review
15:52:38 <ihrachys> any questions before we move on?
15:52:49 <electrocucaracha> ihrachys: there are 10 mins remaining for this meeting, I'd like to discuss if there is a way that we can focus in a subgroup of patches to have more progress
15:53:30 <electrocucaracha> ihrachys: I mean, all the patches are super important to review, but there is an order on them
15:53:43 <electrocucaracha> ihrachys: like the segments that you mentioned before
15:54:16 <ihrachys> electrocucaracha: I think the most important bits right now are actually not OVO (except port bindings work) but online data migration
15:54:28 <electrocucaracha> ihrachys: I was thinking to use the spreadsheet to highlight those ones which we can consider as ready
15:55:28 <electrocucaracha> ihrachys: that means the OVO patches will keep as they are until we merge other things?
15:55:35 <ihrachys> electrocucaracha: I tend to review what goes my way, not what's in a spreadsheet. because I forget things and so regular pings help in my case.
15:55:57 <ihrachys> electrocucaracha: some OVO patches can make progress without waiting on other things, it's not black and white
15:56:22 <ihrachys> but f.e. quotas are blocked, so we wait there
15:56:45 <ihrachys> electrocucaracha: let's put that topic into the agenda for the next meeting. we will need more time to chew it.
15:56:53 <electrocucaracha> ihrachys: I was looking for mechanisms to narrow down and focus the code reviews in order to accelerate things
15:56:54 <ihrachys> electrocucaracha: would you mind updating the wiki page?
15:57:08 <electrocucaracha> ihrachys: I can do it
15:57:32 <ihrachys> electrocucaracha: come up with one patch every day that your fellows should review NOW and send it my way, and other's way
15:57:52 <ihrachys> and I will try to prioritize on spot
15:58:03 <electrocucaracha> +1 ^
15:58:36 <electrocucaracha> actually it was a request from armax a couple of neutron meetings ago
15:58:42 <ihrachys> my attention span is narrow (I am probably too old already?), you need to bear with me :)
15:59:11 <ihrachys> electrocucaracha: ok I haven't noticed that request. probably missed the meeting or smth.
15:59:13 * electrocucaracha 1 minute
15:59:27 <ihrachys> electrocucaracha: ok please propose the topic and we will dedicate time
15:59:31 <ihrachys> thanks everyone
15:59:34 <ihrachys> #endmeeting
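[Note on the lock_for_update discussion at 15:48: a compare-and-swap style allocation can be expressed as a conditional UPDATE whose WHERE clause repeats the expected old state, then checking the affected row count. The sketch below is a generic illustration using Python's stdlib sqlite3. The table name, column names, and helper function are hypothetical, and it is not meant to reproduce the patch linked in the log or Neutron's actual schema.]

```python
import sqlite3
from typing import Iterable, Optional


def allocate_segment(conn: sqlite3.Connection,
                     candidates: Iterable[int]) -> Optional[int]:
    """Claim one free VLAN id without SELECT ... FOR UPDATE.

    The UPDATE succeeds only if the row is still unallocated (the
    "compare"); flipping the flag is the "swap". If another worker
    claimed the row first, rowcount is 0 and the next candidate is tried.
    """
    for vlan_id in candidates:
        cur = conn.execute(
            "UPDATE vlan_allocations SET allocated = 1 "
            "WHERE vlan_id = ? AND allocated = 0",
            (vlan_id,),
        )
        conn.commit()
        if cur.rowcount == 1:
            return vlan_id
    return None  # every candidate was already taken


# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vlan_allocations ("
    "vlan_id INTEGER PRIMARY KEY, allocated INTEGER NOT NULL)"
)
conn.executemany(
    "INSERT INTO vlan_allocations VALUES (?, 0)", [(100,), (101,)]
)
conn.commit()

first = allocate_segment(conn, [100, 101])
second = allocate_segment(conn, [100, 101])
third = allocate_segment(conn, [100, 101])
```

[The design point is that no row lock is held between reading and writing; a lost race shows up as zero affected rows and is handled by retrying with another candidate.]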