15:01:24 <ihrachys> #startmeeting neutron_upgrades
15:01:25 <openstack> Meeting started Mon Mar  6 15:01:24 2017 UTC and is due to finish in 60 minutes.  The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:27 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:27 <electrocucaracha> o/
15:01:29 <openstack> The meeting name has been set to 'neutron_upgrades'
15:01:36 <dasanind> o/
15:01:45 <ihrachys> hi everyone
15:01:58 <ihrachys> #link https://wiki.openstack.org/wiki/Meetings/Neutron-Upgrades-Subteam Agenda
15:02:13 <ihrachys> I updated the page ^ somewhat to reflect things we track more closely
15:02:31 <ihrachys> #topic Announcements
15:02:48 <ihrachys> Atlanta PTG happened!
15:03:17 <ihrachys> you can find a recap for upgrades topics touched at:
15:03:25 <ihrachys> #link http://lists.openstack.org/pipermail/openstack-dev/2017-March/113371.html PTG upgrades recap
15:03:49 <ihrachys> thanks to dasanind and other participants who helped to prepare the report
15:04:42 <ihrachys> it's a long read I admit, so I guess we can move on and then discuss specifics at the next meeting if needed
15:05:59 <electrocucaracha> +1
15:06:06 <ihrachys> this meeting is the first after PTG and in the new Pike cycle, so I would like to review action items so far tracked by our team, and see if anything is no longer valid and hence should not be tracked
15:06:55 <ihrachys> #topic Linuxbridge multinode grenade job
15:07:22 <manjeets> o/
15:08:18 <ihrachys> the job has not been making any progress for quite some time, as can be seen on the grafana dashboard
15:08:19 <ihrachys> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate?panelId=8&fullscreen
15:08:34 <ihrachys> (it's gate-grenade-dsvm-neutron-linuxbridge-multinode-ubuntu-xenial-nv)
15:09:11 <ihrachys> I was initially tracking it, and at one point we were even quite close to getting it to pass and making it voting, but then events happened
15:09:26 <ihrachys> and I haven't found the time (or the motivation) to get back to it
15:09:44 <ihrachys> which makes me wonder - is anyone in the group interested in taking it over?
15:10:19 <ihrachys> I talked to kevinbenton during ptg and he said he will take it over
15:10:26 <ihrachys> or will try to
15:10:51 <ihrachys> the question is whether anyone from our side is interested in making an effort and updating the group about its progress
15:11:06 <ihrachys> if not, I will need to drop it from the agenda (and move into backlog)
15:11:36 <manjeets> ihrachys, I can help with that
15:11:49 <manjeets> but may need some guidance sometimes
15:12:43 <ihrachys> manjeets: ok I assume you take it over
15:13:03 <ihrachys> #action ihrachys to follow up with manjeets on next steps for linuxbridge multinode grenade
15:13:22 <ihrachys> note that ovs grenade multinode is doing well, it's just linuxbridge backend that would need some love
15:13:31 <manjeets> okay
15:13:44 <ihrachys> manjeets: thanks for taking over
15:13:56 <ihrachys> #topic Mixed server versions
15:14:41 <ihrachys> to recap for those new to the subteam, it's to support what's defined by the following tag: https://governance.openstack.org/tc/reference/tags/assert_supports-zero-downtime-upgrade.html
15:15:19 <ihrachys> korzen did some simplified local testing before and validated that Newton -> Ocata upgrade can indeed be executed in mixed mode
15:15:28 <ihrachys> which is a nice achievement already
15:15:43 <ihrachys> there are missing bits that we should close though
15:16:01 <ihrachys> one is online data migration framework
15:16:07 <manjeets> so are we close to minimal needed
15:16:35 <ihrachys> manjeets: yeah, though we will need some more bits, and gating, to get the tag
15:17:10 <ihrachys> one missing bit is a new neutron-db-manage command to enforce data migration between tables. electrocucaracha proposed one before as WIP: https://review.openstack.org/#/c/432494/
15:17:39 <ihrachys> electrocucaracha: what's the status of it, and are you blocked on making progress there? do you expect some more reviews before proceeding with it?
15:18:02 <electrocucaracha> ihrachys: well I was expecting some comments or anything about it during the PTG
15:18:19 <electrocucaracha> ihrachys: if that's a good idea or if we can use a different approach
15:18:37 <electrocucaracha> ihrachys: our major difference from nova is that we use alembic
15:18:50 <electrocucaracha> ihrachys: which makes things a little bit different
15:19:50 * manjeets reviewlist.enqueue( electrocucaracha 's patch)
15:19:54 <ihrachys> electrocucaracha: I think there is agreement it's the right direction, but I get you want some feedback on direction before spending more time on it
15:19:59 <electrocucaracha> ihrachys: on the other hand, I took a pointless example; maybe I can work on a more realistic one
15:20:27 <ihrachys> #action everyone to review online data migration neutron-db-manage command: https://review.openstack.org/#/c/432494/
15:20:51 <ihrachys> electrocucaracha: if you have one, sure
15:22:16 <ihrachys> another missing bit that we discussed quite extensively during PTG is to tackle differences in API behaviour between different major versions of server in a cluster
15:22:48 <ihrachys> basically, the concern is that sometimes it may be unsafe to expose new extensions via /extensions/ API before every node is on the new code
15:23:18 <ihrachys> this is especially problematic in cases where new API is used programmatically (like for new port bindings used by nova)
15:23:37 <ihrachys> in which case you can't just tell your users to avoid using new API before upgrade is complete
15:24:53 <ihrachys> so we were thinking about how to tackle that, and came up with some idea where servers will 1) report their supported extensions to others (probably in db); and 2) use that knowledge to enable/disable api extensions as needed.
15:25:18 <ihrachys> basically, falling back to the common set of extensions supported by all nodes
15:25:27 <manjeets> minimal subset b/w mixed versions
15:25:42 <ihrachys> actually maximum, though common
15:25:49 <manjeets> ok
15:26:12 <ihrachys> which in case of new port bindings would mean that nova will not see new extension supported till we are done with full upgrade, at which point it will switch to using it
15:26:29 <ihrachys> there are details to shuffle and code to write, and it's on me to write it up in specs format
15:27:16 <ihrachys> #action ihrachys to spec a mechanism to tackle differences in the list of extensions exposed by multiple mixed server nodes
15:27:45 <ihrachys> that was merely an update on that happening, and we will discuss it in detail once I have something on gerrit
15:28:25 <ihrachys> before we switch to the next topic, let's also look at how we tackle gating for the mixed servers feature
15:28:28 <manjeets> ihrachys, one question
15:28:33 <ihrachys> manjeets: sure
15:28:46 <manjeets> you said supported extensions
15:29:02 <manjeets> do servers have to register supported or active ones ?
15:29:24 <ihrachys> servers register supported but load common only
15:29:36 <ihrachys> meaning, active <= supported for each specific node
15:29:53 <ihrachys> and active lists are identical on all nodes (except some caveat that I will need to describe in the spec)
15:30:44 <ihrachys> manjeets: does it cover your question?
15:31:12 <manjeets> yes thanks,
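[editor's note] The rule discussed above (each server registers what it supports, and every server loads only the common set) amounts to a set intersection. The following is a minimal sketch under stated assumptions, not the mechanism from the forthcoming spec: the function name, the node names, and the "new-port-binding" alias are hypothetical, and the dict stands in for whatever the servers would report to each other (probably via the db).

```python
from typing import Dict, FrozenSet, Iterable


def common_extensions(supported_by_node: Dict[str, Iterable[str]]) -> FrozenSet[str]:
    """Return the largest set of extension aliases supported by every node.

    `supported_by_node` is a hypothetical mapping of node name to the
    extensions that node reports as supported.
    """
    sets = [frozenset(exts) for exts in supported_by_node.values()]
    if not sets:
        # No node has reported yet, so nothing is safe to expose.
        return frozenset()
    return frozenset.intersection(*sets)


# Hypothetical mixed cluster: one old server, one new server.
reported = {
    "server-old": {"router", "qos"},
    "server-new": {"router", "qos", "new-port-binding"},
}
active = common_extensions(reported)
```

In this sketch the active list is a subset of each node's supported list (active <= supported) and is the same on every node, so a client such as nova would not see the hypothetical new alias until the old server is upgraded and reports it too.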
15:33:01 <ihrachys> ok cool
15:33:08 <ihrachys> so, gating matters for mixed versions
15:33:11 <manjeets> spec will cover more details so good atm
15:33:58 <ihrachys> I think before korzen was tracking gate progress but considering that I hear Artur may leave us for some time, we may need to have another owner
15:34:24 <ihrachys> dolphm: where are we with mixed api grenade gate coverage for nova?
15:35:09 * electrocucaracha doesn't see dolphm at his desk, maybe he is wfh
15:36:48 <ihrachys> ok let me follow up with korzen and dolphm after the meeting
15:37:02 <ihrachys> #action ihrachys to follow up on mixed server version gating
15:37:30 <ihrachys> any questions on the topic before we move to OVO?
15:38:03 <electrocucaracha> nope
15:38:18 <dasanind> ihrachys: will there be a new gate hi for mixed versions?
15:39:19 <dasanind> hi==job
15:39:29 <ihrachys> dasanind: yes, definitely
15:39:38 <ihrachys> dasanind: that's the promise of the governance tag
15:39:47 <ihrachys> we can't claim it until we prove in CI it works
15:39:58 <ihrachys> atm no project does claim that
15:40:08 <manjeets> ihrachys, would it be one job demonstrating rolling upgrades and mixed version stuff
15:40:12 <manjeets> or we need 2 ?
15:40:48 <ihrachys> manjeets: that's a very good question actually
15:41:07 <ihrachys> we can probably run two-node, one is all old, and second is just new neutron-server
15:41:21 <ihrachys> then we will cover both rpc compatibility as well as database rolling
15:41:32 <ihrachys> amirite?
15:41:54 <manjeets> right until they're actually tested
15:41:55 <manjeets> lol
15:43:43 <ihrachys> we will still need to have two jobs for some time
15:43:56 <ihrachys> because we can't safely roll in drastic changes into existing voting jobs
15:44:18 <ihrachys> but once we are sure it works we will be able to kill the existing one
15:44:28 <ihrachys> ok let's move to OVO
15:44:31 <ihrachys> #topic Object implementation
15:44:44 <ndahiwade> ihrachys, https://review.openstack.org/#/c/370452/
15:44:57 <ndahiwade> Is ready for review
15:45:00 <ihrachys> ok
15:45:07 <ihrachys> during ptg and right after we finally made some progress on OVO devref
15:45:17 <electrocucaracha> actually I have other three that are also ready for review
15:45:33 <ihrachys> thanks to korzen and dasm and others, the patch got first +2: https://review.openstack.org/#/c/336518/
15:45:58 <ihrachys> electrocucaracha: understood
15:46:09 <manjeets> https://review.openstack.org/#/c/336518/
15:46:19 <manjeets> https://review.openstack.org/#/c/353088
15:46:37 <ihrachys> ok, ok :)
15:46:44 <ihrachys> I wanted to discuss something more general
15:46:55 <ihrachys> several patches were struggling with lock_for_update
15:47:07 * electrocucaracha was tempted to put links also here
15:47:14 <ihrachys> segmentation allocations (vlan and tunnels), also quotas
15:48:17 <ihrachys> we discussed the matter with kevinbenton during ptg and tend to agree that instead of exposing lock_for_update semantics through the OVO layer, we'd better kill the remaining places that still use the lock, rewriting them using a compare-and-swap technique and such
15:48:36 <ihrachys> for that matter, kevinbenton already proposed a patch for allocations: https://review.openstack.org/#/c/438144/
15:49:00 <ihrachys> we will need another one to make progress on quotas
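[editor's note] For readers unfamiliar with the compare-and-swap idea mentioned above: instead of locking a row with SELECT ... FOR UPDATE and then writing it, the expected old value goes into the WHERE clause of the UPDATE, and the affected row count tells the caller whether it won. The following is only an illustrative sketch using the standard library sqlite3 module; the table and column names are hypothetical and it is not the code from the patch linked above.

```python
import sqlite3


def try_allocate(conn: sqlite3.Connection, vlan_id: int) -> bool:
    """Claim a VLAN without taking a row lock first.

    The UPDATE matches the row only if it is still unallocated, so the
    check and the write happen in one statement.
    """
    cur = conn.execute(
        "UPDATE vlan_allocations SET allocated = 1 "
        "WHERE vlan_id = ? AND allocated = 0",
        (vlan_id,),
    )
    conn.commit()
    # One row changed means we won; zero means someone else got there first.
    return cur.rowcount == 1


# Hypothetical schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE vlan_allocations ("
    "vlan_id INTEGER PRIMARY KEY, allocated INTEGER NOT NULL)"
)
conn.execute("INSERT INTO vlan_allocations VALUES (100, 0)")
conn.commit()

first = try_allocate(conn, 100)   # succeeds, the row was free
second = try_allocate(conn, 100)  # fails, the row is already taken
```

A caller that loses the race would typically pick another candidate and retry rather than block, which is the trade-off of dropping the lock.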
15:49:58 * manjeets will try to handle it today
15:50:22 <ihrachys> I also took another look at tonytan4ever's LIKE support for get_objects: https://review.openstack.org/#/c/419152/ and it seems rather fine though we will need the same for delete_objects and count, so a respin will be needed.
15:50:32 <ihrachys> manjeets: ok cool
15:50:35 <tonytan4ever> thanks.
15:50:49 <ihrachys> #action manjeets to remove lock_for_update for quotas db code
15:51:11 <tonytan4ever> I am looking forward to comments and adding stuff for delete_object.
15:52:01 <ihrachys> ok as for other patches, we will eventually get to them
15:52:16 <ihrachys> I encourage everyone to cross review
15:52:38 <ihrachys> any questions before we move on?
15:52:49 <electrocucaracha> ihrachys: there are 10 mins remaining for this meeting, I'd like to discuss if there is a way that we can focus on a subgroup of patches to make more progress
15:53:30 <electrocucaracha> ihrachys: I mean, all the patches are super important to review, but there is an order on them
15:53:43 <electrocucaracha> ihrachys: like the segments that you mentioned before
15:54:16 <ihrachys> electrocucaracha: I think the most important bits right now are actually not OVO (except port bindings work) but online data migration
15:54:28 <electrocucaracha> ihrachys: I was thinking to use the spreadsheet to highlight those ones which we can consider as ready
15:55:28 <electrocucaracha> ihrachys: that means the OVO patches will stay as they are until we merge other things?
15:55:35 <ihrachys> electrocucaracha: I tend to review what goes my way, not what's in a spreadsheet. because I forget things and so regular pings help in my case.
15:55:57 <ihrachys> electrocucaracha: some OVO patches can make progress without waiting on other things, it's not black and white
15:56:22 <ihrachys> but f.e. quotas are blocked, so we wait there
15:56:45 <ihrachys> electrocucaracha: let's put that topic into the agenda for the next meeting. we will need more time to chew it.
15:56:53 <electrocucaracha> ihrachys: I was looking for mechanisms to narrow down and focus the code reviews in order to accelerate things
15:56:54 <ihrachys> electrocucaracha: would you mind updating the wiki page?
15:57:08 <electrocucaracha> ihrachys: I can do it
15:57:32 <ihrachys> electrocucaracha: come up with one patch every day that your fellows should review NOW and send it my way, and others' way
15:57:52 <ihrachys> and I will try to prioritize on spot
15:58:03 <electrocucaracha> +1 ^
15:58:36 <electrocucaracha> actually it was a request from armax a couple of neutron meetings ago
15:58:42 <ihrachys> my attention span is narrow (I am probably too old already?), you need to bear with me :)
15:59:11 <ihrachys> electrocucaracha: ok I haven't noticed that request. probably missed the meeting or smth.
15:59:13 * electrocucaracha 1 minute
15:59:27 <ihrachys> electrocucaracha: ok please propose the topic and we will dedicate time
15:59:31 <ihrachys> thanks everyone
15:59:34 <ihrachys> #endmeeting