15:01:35 #startmeeting neutron_upgrades
15:01:37 Meeting started Mon Mar 7 15:01:35 2016 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:38 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:39 hi
15:01:40 The meeting name has been set to 'neutron_upgrades'
15:01:45 hello
15:01:46 hello
15:01:52 hi :)
15:02:05 #topic Announcements
15:02:36 so we have the code sprint next week March 14-16 in Brno. I guess everyone who planned to go already booked hotels/flights.
15:03:00 it's worth working on the etherpad for the sprint before next week
15:03:01 #link https://etherpad.openstack.org/p/code-sprint-neutron-objects-brno
15:03:21 right :)
15:03:23 I have added a few points worth discussing
15:03:35 korzen is cool, so he started a draft of agenda for the sprint, line 42+
15:03:58 * ihrachys is not cool, so he did not
15:04:12 rossella_s: hello
15:04:20 korzen: discussing here or at the event?
15:05:01 Hello, sorry i'm late
15:05:02 we can go quickly right now
15:05:10 korzen: ok, please lead :)
15:05:45 enriquetaso, welcome
15:05:52 ok, so the first one is the obvious: Port all database entities to OVO
15:06:23 I guess that it would be nice to have the list and share the items with volunteers
15:06:42 yay. I guess we should come up with some short-term plan for the sprint that is more realistic.
15:06:46 so if we have new people we can point to the TODO list
15:07:23 let's say we at least solve port and subnets on the sprint and document everything.
15:07:33 in the objectification it would be also nice to have performance comparison before OVO and after
15:07:34 ihrachys, agreed
15:07:49 korzen: rally jobs should give some insight
15:08:23 As we discussed earlier - we can define use cases for OVO
15:08:34 it means, where the OVO would be used
15:08:58 ihrachys, did you hear back from armax, is it confirmed that Newton will be open?
15:09:16 rossella_s: not yet.
I think it was Fri the last time we talked.
15:09:27 and how to go forward with online schema migration and online schema migration
15:09:34 rossella_s: newton will be open as soon as stable/mitaka is open, that should happen at the end of this week
15:09:47 rossella_s: or first thing next week
15:09:55 armax, great, thanks a lot!
15:10:05 thanks armax, great news
15:10:13 armax++
15:10:20 armax, welcome! :)
15:10:34 so ihrachys can we aim at having port and subnet merged and working in Newton after the sprint? :)
15:10:39 so we should be ready to land stuff on the sprint
15:10:45 no problem folks :) keep up the good work!
15:10:51 rossella_s: I think we should, yes. :)
15:11:25 that would be a lot of stuff already, and we'll have many foundational blocks already set in tree for quick proceeding on remaining resources
15:11:37 will port also include the port extensions?
15:11:40 yep that would be perfect
15:11:47 remember, one thing that should come up from the first objects we'll land is a detailed documentation on transition process.
15:11:47 slunkad, yes
15:12:05 slunkad: I think that's a requirement to land port, yes
15:12:07 ihrachys, good reminder
15:12:37 I guess that would be enough work for 3 days ;)
15:12:40 cool
15:13:01 korzen: no time for food or beer! :)
15:13:03 korzen: :-) but I guess we'll need to touch other topics at least to set everyone on the same page for Newton
15:13:25 ok, just joking, what more we can do is Grenade testing
15:13:27 korzen: you had some more topics in the etherpad. dare to elaborate?
15:14:04 Grenade or upgrade testing lacks a dataplane connectivity test in CI
15:14:05 + on grenade. I think we made a huge progress in Mitaka, but it's not enough, we should strive for more :)
15:14:36 we should work on a solution for testing that flows are not dropped
15:15:05 korzen: actually, for agent restart, I saw a patch from hmlnarik lately that had a functional test that validates just that. right rossella_s ?
15:15:19 rossella_s: I saw you were reviewing that test. [unless I hallucinate]
15:15:38 it's not exactly upgrade, but could be a good start.
15:15:40 ihrachys, i don't know how you can keep up with so many things, that's right
15:15:58 that thing:
15:16:00 #link https://review.openstack.org/#/c/284639/8/neutron/tests/functional/agent/test_l2_ovs_agent.py
15:16:06 ihrachys, right it's a good start, it's not about upgrade but about restarting the agent
15:16:35 rossella_s: huh. Hynek sits right next to me in the office, that's how :P
15:16:51 the question is, can we take advantage of it in Grenade tests?
15:16:57 long story short, the test simulates agent restarts and checks that ping works.
15:17:41 ihrachys, oh really? :D I had a conversation with him regarding the test...
15:17:57 korzen: for grenade, I guess no. we would need to come up with some background process running during l2 agent upgrade (that would go to usual, 'non-partial' grenade flavor)
15:18:42 korzen: ok, anything else on your agenda for sprint?
15:19:10 I guess we can also talk about the RPC callback being used for Port or smth
15:19:27 definitely for Newton release
15:20:26 korzen: yes, we may need to start thinking of how we make RPC layer more strict. I don't think we'll spend much time actually coding anything there on the sprint, but it's good to have a brief discussion on the next steps.
15:20:50 and also the reference architecture for rolling upgrade scenario in Neutron
15:21:08 ihrachys, I agree, it definitely needs discussion. We can start during the sprint but I think we will have to talk about it again at the summit
15:21:27 korzen: can you elaborate on the last one? is it some detailed description of upgrade process and what's supported somewhere in user visible docs?
15:22:25 I'm seeing the ref implementation as a description of what Neutron components we are testing
15:22:33 and procedure as well
15:22:40 procedure of upgrades
15:22:58 korzen: something in networking-guide?
what's the audience?
15:23:35 audience should be operators
15:25:10 I'm not sure if it is worth discussing on the code sprint but just my thought
15:25:11 korzen: ok, so I guess it's networking-guide material
15:25:39 I guess we'll discuss a lot of stuff, that's good to collect ideas, even if small :)
15:26:05 ok great. all in all, I encourage everyone to go to the etherpad and build agenda on top of what korzen contributed.
15:26:11 korzen: thanks for doing it.
15:26:20 ihrachys, np.
15:26:30 korzen:++
15:26:44 hi, sorry i'm late
15:26:44 let's move to more general topics (even though we touched some of them already)
15:26:51 #topic Object implementation
15:27:03 for that, there was not much progress merging patches due to Mitaka freeze
15:27:10 as we discussed already, master will be open next week.
15:27:16 attending bug hackathon in the castle
15:27:16 so we'll be able to break the world :)
15:27:42 I suspect there is not much need to cover specific patches for objects.
15:27:52 my patch got merged: https://review.openstack.org/#/c/275790/ composite primary key
15:27:59 yay
15:27:59 overall, the only comment I have is that
15:28:04 korzen: well done
15:28:08 korzen: great
15:28:16 ...if you have patches in review, you better get them in shape for sprint time
15:28:17 korzen, :)
15:28:26 I think the most important one now is: https://review.openstack.org/#/c/283711
15:28:26 Handle synthetic fields in NeutronDbObject
15:28:28 so that we don't lag on merging/reviewing them ;)
15:29:34 yes, so I would work on Subnet and maybe network patches to get up to date :)
15:29:35 I am updating the patch according to korzen's comment
15:29:38 korzen: will need to wait till Mitaka. we could land some custom type patches this week since they are isolated, but that's it.
15:29:39 rossella_s: maybe set topic to ov for https://review.openstack.org/#/c/283711?
15:29:46 *ovo
15:30:17 ihrachys; custom type patches seem to be locked down?
15:30:19 mhickey, right
15:30:35 mhickey: locked down as in '-2 from Armando'?
15:31:05 ihrachys: yes, https://review.openstack.org/#/c/277558/
15:31:14 ok, if something of those patches is really ready to land, we can talk it thru with Armando case by case
15:31:49 strangely enough, cidr is not: https://review.openstack.org/#/c/285349/
15:32:18 yay :)
15:32:30 korzen: lol
15:32:35 a glitch I guess. or the fact that I +2 on the first one, so there was a risk it will land with no approval.
15:32:49 ok, let's solve it off the meeting
15:32:57 #topic Partial Multinode Grenade
15:33:01 ihrachys: maybe not, I might have upset armax at the mid-cycle! :)
15:33:26 for partial, I guess we are waiting for N to open to enable voting for the job
15:33:29 mhickey: far from it!
15:33:35 we also landed DVR experimental job
15:33:40 armax: ciao
15:33:50 mhickey: buongiorno!
15:34:00 mhickey: scusa, buonasera
15:34:07 and I heard Sean made a huge progress on that one, making it pass the vote in the experimental queue
15:34:21 ihrachys: the dvr one?
15:34:24 sadly, Sean is not here right now (he was planning to travel by train, that's why)
15:34:25 yes, the multinode DVR job was merged last week, I have done the initial run: http://logs.openstack.org/50/281850/7/experimental/gate-grenade-dsvm-neutron-dvr-multinode/7b81449/
15:34:42 armax: yes, dvr passed with some patch from Sean, as per Sean himself
15:34:49 ihrachys: nice
15:34:58 armax: I thought we will get details here, but probably not right now
15:35:07 armax: it is huge, not just nice.
15:35:34 ihrachys: you’re younger and more enthusiastic than me
15:35:38 also, armax proposed a patch to add rolling-upgrade tag to neutron
15:35:39 ihrachys: it’s understandable
15:35:40 #link https://review.openstack.org/#/c/286817/
15:35:43 armax: :P
15:36:01 that said, armax seems to be unconvinced what we have justifies the tag right now
15:36:23 if I understand you correctly, you want us to evaluate *aas gates for partial job too, right?
15:36:55 ihrachys: I am mulling over the idea, I am not entirely convinced
15:37:05 ihrachys: but ultimately that’s down to what the stadium looks like
15:37:27 ihrachys: because if we turn out to remove all projects but neutron, then even the question goes away
15:37:50 I guess one thing we may want to deliver to the project after the sprint is some high level description of what we consider a proper subset of upgrade scenarios that would reflect real deployments and won't require a dozen of new jobs.
15:38:26 even if not *aas in the scope, then we have e.g. mixed L3 HA agent versions scenario (raised by Sean today)
15:38:54 any thoughts folks on where we should claim 'done'?
15:38:56 ihrachys, can you elaborate more on the L3 HA scenario?
15:39:38 ihrachys: I don’t think the L3 HA scenario should be added to the mix, but ideally we could simply consider a multinode
15:39:41 korzen: if you run HA router served by two separate nodes, you may want to make sure that it's still working (VRRP talking) when you upgrade a node per step
15:39:41 DVR+HA
15:39:58 I think ultimately the most interesting multinode testing
15:40:09 the others, perhaps we can afford to have them set up as periodic
15:40:41 armax: one more general question to tackle is when we talk about rolling scenarios, do we envision per-service upgrade or per-node upgrade?
15:40:55 ihrachys: what do you mean?
15:40:57 the difference is that in the former case you would want to test e.g. new nova-compute with old l2 agent.
15:41:19 and in the latter, you would assume a compute node runs the same major version of all components of all projects
15:41:29 ihrachys: I think it’s safe to assume the latter
15:41:37 and the same would go about 'networking' node that would run l3 agent and dhcp agent and whatnot
15:41:42 ihrachys: but we can double check with the nova guys
15:42:05 yes, latter should be safe. also it should reflect reality. [at least until major use case is containerized]
15:42:06 typically you’d refresh all deps on a single node
15:42:12 ihrachys: right
15:42:35 armax: I talked to dansmith about that, he was actually pretty happy we DON'T assume per-service upgrades
15:42:43 ihrachys: that said, it should be entirely possible to have nova N to work with Neutron N-1
15:43:12 armax: of course. the question is more about testing matrix than about what we claim to support.
15:43:14 ihrachys: we don’t in fact
15:43:46 ihrachys: there’s not any amount of testing that can replace a good and intelligent review
15:43:52 from a human being
15:43:55 amen on that
15:44:27 are there any teams already doing the rolling upgrades in scale env?
15:44:50 not the dev setup, the large servers with 200+ VMs on it
15:45:54 it would be nice to have tests done in semi automated way testing the project upgrade during last day of dev cycle
15:45:55 not that I know. we have very limited rolling scenario testing in-house
15:46:39 because what we have to do now, is to setup M-3 and check the liberty to M-3 or Mitaka RC1 upgrade
15:47:10 to be sure before release that nothing critical got merdeg during Mitaka
15:47:27 s/merdeg/merged
15:48:05 korzen, I don't think we have anything like that
15:48:22 I hear you. it always comes with the question on who is going to do the testing (and documentation of the process for later upgrade checks)
15:49:25 ihrachys, korzen are we a bit OT now?
15:49:27 overall, that raises the question of neutron not having any major release check list that PTL or someone in charge could go thru and validate that all usual release stuff is covered.
15:49:37 yeah, we probably are :)
15:49:41 let's move on
15:49:53 actually, I have nothing so...
15:49:55 we can discuss that after the meeting
15:50:05 #topic Open discussion
15:50:42 talking about testing L3 HA in CI
15:50:43 I just wanted to say, let's brush up the ovo patches this week so that we can easily merge them during the sprint
15:50:53 we would need to have 3 node setup right?
15:50:58 rossella_s++
15:51:00 rossella_s: Agreed ++
15:51:44 clearing these patches in progress, could clear the way for what's next TBD
15:51:50 korzen: probably yes. controller (running l3 agent) + networking node (the agent too) + compute
15:52:32 currently there are 2 nodes used
15:52:55 it would be hard work to setup 3 nodes
15:53:10 first we should discuss whether it's what we want, then we may look into what's missing on devstack-gate side
15:53:18 mostly in devstack-gate
15:53:19 I believe devstack-gate supports more nodes
15:53:31 I saw its code referring to subnode arrays
15:54:11 I am sure there are devils in details though
15:54:29 but configuration is written up to handle controller and subnode
15:54:47 it would require more 'if's
15:54:54 to add third node :P
15:55:05 more spaghetti, yes. as if devstack-gate does not have enough of it.
15:55:12 I am curious, why would three nodes be useful?
15:55:12 is someone already working on the Port security?
15:55:34 since we decided to brush up all the ovo patches.. I am not sure we have a patch for that
15:56:09 slunkad, enriquetaso is working on it
15:56:14 johnthetubaguy, the third node is useful when you are testing multiple L3 agents
15:56:14 johnthetubaguy: which nodes are going to run l3 agent? or do you suggest we should land l3 agent on compute node to save a node?
15:56:23 oh awesome enriquetaso
15:56:28 :)
15:57:02 I guess I was thinking run that on one of the existing nodes, yeah
15:57:03 btw enriquetaso is an applicant for this round of the Outreachy internship
15:57:22 for those who don't know :)
15:57:33 ihrachys: so yeah, +1 your suggestion, but that might complicate some of the routes, etc, I guess, so that might be a bad idea
15:57:36 enriquetaso: I'm working on it haha
15:57:49 enriquetaso: great!
15:58:38 johnthetubaguy: we may think indeed about landing it on compute. as long as other compute services are running old code, that can give us proper coverage.
15:58:41 need to think about it.
15:59:10 ok folks, we need to wrap up. please make sure all OVO patches are in good shape for the next week sprint, and keep up the good work!
15:59:19 ihrachys: ack
15:59:21 thanks ihrachys
15:59:25 yep! bye :)
15:59:29 thanks, bye
15:59:32 thanks ihrachys
15:59:33 thank you
15:59:37 btw I will try to arrange a video conf option for the code sprint, will post details later.
15:59:37 thanks, bye
15:59:41 #endmeeting