16:00:33 #startmeeting neutron_ci
16:00:34 Meeting started Tue Oct 3 16:00:33 2017 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:35 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:37 The meeting name has been set to 'neutron_ci'
16:01:39 o/
16:01:44 before we start, I'd like to mention that I was not very attentive to upstream fallout lately so I may miss crucial things. if so, speak up.
16:01:52 #topic Action items from prev week
16:02:20 we had two items for the same issue
16:02:28 "ihrachys to report bug for iptables apply failure" and "jlibosva to triage iptables apply failure in linuxbridge scenarios job"
16:02:39 I am afraid I haven't done the job, but let me check
16:02:49 I haven't found time to look at it
16:02:59 oh I actually did, wow https://bugs.launchpad.net/neutron/+bug/1719711
16:03:01 Launchpad bug 1719711 in neutron "iptables failed to apply when binding a port with AGENT.debug_iptables_rules enabled" [High,Confirmed]
16:03:17 my memory bank is not long enough it seems.
16:03:45 jlibosva, will you? or should we find someone else?
16:04:26 I'd love to look at it but I was busy with some other things lately...
16:04:42 I was also two days off last week so that's my excuse :)
16:04:53 ihrachys: i can look, just didn't have time this past week, but since i have another iptables issue on my plate i will have that part of my brain swapped-in
16:04:58 we are not to blame here :)
16:05:13 jlibosva, sounds fair if we pass the cake to haleyb?
16:05:19 sure
16:05:23 thanks haleyb :)
16:05:35 ok, assigned to haleyb
16:05:38 haleyb++
16:05:53 #topic Grafana
16:06:05 grafana is dead: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:06:10 no data points
16:06:15 probably a fallout of the zuulv3 switch
16:06:30 haleyb, I remember you asked about it. was there any progress after that to get it back?
16:06:38 any patches to chew?
16:07:17 ihrachys: no, that was near end of day here, but i could take a look. this zuulv3 change was not as clean as i expected
16:07:36 haleyb, I suspect it's because job names changed
16:07:44 maybe our board was not updated with the new names
16:07:54 ihrachys: yes, there's a lot of legacy-* now
16:08:20 yeah, no 'legacy' matches in grafana/neutron.yaml
16:08:30 #action haleyb to update grafana board with new job names
16:08:48 I hope this part of the repo is still fresh and we don't need to learn more new ways
16:09:00 who is going to update all the jobs... :(
16:09:18 they can live as legacy for a while
16:09:37 the main problem is that we are now on the hook to migrate them if we need improvements/new jobs...
16:09:47 a lot of patches were caught in flight
16:10:05 mlavalle, are you aware of anyone working on migration to the new job format?
16:10:28 no
16:10:36 I might wait to see what happens with zuul v3 before doing too much work... based on the latest ML thread there are a lot of problems and folks are starting to talk about a revert if we can't get the gates healthy with v3
16:10:58 I pinged back the person who asked about it on Friday
16:11:06 but they didn't get back to me
16:11:06 boden, wow that's harsh
16:11:43 http://lists.openstack.org/pipermail/openstack-dev/2017-October/123022.html
16:11:52 well it has been pretty disruptive
16:12:09 neutron-lib gate is on the floor and there are also issues with neutron gate
16:13:22 yeah, I see patches failing with POST_FAILUREs
16:13:29 I thought we were past that?
16:13:34 there are other problems
16:13:37 apparently more bits of the puzzle were deployed
16:14:02 boden, do we have a list of grievances on our side?
16:14:16 I've been adding them here https://etherpad.openstack.org/p/zuulv3-migration-faq
16:14:44 so far for neutron I only noticed the legacy releasenotes job busted... but it's busted across the board best I can tell
16:14:50 neutron-lib is a different story
16:15:27 do we have a list of neutron related jobs that are known to be unstable/broken?
16:16:16 the only list I have is that faq... but it's hard to tell right now b/c there are random POST_FAILUREs that are not related to gate "job logic" best I can tell
16:16:42 shall we attempt to focus on one pipeline at a time? there might be common problems and once we identify those it's easier to do a sweep across the board
16:17:01 yeah, it would be nice to have something neutron specific. I think we could do a quick triage of failures in our gates based on recent patches and have a list that we would then run against what's in the faq, and if smth is not there, escalate it to infra
16:17:01 perhaps the neutron-lib pipeline is easier to bring back to sanity?
16:17:20 armax: TBH I think it has more problems than neutron
16:17:21 then go to neutron and the other networking-* projects?
16:17:27 boden: even better :)
16:17:36 I'd say get neutron working 1st
16:17:46 ok
16:17:47 I know for sure the legacy releasenotes job is busted
16:17:49 let's start the pad
16:17:57 #link https://etherpad.openstack.org/p/neutron-zuulv3-grievances Etherpad for zuulv3 grievances
16:18:20 why not just add to https://etherpad.openstack.org/p/zuulv3-migration-faq so everyone knows the issues
16:18:26 other people might have similar problems
16:18:35 other people = other projects
16:18:47 I think it makes sense for ourselves to understand what's the fallout and then cross-match with what they already track
16:18:55 cool
16:18:57 yeah
16:19:03 I don't intend to have it forever, just to classify and pass over
16:19:43 we have neutron, -lib, and client + stable branches to classify
16:20:08 what about having a liaison for each of these areas, tasked to report the status of the gate by EOB?
16:20:12 FYI: I did land this patch to try and fix a lib gate job: https://review.openstack.org/#/c/508945/
16:20:16 we could split those right now and work the next 24h on getting the full picture?
16:20:23 then perhaps we can have an ad hoc sync-up tomorrow to see where we are?
16:20:26 armax, yeah +
16:20:41 I can take neutron-lib
16:20:54 I take stable
16:21:07 all of them
16:21:12 OK
16:21:30 I'll take Neutron
16:21:31 who's on neutron?
16:21:34 great
16:21:37 who wants to take the neutron- and networking- ones?
16:21:46 team team team team. if you can't work as a team...
16:22:49 mlavalle, I put your name in the pad
16:22:54 I don't mind helping, problem is I'm nearly gate illiterate
16:22:55 ++
16:22:56 more volunteers for the rest?
16:23:31 i can look at client
16:23:33 I'll have a look at the periodic runs if I have time
16:23:41 haleyb: take it off of me then
16:23:43 haleyb, check the list of unassigned items in the pad
16:23:59 I added myself but happy to hand it over :)
16:24:12 * haleyb takes a step back :)
16:24:34 chicken
16:24:38 :)
16:24:42 ok. if nothing else, I am not too nervous about networking- / neutron- / periodic at this point
16:24:53 it's on subteams (except periodic, which doesn't block)
16:25:03 so for now we don't worry about the non-voting jobs, right?
16:25:06 thanks to everyone who is not a chicken
16:25:08 :)
16:25:19 armax, yeah, the goal is to unblock the gate
16:25:24 * mlavalle is a chicken but is trying to hide it
16:25:29 we will review the gate in a week for the others
16:25:52 I think let's try to figure out the grafana black hole asap
16:25:57 we can't fix what we can't see
16:25:58 we're more like squirrels who all got run over by the zuulv3 bus, and it's backing up now
16:26:25 armax: i was going to look at grafana, most likely it just needs an update to the job names
16:26:43 haleyb: true, but we'll have to move away from the legacy- prefix sooner or later
16:26:59 that's after we are back on our legs
16:27:00 * jlibosva is also a chicken
16:27:02 unless we want to create a legacy dashboard
16:27:20 either way
16:27:32 armax: i hope we don't have to do that
16:27:41 I don't think infra is in a position to force the switch through further
16:27:47 great, I get a status.json: Proxy Error when looking at http://zuulv3.openstack.org/
16:28:09 ihrachys: agreed, but I'm loath to see 'legacy' everywhere :)
16:28:12 haleyb, I don't think that's a realistic hope
16:28:35 ihrachys: you mean not having a legacy dash?
16:28:40 I hope the translation is pretty straightforward
16:28:43 yeah
16:28:43 haleyb, not having to do it
16:28:54 armax, it's a switch to ansible, man
16:29:07 well, we could call a bash script from there I guess?
16:29:22 ihrachys: rock. on.
16:29:27 i will see what other dashboard changes landed recently i guess
16:29:41 would probably make sense to move definitions as scripts into the neutron tree; then work on the ansible switch as needed.
16:30:10 so excited about all the productive work we are about to do
16:30:15 meh
16:31:04 I don't think we have anything to discuss for grafana or gate without data, so let's move on
16:31:06 #topic Bugs
16:31:16 https://bugs.launchpad.net/neutron/+bugs?field.tag=gate-failure
16:31:44 I don't see anything new in the list except the iptables apply issue that haleyb will look at
16:31:55 so we can focus on the gate for the most part.
16:32:04 great
16:32:33 #topic Fullstack
16:32:56 I am not sure there is a lot of reason to discuss fullstack or scenarios at the point where we are right now. thoughts?
16:33:09 agree
16:33:27 ok
16:33:32 it's like discussing that the house is dirty when the roof is on fire
16:33:39 #topic Open discussion
16:33:54 anything critical that is more important than putting the fire out to discuss?
16:34:07 back to the zuulv3 topic: any idea what to do with node_failure errors?
16:34:20 it seems like infra is fighting stability issues of their own?
16:34:45 armax: yes, that's what I said earlier... the infra isn't even stable enough to test the gate jobs now
16:35:11 armax, maybe ask them about whether they have it fixed, or when it's going to be fixed
16:35:15 so maybe talking to them, to get a sense as to where they stand
16:35:34 maybe they will tell us straight they revert and we don't need to do the work in the first place :)
16:35:36 OK, I am going to learn a bit about this today so that I can ask intelligent questions
16:36:02 ihrachys: that's a nice dream
16:36:21 mlavalle, not really, it only means you will go through another round of pain in the future
16:36:32 that's true
16:36:36 ok, thanks everyone, let's follow up with infra, and classify. that will be something already.
16:36:39 keep it up
16:36:43 #endmeeting
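
Editor's note: the "Bugs" topic above is driven by the Launchpad gate-failure tag (https://bugs.launchpad.net/neutron/+bugs?field.tag=gate-failure). Below is a minimal sketch of pulling that same list programmatically with launchpadlib, which can make weekly triage faster than clicking through the web UI. It was not discussed in the meeting; the consumer name "neutron-ci-triage" and the particular set of "open" statuses are assumptions for illustration only.

    # Minimal sketch: list open neutron bugs tagged "gate-failure" via launchpadlib.
    # Assumes anonymous read-only access is sufficient for triage purposes.
    from launchpadlib.launchpad import Launchpad

    def open_gate_failure_bugs():
        # "neutron-ci-triage" is an arbitrary consumer name, not an official one.
        lp = Launchpad.login_anonymously('neutron-ci-triage', 'production',
                                         version='devel')
        neutron = lp.projects['neutron']
        # Restrict to statuses commonly treated as "open"; adjust as needed.
        tasks = neutron.searchTasks(
            tags=['gate-failure'],
            status=['New', 'Confirmed', 'Triaged', 'In Progress'])
        for task in tasks:
            print('%s [%s] %s' % (task.web_link, task.importance,
                                  task.bug.title))

    if __name__ == '__main__':
        open_gate_failure_bugs()

Running it once before the meeting gives a plain-text list that can be pasted into the agenda etherpad and compared against the previous week's output.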