16:00:33 <ihrachys> #startmeeting neutron_ci
16:00:34 <openstack> Meeting started Tue Oct 3 16:00:33 2017 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:37 <openstack> The meeting name has been set to 'neutron_ci'
16:01:39 <mlavalle> o/
16:01:44 <ihrachys> before we start, I'd like to mention that I was not very attentive to upstream fallout lately so I may miss crucial things. if so, speak up.
16:01:52 <ihrachys> #topic Action items from prev week
16:02:20 <ihrachys> we had two items for the same
16:02:28 <ihrachys> "ihrachys to report bug for iptables apply failure" and "jlibosva to triage iptables apply failure in linuxbridge scenarios job"
16:02:39 <ihrachys> I am afraid I haven't done the job, but let me check
16:02:49 <jlibosva> I haven't found time to look at it
16:02:59 <ihrachys> oh I actually did, wow https://bugs.launchpad.net/neutron/+bug/1719711
16:03:01 <openstack> Launchpad bug 1719711 in neutron "iptables failed to apply when binding a port with AGENT.debug_iptables_rules enabled" [High,Confirmed]
16:03:17 <ihrachys> my memory bank is not long enough it seems.
16:03:45 <ihrachys> jlibosva, will you? or we should find someone else?
16:04:26 <jlibosva> I'd love to look at it but I was busy with some other things lately ..
16:04:42 <jlibosva> I was also two days off last week so that's my excuse :)
16:04:53 <haleyb> ihrachys: i can look, just didn't have time this past week, but since i have another iptables issue on my plate i will have that part of my brain swapped-in
16:04:58 <ihrachys> we are not to blame here :)
16:05:13 <ihrachys> jlibosva, sounds fair if we pass the cake to haleyb ?
16:05:19 <jlibosva> sure
16:05:23 <jlibosva> thank haleyb :)
16:05:25 <jlibosva> s
16:05:35 <ihrachys> ok, assigned to haleyb
16:05:38 <ihrachys> haleyb++
16:05:53 <ihrachys> #topic Grafana
16:06:05 <ihrachys> grafana is dead: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:06:10 <ihrachys> no data points
16:06:15 <ihrachys> probably a fallout of zuulv3 switch
16:06:30 <ihrachys> haleyb, I remember you asked about it. was there any progress after that to get it back?
16:06:38 <ihrachys> any patches to chew?
16:07:17 <haleyb> ihrachys: no, that was near end of day here, but i could take a look. this zuulv3 change was not as clean as i expected
16:07:36 <ihrachys> haleyb, I suspect it's because job names changed
16:07:44 <ihrachys> maybe our board was not updated with new
16:07:54 <haleyb> ihrachys: yes, there's a lot of legacy-* now
16:08:20 <ihrachys> yeah, no 'legacy' matches in grafana/neutron.yaml
16:08:30 <ihrachys> #action haleyb to update grafana board with new job names
16:08:48 <ihrachys> I hope this part of the repo is still fresh and we don't need to learn more new ways
16:09:00 <haleyb> who is going to update all the jobs... :(
16:09:18 <ihrachys> they can live as legacy for a while
16:09:37 <ihrachys> the main problem is that we now are on the hook to migrate them if we need improvements/new jobs...
16:09:47 <ihrachys> a lot of patches were caught in flight
16:10:05 <ihrachys> mlavalle, are you aware of anyone working on migration to new job format?
16:10:28 <mlavalle> no
16:10:36 <boden> I might wait to see what happens with zuul v3 before doing too much work.. based on the latest ML thread there are a lot of problems and folks are starting to talk about a revert if we can’t get the gates healthy with v3
16:10:58 <mlavalle> I pinged back the person who asked about it on Friday
16:11:06 <mlavalle> but didn't get back to me
16:11:06 <ihrachys> boden, wow that's harsh
16:11:43 <boden> http://lists.openstack.org/pipermail/openstack-dev/2017-October/123022.html
16:11:52 <boden> well it has been pretty disruptive
16:12:09 <boden> neutron-lib gate is on the floor and there are also issues with neutron gate
16:13:22 <ihrachys> yeah, I see patches falling with POST_FAILUREs
16:13:29 <ihrachys> I thought we were past that?
16:13:34 <boden> there are other problems
16:13:37 <ihrachys> apparently more bits of the puzzle were deployed
16:14:02 <ihrachys> boden, do we have a list of grievances on our side?
16:14:16 <boden> I’ve been adding them here https://etherpad.openstack.org/p/zuulv3-migration-faq
16:14:44 <boden> so far for neutron I only noticed the legacy releasenotes job busted… but its busted across the board best I can tell
16:14:50 <boden> neutron-lib is diff story
16:15:27 <armax> do we have a list of neutron related jobs that are known to be unstable/broken?
16:16:16 <boden> the only list I have is that faq… but its hard to tell right now b/c there are random POST_FAILURES that are not related to gate “job logic” best I can tell
16:16:42 <armax> shall we attempt to focus on one pipeline at the time? there might be common problems and once we identified those it’s easier to do a sweep across the board?
16:17:01 <ihrachys> yeah, that would be nice to have something neutron specific. I think we could do a quick triage of failures in our gates based on late patches and have a list that we would then run against what's in faq, and if smth is not there, escalate it to infra
16:17:01 <armax> perhaps the neutron-lib pipeline is easier to bring back to sanity?
16:17:20 <boden> armax: TBH I think it has more problems than neutron
16:17:21 <armax> then go to neutron and the other networking-* projects?
16:17:27 <armax> boden: even better :)
16:17:36 <boden> I’d say get neutron working 1st
16:17:46 <ihrachys> ok
16:17:47 <boden> I know for sure the legacy releasenotes is busted
16:17:49 <ihrachys> let's start the pad
16:17:57 <ihrachys> #link https://etherpad.openstack.org/p/neutron-zuulv3-grievances Etherpad for zuulv3 grievances
16:18:20 <boden> why not just add to https://etherpad.openstack.org/p/zuulv3-migration-faq so everyone knows the issues
16:18:26 <boden> other people might have similar problems
16:18:35 <boden> other people = other projects
16:18:47 <ihrachys> I think it makes sense for ourselves to understand what's the fallout and then cross-match with what they already track
16:18:55 <boden> cool
16:18:57 <mlavalle> yeah
16:19:03 <ihrachys> I don't intend to have it forever, just to classify and pass over
16:19:43 <ihrachys> we have neutron, -lib, and client + stable branches to classify
16:20:08 <armax> what about we have a liaison on each of these areas, tasked to report a status of the gate by EOB?
16:20:12 <boden> FYI: I did land this patch to try and fix a lib gate job: https://review.openstack.org/#/c/508945/
16:20:16 <ihrachys> we could split those right now and work the next 24h on getting the full picture?
16:20:23 <armax> then perhaps we can have a ad hoc sync-up tomorrow to see where we are?
16:20:26 <ihrachys> armax, yeah +
16:20:41 <armax> I can take neutron-lib
16:20:54 <ihrachys> I take stable
16:21:07 <ihrachys> all of them
16:21:12 <armax> OK
16:21:30 <mlavalle> I'll take Neutron
16:21:31 <ihrachys> who's on neutron?
16:21:34 <ihrachys> great
16:21:37 <armax> who wants to take the neutron- and networking- ones?
16:21:46 <ihrachys> team team team team. if you can't work as a team...
16:22:49 <ihrachys> mlavalle, I put your name in the pad
16:22:54 <boden> I dont mind to help, problem is I’m nearly gate illiterate
16:22:55 <mlavalle> ++
16:22:56 <ihrachys> more volunteers for the rest?
16:23:31 <haleyb> i can look at client
16:23:33 <armax> I’ll have a look at the periodic runs I have time
16:23:41 <armax> haleyb: take it off of me then
16:23:43 <ihrachys> haleyb, check the list of not assigned in the pad
16:23:59 <armax> I added myself but happy to hand it over :)
16:24:12 * haleyb takes a step back :)
16:24:34 <armax> chicken
16:24:38 <armax> :)
16:24:42 <ihrachys> ok. if nothing else, I am not too nervous about networking- / neutron- / periodic at this point
16:24:53 <ihrachys> it's on subteams (except periodic that doesn't block)
16:25:03 <armax> so for now we don’t worry about the non-voting jobs, right?
16:25:06 <ihrachys> thanks for everyone who is not a chicken
16:25:08 <ihrachys> :)
16:25:19 <ihrachys> armax, yeah, goal is unblock gate
16:25:24 * mlavalle is a chicken but is trying to hide it
16:25:29 <ihrachys> we will review the gate in a week for others
16:25:52 <armax> I think let’s try to figure out the grafana black hole asap
16:25:57 <armax> we can’t fix what we can’t see
16:25:58 <haleyb> we're more like squirrels who all got run over by the zuulv3 bus, and it's backing-up now
16:26:25 <haleyb> armax: i was going to look at grafana, most likely just needs update to job names
16:26:43 <armax> haleyb: true, but we’ll have to move away from the legacy- prefix sooner or later
16:26:59 <ihrachys> that's after we are back on our legs
16:27:00 * jlibosva is also chicken
16:27:02 <armax> unless we want to create a legacy dashboard
16:27:20 <armax> either way
16:27:32 <haleyb> armax: i hope we don't have to do that
16:27:41 <ihrachys> I don't think infra in a position to force the switch through further
16:27:47 <armax> great, I get a status.json: Proxy Error when looking at http://zuulv3.openstack.org/
16:28:09 <armax> ihrachys: agreed, but I loathe to see ‘legacy’ everywhere :)
16:28:12 <ihrachys> haleyb, I don't think that's realistic hope
16:28:35 <haleyb> ihrachys: you mean not having a legacy dash?
16:28:40 <armax> I hope the translation is pretty straightforward
16:28:43 <armax> yeah
16:28:43 <ihrachys> haleyb, not having to do it
16:28:54 <ihrachys> armax, it's switch to ansible man
16:29:07 <ihrachys> well, we could call a bash script from there I guess?
16:29:22 <armax> ihrachys: rock. on.
16:29:27 <haleyb> i will see what other dashboard changes landed recently i guess
16:29:41 <ihrachys> would probably make sense to move definitions as scripts into neutron tree; then work on ansible switch as needed.
16:30:10 <ihrachys> so excited of all the productive work we are about to do
16:30:15 <ihrachys> meh
16:31:04 <ihrachys> I don't think we have anything to discuss for grafana or gate without data, so let's move on
16:31:06 <ihrachys> #topic Bugs
16:31:16 <ihrachys> https://bugs.launchpad.net/neutron/+bugs?field.tag=gate-failure
16:31:44 <ihrachys> I don't see anything new in the list except the iptables apply issue that haleyb will look at
16:31:55 <ihrachys> so we can focus on gate for the most part.
16:32:04 <mlavalle> great
16:32:33 <ihrachys> #topic Fullstack
16:32:56 <ihrachys> I am not sure there is a lot of reason to discuss fullstack or scenarios at the point where we are right now. thoughts?
16:33:09 <mlavalle> agree
16:33:27 <ihrachys> ok
16:33:32 <armax> it’s like discussing that the house is dirty when the roof is on fire
16:33:39 <ihrachys> #topic Open discussion
16:33:54 <ihrachys> anything critical that is more important than putting the fire off to discuss?
16:34:07 <armax> back to the zuulv3 topic any idea what to do with node_failure errors?
16:34:20 <armax> it seems like infra is fighting stability issues of their own?
16:34:45 <boden> armax; yes, that’s what I said earlier… the infra isn’t even stable enough to test the gate jobs now
16:35:11 <ihrachys> armax, maybe ask them about whether they have it fixed, or when it's going to be fixed
16:35:15 <mlavalle> so maybe talking to them, to get a sense as to where they stand
16:35:34 <ihrachys> maybe they will tell us straight they revert and we don't need to do the work in the first place :)
16:35:36 <armax> OK, I am going to learn a bit about this today so that I can ask intelligent questions
16:36:02 <mlavalle> ihrachys: that's a nice dream
16:36:21 <ihrachys> mlavalle, not really, only means you will go through another round of pain in the future
16:36:32 <mlavalle> that's true
16:36:36 <ihrachys> ok, thanks everyone, let's follow up with infra, and classify. that will be something already.
16:36:39 <ihrachys> keep up
16:36:43 <ihrachys> #endmeeting