16:02:30 #startmeeting neutron_ci
16:02:31 Meeting started Tue Feb 6 16:02:30 2018 UTC and is due to finish in 60 minutes. The chair is ihrachys. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:02:32 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:02:35 The meeting name has been set to 'neutron_ci'
16:02:38 o/
16:02:43 o/
16:02:49 hi
16:03:16 thanks jlibosva for taking over the chair the prev meeting
16:03:23 afaiu Jakub won't join us today
16:03:23 ++
16:03:32 #topic Actions from prev meeting
16:03:37 "jlibosva to request a new release for ovsdbapp"
16:03:39 yes, he dropped off about 15 minutes ago
16:04:02 there is https://review.openstack.org/541056 and https://review.openstack.org/541112 in pipeline that should roll in new library into gates
16:04:03 patch 541056 - requirements - update constraint for ovsdbapp to new release 0.9.1
16:04:04 patch 541112 - requirements (stable/pike) - update constraint for ovsdbapp to new release 0.4.2
16:05:40 "mlavalle and haleyb to follow up on how we can move forward floating ip failures in dvr scenario"
16:06:25 haleyb is setting up an environment to replicate the issue.
16:06:50 since we marked the test as unstable
16:06:52 ihrachys: my only update is i have an environment but haven't replicated yet
16:09:57 ok. I guess there is not much sense to roll the AI over and over going forward. I trust you to beat it to death. :)
16:10:16 next was "jlibosva to report bug about scenario failure for test_snat_external_ip"
16:11:36 not sure it happened, can't find anything in gate-failure tagged bugs
16:13:19 I sent him an email about it, we'll see if he followed up on that
16:13:29 ++
16:13:36 ok that's all we had from prev meeting
16:13:44 #topic Grafana
16:13:48 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:14:12 before we walk through regular offenders, let's peek at periodics
16:14:39 mlavalle, afaiu https://review.openstack.org/#/c/539006 would move the jobs to neutron tree, then we would change names for jobs / clean up legacy jobs from infra repos?
16:14:40 patch 539006 - neutron - Move periodic jobs to Neutron repo
16:14:54 correct
16:15:16 there seems to be some syntax error or smth.
16:15:28 ok that's covered then
16:15:29 I introduced a syntax error yesterday. I'll fix it today
16:16:07 in the first revision we had a functional periodic job as well
16:16:20 but it seems it is the same as the normal functional
16:16:32 so I removed it yesterday
16:16:56 and when we add the other two to the periodic queue, we will also add the normal functional
16:19:04 ok. -pg- (postgres) job of those periodic jobs shows some level of instability
16:19:20 but it's not clear if it's totally broken, since some days are fine
16:20:27 I checked latest logs and it seems something is busted
16:20:27 http://logs.openstack.org/periodic/git.openstack.org/openstack/neutron/master/legacy-periodic-tempest-dsvm-neutron-pg-full/be7f73e/logs/devstacklog.txt.gz#_2018-02-06_06_24_58_783
16:20:33 "Rolling upgrades are currently supported only for MySQL and Sqlite"
16:20:41 and that's from glance section of devstack
16:21:29 I will report a bug for that and assign it to glance
16:21:41 #action ihrachys report bug for -pg- failure to glance
16:22:07 fullstack is unstable, that's expected
16:22:21 functional is a tad better and I expect it to become a lot better when new ovsdbapp rolls in
16:22:23 but it's better than last week at least
16:22:25 :)
16:22:33 (actually, it may also help fullstack)
16:22:44 slaweq, absolutely!
16:23:25 one thing of concern esp. close to release is that ovsfw job, though of course non-voting, now at 100%
16:23:40 anyone aware of the reason for breakage / bug to look at?
16:24:39 e.g. here: http://logs.openstack.org/54/537654/4/check/neutron-tempest-ovsfw/5c90b2b/logs/testr_results.html.gz IMO some error with SSH to vm
16:25:08 I can report a bug and try to check it this week if You want
16:25:16 *in this week
16:25:35 and based on what guests log, metadata is not accessible
16:26:54 http://logs.openstack.org/54/537654/4/check/neutron-tempest-ovsfw/5c90b2b/logs/screen-q-agt.txt.gz?level=ERROR
16:26:59 slaweq, well if you have cycles. jlibosva may be of help here.
16:27:07 yeah was going to post same link to agent log
16:27:12 it doesn't look healthy
16:27:18 there are some errors related to ovsfw but I don't know if it is a reason
16:27:24 ofport: -1 for VIF: bdfda7ad-c656-46a3-bd83-a17878346c35 is not a positive integer
16:28:14 slaweq, it's probably worth marking the bug as blocker for release. mlavalle what do you think?
16:28:44 this sea of red is not something we should release :)
16:28:51 yeah, I think it is a good idea
16:29:02 ihrachys: I have some cycles so I will check it ASAP
16:29:15 slaweq, great. make sure the bug is tagged for queens-rc1
16:29:23 sure
16:29:38 mlavalle, btw when is stable/queens cutoff / rc1?
16:29:53 rc1 is this coming Friday
16:30:02 I wonder if our release liaison started to walk through pre-release check list.
16:30:18 probably not a q for this venue though, but smth to follow up with armax if not already, mlavalle.
16:30:33 yeap
16:30:42 but he's been active
16:30:48 ok so ovsfw is covered too, great
16:31:11 #action slaweq to report bug for ovsfw job failure / sea of red in ovs agent logs
16:32:25 as for scenarios, linuxbridge is quite bad, but dvr one actually seems to mostly match trend spikes and dips of other tempest jobs, just a bit exaggerated
16:33:13 so there are probably some more issues somewhere there. we'll have a look right now on logs.
16:34:20 actually, dvr is not that much better, it's just some spike in linuxbridge lately that has not reflected on dvr, but the shape for older period is same.
16:34:45 #topic Scenarios
16:35:37 linuxbridge: http://logs.openstack.org/54/537654/4/check/neutron-tempest-plugin-scenario-linuxbridge/3272a19/logs/testr_results.html.gz
16:35:45 ssh to fip failed
16:37:04 there is some warning in q-agt logged over and over: http://logs.openstack.org/54/537654/4/check/neutron-tempest-plugin-scenario-linuxbridge/3272a19/logs/screen-q-agt.txt.gz?level=WARNING
16:37:18 haleyb, any idea where that one comes from and whether it's harmful?
16:37:53 based on table name, neutron-linuxbri-qos-o15fb2c, it's probably smth qos related
16:37:53 ihrachys: i don't know where it's coming from but don't think it's harmful
16:40:03 it could be something as simple as removing the rule with a different string than it was added
16:40:35 it's strange because AFAIR qos is adding rules to MANGLE table only (but I might be wrong)
16:40:48 right. but then wouldn't we e.g. potentially leave an old?
16:41:11 actually it might be MANGLE table and POSTROUTING chain there
16:43:07 here is another run: http://logs.openstack.org/68/540868/1/check/neutron-tempest-plugin-scenario-linuxbridge/01c4559/logs/testr_results.html.gz
16:43:25 also failing on ssh to fip, but different tests
16:43:40 looks like some random issues that are not specific to scenario, just ssh to fip
16:43:44 slaweq: maybe because the chain removal triggered the rule removal?
16:44:10 haleyb: maybe, I really don't know now
16:44:37 it's the postrouting rule in the mangle table
16:44:52 but a successful run has same messages: http://logs.openstack.org/57/523257/33/check/neutron-tempest-plugin-scenario-linuxbridge/35864c5/logs/screen-q-agt.txt.gz?level=WARNING so it's probably not the root cause
16:45:38 actually mangle table rule in the postrouting table, that's a mouthful
16:45:54 we will probably need to track through things like - whether FIP was reused; whether gARP was sent; whether ARP table was updated..
16:46:23 I will bite that one
16:46:39 #action ihrachys to look at linuxbridge scenario random failures when sshing to FIP
16:48:20 also, dvr scenarios fail from time to time
16:48:21 http://logs.openstack.org/57/523257/33/check/neutron-tempest-plugin-dvr-multinode-scenario/b907462/logs/testr_results.html.gz
16:48:28 similar symptoms actually
16:49:05 I think that it's the issue which haleyb and mlavalle was talking at the beginning (in "actions from previous meeting")
16:49:23 oh fip one?
16:49:30 but I might be wrong
16:49:37 I also see this: http://logs.openstack.org/57/523257/33/check/neutron-tempest-plugin-dvr-multinode-scenario/b907462/logs/screen-q-agt.txt.gz?level=WARNING#_Feb_03_18_50_36_283366 that seems to resemble ovsfw- job failure
16:50:35 but not sure, could as well be unrelated
16:50:57 the issue haleyb was trying to reproduce is for specific test cases that are disabled no?
16:51:18 yes
16:52:50 ok I am not sure what to do with this failure. maybe allow it to slip since haleyb is already busy with another issue for the job related to FIPs.
16:54:39 was that just a warning or failure?
16:54:56 well it fails test cases with ssh connection issues when used with FIP
16:54:57 oh, ERROR
16:55:03 http://logs.openstack.org/57/523257/33/check/neutron-tempest-plugin-dvr-multinode-scenario/b907462/logs/testr_results.html.gz
16:55:14 it broke lots of tests
16:55:18 ah, i saw WARNING in the url
16:55:44 well yeah WARNINGs are just things we look at in hope they reveal the cause. but it's a legit failure.
16:56:18 anyhow. I guess we will wait for more progress on that other FIP issue.
16:56:24 #topic Fullstack
16:56:29 not much time but let's peek
16:57:08 there is still one issue with security group tests: https://bugs.launchpad.net/neutron/+bug/1744402
16:57:09 Launchpad bug 1744402 in neutron "fullstack security groups test fails because ncat process don't starts" [High,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
16:57:18 http://logs.openstack.org/14/531414/1/check/neutron-fullstack/55e82ca/logs/testr_results.html.gz
16:57:48 slaweq, gotcha. fix here: https://review.openstack.org/#/c/541242/
16:57:48 patch 541242 - neutron - [Fullstack] Mark security group test as unstable
16:57:57 oh it's just mark as unstable
16:58:13 ihrachys: it's not a fix but related to this issue :)
16:58:27 ok
16:58:28 I didn't have time yet to debug it
16:59:09 btw, before we wrap up
16:59:25 I noticed a colleague from ironic posted this revert for ovsfw patch: https://review.openstack.org/#/c/541297/1
16:59:25 patch 541297 - neutron - DNM Test Revert "ovsfw: Don't create rules if upda...
16:59:33 maybe they are onto something
16:59:51 it would make sense to check with them what they try to do
17:00:27 ok time is out
17:00:35 thanks folks
17:00:37 #endmeeting