16:00:34 #startmeeting neutron_ci
16:00:34 Meeting started Tue Aug 28 16:00:34 2018 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:35 liuyulong: thanks for attending. Have a great night!
16:00:36 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:38 hello again :)
16:00:39 o/
16:00:39 The meeting name has been set to 'neutron_ci'
16:00:41 o/
16:00:58 last leg of "Meetings Tuesday"
16:01:00 prior o/ was for CI .. lol
16:01:17 :)
16:01:28 short info
16:01:38 hi
16:01:40 I need to finish this meeting in 45 minutes
16:02:00 so mlavalle You will continue it after that or we will finish 15 minutes earlier today
16:02:07 ok for You?
16:02:17 let's finish 15 min earlier
16:02:27 fine :)
16:02:31 #topic Actions from previous meetings
16:02:37 * njohnston to tweak stable branches dashboards
16:02:45 https://review.openstack.org/#/c/597168/
16:03:00 this brings the stable dashboards in line with your reformat of the main one
16:03:17 but with jobs dropped if they don't exist for the stable branches, like all the scenario jobs
16:03:22 njohnston: thx, I will review it soon
16:03:40 and I bump the versions, so old dashboard == stable/rocky and older == stable/queens now
16:04:17 that's good :)
16:04:31 thx njohnston
16:04:39 ok, next one was:
16:04:41 * slaweq to update grafana dashboard to ((FAILURE + TIME_OUT) / (FAILURE + TIME_OUT + SUCCESS))
16:04:46 Patch: https://review.openstack.org/595763
16:04:46 there were 9 jobs I excluded from the file because Graphite has no record of them for stable branches.
Not sure if that is interesting info, but I have the list if you like
16:05:12 sure, we can check this list together
16:06:35 I put the list in a comment on https://review.openstack.org/#/c/597168
16:07:09 ok, thx
16:07:17 I will check it also
16:07:31 thx for working on this njohnston
16:07:37 np
16:08:13 so getting back to next action, which was: * slaweq to update grafana dashboard to ((FAILURE + TIME_OUT) / (FAILURE + TIME_OUT + SUCCESS))
16:08:31 I sent patch, frickler found one issue there so I need to fix it
16:08:51 moving on to next action
16:08:56 * mlavalle to check neutron_tempest_plugin.scenario.test_dns_integration.DNSIntegrationTests.test_server_with_fip issue
16:09:03 I did
16:09:46 I found that the Nova API never reports that the instance got ip addresses in port
16:10:09 Neutron communicates the vif plugged in event correctly to compute
16:10:21 but I wonder why it happens so often in this job recently and not in another
16:10:29 or maybe You spotted it in different jobs also?
16:10:34 no
16:10:40 that is a good question
16:10:50 I will take a look again with other case
16:11:21 I left a good kibana query in the bug
16:11:27 so it is easy to find other cases
16:11:28 I remember from my previous company that we had such issues somewhere around Juno IIRC
16:11:55 and we even patched nova-compute to check status of port directly in neutron before setting instance to ERROR
16:12:10 but later I never saw it in newer versions
16:12:31 maybe in this scenario there is different rabbitmq used or something like that
16:12:32 I'll check another case
16:12:46 mlavalle: ok, thx
16:13:08 #action mlavalle to check other cases of failing neutron_tempest_plugin.scenario.test_dns_integration.DNSIntegrationTests.test_server_with_fip test
16:13:55 ok, that's all for actions from previous week
16:14:03 #topic Grafana
16:14:08 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
16:15:43 do You see anything You want to talk about first?
16:15:55 I'll follow your lead
16:16:03 there is again problem with neutron-tempest-plugin-dvr-multinode-scenario
16:16:13 it's 100% failing since few days
16:16:26 I found that it's always issue with neutron_tempest_plugin.scenario.test_migration.NetworkMigrationFromHA
16:16:34 like: * http://logs.openstack.org/37/382037/71/check/neutron-tempest-plugin-dvr-multinode-scenario/605ed17/logs/testr_results.html.gz
16:16:59 I reported bug for that https://bugs.launchpad.net/neutron/+bug/1789434 today
16:17:00 Launchpad bug 1789434 in neutron "neutron_tempest_plugin.scenario.test_migration.NetworkMigrationFromHA failing 100% times" [High,Confirmed]
16:17:33 it looks like related somehow to my patch: https://review.openstack.org/#/c/589410/
16:17:57 but this test was fine on this patch, so something happened later probably that it's failing now
16:18:48 any volunteer to check that?
16:18:53 o/
16:19:10 thx mlavalle :)
16:19:29 <5 seconds to volunteer :)
16:19:42 haleyb: do you want it?
16:19:51 #action mlavalle to check failing router migration from DVR tests
16:20:12 should I assign it to haleyb?
16:20:23 not unless he explicitly wants it
16:20:32 otherwise, I take it
16:20:38 mlavalle: no, feel free
16:20:47 haleyb: ack
16:20:50 ok, so we have the winner :)
16:20:52 thx mlavalle
16:20:57 yaaay!!!!
16:21:02 I won!!!!
16:21:02 LOL
16:21:53 ok, so let's continue about scenario jobs then
16:22:20 other scenario job which is failing quite often is this designate job which we already talked about
16:22:39 and I also found couple of times timeouts in neutron-tempest-plugin-scenario-linuxbridge
16:22:47 * http://logs.openstack.org/59/596959/1/check/neutron-tempest-plugin-scenario-linuxbridge/f62f1c6/job-output.txt.gz
16:22:49 * http://logs.openstack.org/18/591818/3/check/neutron-tempest-plugin-scenario-linuxbridge/212183e/job-output.txt.gz
16:22:51 * http://logs.openstack.org/34/596634/1/check/neutron-tempest-plugin-scenario-linuxbridge/098b6f3/job-output.txt.gz
16:26:09 there is virt_type=kvm set but it shouldn't be problem if it's supported by host
16:26:50 is there any volunteer to check why there are such timeouts?
16:27:06 if no, I will report it as a bug and take a look when I will have some time
16:27:14 but currently I'm quite overloaded :/
16:28:40 I am also a bit overloaded
16:29:03 if nobody else steps up and you are patient with me, then sign me up
16:29:11 mlavalle: thx
16:29:23 I might not get to it until next week
16:29:30 I will report it as a bug for now and we will see who will have some cycles to check that
16:29:35 fine for You?
16:30:00 yes, report as bug and i can look at logs at least
16:30:09 #action slaweq to report a bug about timeouts in neutron-tempest-plugin-scenario-linuxbridge
16:30:12 haleyb: thx
16:30:30 I will send link to bug report when I will report it
16:30:34 or just recheck, recheck, recheck...
16:30:53 yes, for now it's kind of workaround but...
:)
16:31:04 sounds good slaweq
16:31:16 thx guys
16:31:21 ok
16:31:36 from scenario jobs those were the most frequent failures which I found last week
16:31:54 so let's move on
16:31:59 #topic functional
16:32:19 FYI: We have fixes almost merged for failing functional tests in stable branches, bug: https://bugs.launchpad.net/neutron/+bug/1788185
16:32:19 Launchpad bug 1788185 in neutron "[Stable/Queens] Functional tests neutron.tests.functional.agent.l3.test_ha_router failing 100% times " [Critical,Confirmed] - Assigned to Miguel Lavalle (minsel)
16:32:42 basically this issue is caused by keepalived in version which is now in Xenial repo
16:33:29 I was talking today with frickler and coreycb about it
16:33:47 I will try to prepare some small reproducer and add it to keepalived bug report
16:34:00 but we should be good using older version of keepalived for now
16:34:14 ok
16:34:28 we merged the patches yesterday, right?
16:34:42 for Queens it's merged
16:34:51 for Pike and Ocata I have to recheck it
16:35:08 ok
16:35:40 other issue with functional tests is that we still hit from time to time: https://bugs.launchpad.net/neutron/+bug/1784836
16:35:40 Launchpad bug 1784836 in neutron "Functional tests from neutron.tests.functional.db.migrations fails randomly" [Medium,Confirmed]
16:35:48 e.g. in http://logs.openstack.org/18/591818/3/check/neutron-functional/ddb3327/logs/testr_results.html.gz
16:36:18 it's not very often but maybe someone with good db experience could take a look at it
16:37:03 mlavalle: do You know who we can potentially ask to look at this one?
16:37:20 I would ask Mike Bayer
16:37:28 thx mlavalle
16:37:55 #action mlavalle to ask Mike Bayer about functional db migration tests failures
16:38:24 anything else to add related to functional tests?
16:38:30 not from me
16:39:00 ok, so let's move to next topic then
16:39:05 #topic Fullstack
16:39:21 speaking about fullstack, I have only one thing
16:39:31 we still have quite lot of failures but all (or almost) are caused by this https://bugs.launchpad.net/neutron/+bug/1779328 and I still don’t know why it happens
16:39:31 Launchpad bug 1779328 in neutron "Fullstack tests neutron.tests.fullstack.test_securitygroup.TestSecurityGroupsSameNetwork fails" [High,In progress] - Assigned to Slawek Kaplonski (slaweq)
16:39:56 maybe we should mark this test as unstable again to make our life easier?
16:40:05 what do You think about it?
16:40:35 yes, let's do it
16:40:46 ok, I will do it then
16:40:51 do we have a rule in elastic-recheck for that failure?
16:41:10 #action slaweq to mark fullstack security group test as unstable again
16:41:15 njohnston: I don't think so
16:42:21 njohnston: do You think we should add it there also?
16:42:38 if we mark it as unstable, it will be skipped simply instead of failing
16:42:44 yeah
16:43:41 I never think it's a bad idea to look at elastic-recheck rules, but then again it is hard to find cores to +2 them these days :-)
16:44:08 yeah, so maybe let's just mark it in our repo as unstable for now
16:44:19 ok, that was all what I had for today
16:44:30 and I need to end this meeting right now :)
16:44:35 perfect timing
16:44:38 Thanks!
16:44:41 thx guys for attending
16:44:45 thanks!
16:44:46 and see You next week
16:44:47 o/
16:44:51 #endmeeting
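
Note on "mark it as unstable": the log says a test marked this way "will be skipped simply instead of failing". A minimal sketch of that idea follows, using only Python's standard `unittest`. It is an illustration under assumptions, not the code from the neutron repo: the decorator name `unstable_test`, the class `ExampleSecurityGroupTest`, and the helper `run_example` are all hypothetical.

```python
import functools
import unittest


def unstable_test(reason):
    """Hypothetical decorator: report any failure of the wrapped test as a skip."""
    def decorator(test_method):
        @functools.wraps(test_method)
        def wrapper(self, *args, **kwargs):
            try:
                return test_method(self, *args, **kwargs)
            except unittest.SkipTest:
                # A test that skips itself stays a skip.
                raise
            except Exception as exc:
                # Convert the failure into a skip and keep the original error text.
                self.skipTest("marked unstable (%s); failure was: %s" % (reason, exc))
        return wrapper
    return decorator


class ExampleSecurityGroupTest(unittest.TestCase):
    """Hypothetical test case standing in for a flaky fullstack test."""

    @unstable_test("bug 1779328")
    def test_flaky(self):
        self.assertEqual(1, 2)  # stands in for an intermittent failure

    def test_stable(self):
        self.assertEqual(1, 1)


def run_example():
    """Run the example tests and return the unittest.TestResult."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleSecurityGroupTest)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

With this sketch the flaky test shows up in `result.skipped` rather than `result.failures`, so the job stays green while the skip message still records the bug reference and the underlying error for later debugging.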