15:01:15 #startmeeting neutron_ci
15:01:16 Meeting started Wed Oct 7 15:01:15 2020 UTC and is due to finish in 60 minutes. The chair is slaweq_. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:01:17 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:01:18 hi
15:01:20 The meeting name has been set to 'neutron_ci'
15:01:23 Hi
15:01:45 o/
15:02:08 hi
15:02:32 Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:35 Please open now :)
15:03:27 #topic Actions from previous meetings
15:03:35 bcafarel to update our grafana dashboards for stable branches
15:04:09 in progress, not sent yet (I wanted to check jobs listed there)
15:04:18 ok, thx bcafarel
15:04:24 I will assign it to You for next week
15:04:30 just to remember about it
15:04:33 ok?
15:04:42 sounds good, also to have reviewers if it gets forgotten
15:04:46 #action bcafarel to update our grafana dashboards for stable branches
15:04:53 thx a lot
15:05:05 ok, next one
15:05:07 ralonsoh to report a bug and check failing openstack-tox-py36-with-ovsdbapp-master periodic job
15:05:29 I sent a patch to try to solve it
15:05:31 one sec
15:05:45 (should be on the etherpad)
15:06:30 I don't see it on the etherpad
15:06:35 https://review.opendev.org/#/c/755256/
15:06:57 avoid monkey patching processutils
15:07:21 well, use the original current_thread _active
15:07:35 but we'll need a new version of oslo.concurrency
15:07:51 and it seems that it helped
15:08:01 at least locally
15:08:12 but I can't say that about the CI
15:08:13 https://zuul.openstack.org/buildset/aa6cb9d44d1a49368494071338c7415e
15:08:16 :)
15:08:18 it helped
15:08:39 ahhh ok, this is another problem
15:08:41 sorry
15:09:01 #link https://review.opendev.org/#/c/749537/
15:09:04 this is the patch
15:09:06 sorry again
15:09:20 :)
15:09:24 no need to be sorry
15:09:29 good that it's fixed :)
15:09:34 thx ralonsoh
15:09:40 and thx otherwiseguy
15:10:19 ok, so I think we can move on to the next topics
15:10:22 #topic Switch to Ubuntu Focal
15:10:29 Etherpad: https://etherpad.opendev.org/p/neutron-victoria-switch_to_focal
15:10:40 we still have some stadium projects to check/change
15:10:49 but I didn't have time this week
15:10:57 do You have any other updates on that?
15:11:03 no
15:11:50 no
15:12:38 https://review.opendev.org/#/c/754068/ longing for a second +2 for sfc :)
15:13:02 else topic:migrate-to-focal list looks good for us
15:13:15 bcafarel: I already gave +2 :)
15:13:21 so I can't help with that one now
15:13:28 ralonsoh: lajoskatona but You can ;)
15:13:32 sure
15:13:43 done :-)
15:14:38 thx
15:14:51 thanks :)
15:15:03 I have a slightly related question: do we need this any more: https://review.opendev.org/755721 ?
15:15:37 lajoskatona: nope
15:15:45 it was an issue with the pypi mirror
15:16:08 slaweq_: yeah that's why I asked :-) I'll abandon it then
15:16:11 and I think ralonsoh fixed it on devstack by capping the setuptools version
15:16:21 but that was rejected
15:16:28 the problem was in the pypi server
15:16:35 ralonsoh: ahh, ok
15:16:39 admins talked to pypi folks to solve that
15:16:45 the most important thing is that the problem is fixed now :)
15:16:49 yes
15:16:53 thx ralonsoh and lajoskatona for taking care of it :)
15:18:05 ok
15:18:06 no problem
15:18:11 regarding standardizing on zuul v3
15:18:37 we merged the networking-odl patch https://review.opendev.org/#/c/725647/
15:18:50 so the last one missing is https://review.opendev.org/#/c/729591/ for neutron
15:19:30 and it just failed again, at least the functional tests job: https://40f71fdb4a17c8b8e33a-40a7733116b3138073a0fe5a58665a17.ssl.cf5.rackcdn.com/729591/21/check/neutron-functional-with-uwsgi/aace04f/testr_results.html
15:19:31 which received its fair share of rechecks
15:19:33 :/
15:20:57 slaweq_, that's the other related problem I was talking about this morning
15:21:06 now we don't fail in the OVN method
15:21:17 but in the "old_method" --> L3 plugin
15:21:25 I need to check if this is related
15:21:40 I'll talk to otherwiseguy
15:21:49 ralonsoh: ok
15:22:01 please remember to vote also on the networking-odl backport for stable/victoria: https://review.opendev.org/#/c/756324/
15:22:38 tosky: I already did
15:22:45 I think we need bcafarel's vote also
15:23:07 yeah, another stable core
15:23:12 or neutron stable core
15:23:37 reviewed and W+1 :)
15:24:09 thx
15:24:47 so I think we can move on to the next topic now
15:24:50 #topic Stable branches
15:25:01 Ussuri dashboard: http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:25:04 Train dashboard: http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:25:48 one thing I remember now on stable dashboards, we will also need a victoria template for neutron-tempest-plugin
15:25:57 and switch neutron stable/victoria to it
15:26:13 bcafarel: yes, true
15:26:18 I will do this template
15:26:29 thx for the reminder
15:26:42 #action slaweq to make neutron-tempest-plugin victoria template
15:26:55 np, I remembered when my test dashboard came up empty for them
15:29:52 btw. I have one new issue in stable/train
15:29:54 https://bugs.launchpad.net/neutron/+bug/1898748
15:29:55 Launchpad bug 1898748 in neutron "[stable/train] Creation of the QoS policy takes ages" [Critical,New]
15:30:06 did You see it already maybe?
15:30:14 no
15:30:26 it seems that it breaks the devstack gate for stable/train :/
15:31:09 I don't think I saw it either
15:31:26 is there anyone who wants to check that maybe?
15:32:16 if not, I will try to check that
15:32:21 I'll try to take a look at this error tomorrow
15:32:29 thx ralonsoh :)
15:33:03 ok, let's move on
15:33:08 #topic Grafana
15:33:13 http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:34:47 IMO the worst thing among the voting jobs is neutron-functional-with-uwsgi now
15:34:57 and we have a couple of issues there
15:35:22 and also most of the ovn based jobs are failing 100% of the time
15:36:33 anything else You have regarding grafana in general?
15:36:43 or should we move on to the specific job types?
15:37:37 nothing from me
15:37:49 ok, so let's move on
15:37:57 #topic functional/fullstack
15:38:17 I reported https://bugs.launchpad.net/neutron/+bug/1898859 today
15:38:18 Launchpad bug 1898859 in neutron "Functional test neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase.test_keepalived_spawns_conflicting_pid_vrrp_subprocess is failing" [High,Confirmed]
15:38:33 as I saw it at least twice recently
15:38:51 IIRC we already saw it in the past too but I wasn't sure if we had a bug reported for that already
15:38:59 related to the ns deletion
15:39:07 https://review.opendev.org/#/c/754938/
15:39:15 please, review ^^
15:40:32 ahh, right
15:40:35 now I remember :)
15:40:53 so I will mark https://bugs.launchpad.net/neutron/+bug/1898859 as a duplicate of https://bugs.launchpad.net/neutron/+bug/1838793
15:40:55 Launchpad bug 1898859 in neutron "Functional test neutron.tests.functional.agent.linux.test_keepalived.KeepalivedManagerTestCase.test_keepalived_spawns_conflicting_pid_vrrp_subprocess is failing" [High,Confirmed]
15:40:55 I think you can join both LP bugs
15:40:56 Launchpad bug 1838793 in neutron ""KeepalivedManagerTestCase" tests failing during namespace deletion" [High,Confirmed] - Assigned to Rodolfo Alonso (rodolfo-alonso-hernandez)
15:40:58 yes
15:41:23 lajoskatona: can You check that patch from ralonsoh?
15:41:36 I hope it will help us a bit with this functional tests job :)
15:41:59 slaweq_: sure, I checked it in the past, so I have some background :-)
15:42:08 lajoskatona: thx a lot
15:42:31 and for other issues with functional tests I know that ralonsoh told me that he will open LPs
15:42:47 the one related to the agents
15:42:56 test_agent_show
15:45:01 yes, did You report it already?
15:45:23 not yet
15:45:40 I'm still investigating the error
15:45:47 k
15:47:13 ok, let's move on then
15:47:15 #topic Tempest/Scenario
15:47:35 first, I reported a bug today: https://bugs.launchpad.net/neutron/+bug/1898862
15:47:37 Launchpad bug 1898862 in neutron "Job neutron-ovn-tempest-ovs-release-ipv6-only is failing 100% of times" [High,Confirmed]
15:48:02 because neutron-ovn-tempest-ovs-release-ipv6-only is failing 100% of the time and usually (or always even) there are 9 tests failing there
15:48:11 so it's very reproducible
15:48:46 I will try to ping lucasgomes or jlibosva to take a look at that one
15:49:18 there is also an ovn related issue https://bugs.launchpad.net/neutron/+bug/1885900
15:49:19 Launchpad bug 1885900 in neutron "test_trunk_subport_lifecycle is failing in ovn based jobs" [Critical,Confirmed] - Assigned to Lucas Alvares Gomes (lucasagomes)
15:49:22 which I saw today again
15:50:33 and we still have some random ssh authentication failures
15:50:37 like e.g. https://3b00945aa0cfe70597e9-73e59f2d88a36c349deccf374592c99f.ssl.cf5.rackcdn.com/755752/3/gate/neutron-tempest-linuxbridge/4bbc7f9/testr_results.html
15:50:43 or https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_807/750166/5/gate/neutron-tempest-plugin-scenario-linuxbridge/8073d8a/testr_results.html
15:51:01 and in those cases there is no "pattern", like always the same tests or always the same backend
15:51:06 it happens everywhere
15:51:35 and I tend to think that this is the issue which ralonsoh found some time ago in our d/s ci
15:51:45 with paramiko and some race condition
15:51:47 with paramiko
15:51:49 yes
15:51:59 I couldn't reproduce that locally
15:52:08 but some race is there IMO
15:52:28 once paramiko tries to log into a VM without the keys, even when the keys are installed, the SSH connection is not possible
15:52:55 maybe we can try to check the console log first to see if the ssh key was configured already
15:52:59 before ssh to the instance
15:53:42 if that fails for any reason (e.g. a custom guest os which doesn't log things like cirros does), we can always try ssh at the end
15:53:47 as a "fallback" option
15:53:53 wdyt?
15:54:20 it's worth trying
15:54:23 we can maybe propose that first in neutron-tempest-plugin
15:54:27 worth a try
15:54:31 and if that works, then propose it to tempest too
15:54:52 ok, I will give it a try
15:55:03 (I was doing the opposite: reviewing the paramiko code)
15:55:13 #action slaweq to propose patch to check console log before ssh to instance
15:55:40 ralonsoh: if You find an issue on paramiko's side, we can always revert the workaround from neutron-tempest-plugin :)
15:55:47 of course
15:56:28 ok, I have one more issue related to ovn jobs: https://bugs.launchpad.net/neutron/+bug/1898863
15:56:29 Launchpad bug 1898863 in neutron "OVN based scenario jobs failing 100% of times" [Critical,Confirmed]
15:56:39 did You see that before?
15:56:57 on dstat??
15:57:01 yes
15:57:07 but I saw it only on ovn based jobs
15:57:09 :/
15:57:15 no sorry, that's new to me
15:57:44 ok, does anyone want to take a look at that?
15:58:03 if not then it's also fine for now as it affects "only" non-voting jobs
15:58:44 https://bugs.launchpad.net/ubuntu/+source/dstat/+bug/1866619
15:58:46 Launchpad bug 1866619 in dstat (Ubuntu) "OverflowError when machine suspends and resumes after a longer while" [Undecided,Confirmed]
15:58:52 DistroRelease: Ubuntu 20.04
15:59:23 so we will probably need to disable dstat as a temporary workaround
15:59:28 thx ralonsoh
15:59:31 yes
16:00:03 ok
16:00:09 we are out of time today
16:00:14 thx for attending the meeting
16:00:16 o/
16:00:17 bye!
16:00:19 #endmeeting
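
For reference, a minimal sketch of the "check the console log before SSH, then fall back to a plain SSH attempt" idea discussed above (the #action for slaweq). This is illustration only, not the actual neutron-tempest-plugin or tempest code: the helper callables (get_console_output, ssh_to_instance) and the marker string are assumed names.

    # Sketch only: the caller is expected to pass in get_console_output()
    # and ssh_to_instance() callables (e.g. thin wrappers around the
    # compute servers client and the SSH client); these are not real
    # tempest API names.
    import time

    def wait_for_key_in_console(get_console_output, marker,
                                timeout=300, interval=10):
        """Poll the instance console log until `marker` appears.

        Returns True if the marker (e.g. the line cirros prints once the
        SSH public key has been fetched) shows up before the timeout,
        False otherwise -- for instance with a custom guest image that
        does not log anything to the console.
        """
        deadline = time.time() + timeout
        while time.time() < deadline:
            if marker in get_console_output():
                return True
            time.sleep(interval)
        return False

    def connect(get_console_output, ssh_to_instance, marker):
        # Best effort: try to confirm key injection from the console log
        # first, but attempt SSH either way, which is the "fallback"
        # behaviour mentioned in the discussion.
        wait_for_key_in_console(get_console_output, marker)
        return ssh_to_instance()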