15:00:34 <slaweq> #startmeeting neutron_ci
15:00:34 <openstack> Meeting started Wed Jul 8 15:00:34 2020 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:35 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:37 <slaweq> hi
15:00:37 <openstack> The meeting name has been set to 'neutron_ci'
15:00:52 <ralonsoh> hi
15:01:02 <njohnston> o/
15:01:46 <bcafarel> o/ sorry somehow my irc client now had #openstack-meeting3 (without "-")
15:02:01 <maciejjozefczyk> \o
15:02:49 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:59 <slaweq> please open link and we can start :)
15:03:06 <slaweq> #topic Actions from previous meetings
15:03:13 <slaweq> ralonsoh will check get_datapath_id issues in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:03:31 <ralonsoh> I didn't find the root cause of the timeout
15:03:44 <ralonsoh> but I've pushed a patch for ovsdbapp
15:03:58 <ralonsoh> to add a bit more information when the txn fails
15:04:08 <ralonsoh> if that depends on the RXN or the TXN queue
15:04:33 <ralonsoh> sorry for that... it's not easy to track those errors
15:05:09 <slaweq> ok, so it's WIP
15:05:12 <slaweq> thx ralonsoh
15:05:18 <ralonsoh> yes
15:05:29 <slaweq> please keep us updated if You find anything
15:05:32 <ralonsoh> sure
15:05:40 <slaweq> btw. do we have LP for that?
15:05:45 <ralonsoh> no
15:05:53 <ralonsoh> just a patch in gerrit
15:06:10 <ralonsoh> If I find another error like this one, I'll open a LP bug
15:06:16 <slaweq> ok, thx
15:06:23 <slaweq> ok, next one
15:06:25 <slaweq> slaweq will check errors with non existing interface in https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_63a/711425/11/check/neutron-functional/63ac4ca/testr_results.html
15:06:32 <slaweq> I did not have time for that one
15:06:44 <slaweq> and logs from the link above are already gone
15:07:05 <slaweq> as I didn't see it later, I will not bother with it anymore for now
15:07:18 <slaweq> if it happens again, I will try to check it
15:07:29 <slaweq> next one
15:07:31 <slaweq> slaweq to move non-voting jobs to experimental queue in EM branches
15:07:37 <slaweq> Rocky: https://review.opendev.org/#/c/739668/
15:07:39 <slaweq> Queens: https://review.opendev.org/#/c/739669/
15:07:41 <slaweq> Pike: https://review.opendev.org/#/c/739910/ (thx bcafarel)
15:07:55 <slaweq> and one more question - what about Ocata?
15:08:04 <slaweq> do we still support this branch?
15:08:15 <slaweq> I think so but I couldn't find any open patch for it
15:08:22 <ralonsoh> that was set as EOL
15:08:26 <ralonsoh> if I'm not wrong
15:08:30 <bcafarel> I am looking for the link, I think it is EOL now
15:09:19 <njohnston> EOL IIRC
15:09:24 <bcafarel> ah "Moving the stable/ocata to 'Unmaintained' phase and then EOL"
15:09:42 <slaweq> I don't see ocata-eol tag in the repo
15:09:57 <bcafarel> maybe the paperwork is not fully completed for https://releases.openstack.org/ (and tags)
15:10:21 <bcafarel> but at least on neutron side we did not see recent backport requests
15:10:44 <njohnston> as I recall there was something about how individual projects can EOL it at their individual discretion
15:10:52 <slaweq> ok, I will check its state after I come back from the PTO
15:11:36 <slaweq> ok, next one
15:11:37 <slaweq> slaweq to investigate failures in test_create_router_set_gateway_with_fixed_ip test
15:11:43 <slaweq> I didn't have time for that one either
15:11:58 <slaweq> it's not very urgent as it happens in non-voting job
15:12:11 <slaweq> but if someone wants to check, feel free to take it ;)
15:12:27 <ralonsoh> I'll ping you tomorrow about this one
15:12:33 <slaweq> ralonsoh: sure, thx
15:12:40 <slaweq> ok, next one
15:12:42 <slaweq> maciejjozefczyk to check neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers failure in ovn jobs
15:13:07 <maciejjozefczyk> I haven't had time to check that one ;/ I'm going to take a look tomorrow
15:14:28 <slaweq> ok
15:14:36 <slaweq> I will assign it to You for next week
15:14:40 <slaweq> #action maciejjozefczyk to check neutron_tempest_plugin.scenario.test_connectivity.NetworkConnectivityTest.test_connectivity_through_2_routers failure in ovn jobs
15:14:58 <ralonsoh> (busy week for everyone)
15:15:02 <slaweq> yeah
15:15:07 <slaweq> as usual
15:15:21 <slaweq> lets move to the next topic
15:15:23 <slaweq> #topic Stadium projects
15:15:29 <slaweq> zuul v3 migration
15:15:51 <slaweq> I finally pushed patch for last missing neutron job: https://review.opendev.org/729591 - lets see how it will work
15:16:09 <ralonsoh> cool!
15:16:16 <slaweq> it still hasn't even started
15:16:27 <slaweq> but I will keep an eye on it
15:16:31 <bcafarel> it will before your PTO :)
15:16:35 <ralonsoh> hehehe
15:16:56 <slaweq> bcafarel: I hope so
15:16:58 <slaweq> :)
15:17:09 <slaweq> ok, anything else regarding stadium for today?
15:19:52 <slaweq> ok, I guess it means "no"
15:19:56 <slaweq> so lets move on
15:19:58 <slaweq> #topic Switch to Ubuntu Focal
15:20:01 <slaweq> any updates on that?
15:20:21 <ralonsoh> devstack patch was merged
15:20:22 <ralonsoh> https://review.opendev.org/#/c/704831/
15:20:41 <ralonsoh> but https://review.opendev.org/#/c/734304 is failing
15:20:59 <ralonsoh> I need to review the logs for FT/fullstack
15:21:33 <maciejjozefczyk> we need this one first to land https://review.opendev.org/#/c/737984/
15:22:05 <bcafarel> maciejjozefczyk: will it fix errors like those seen in https://review.opendev.org/#/c/738163/ ?
15:22:07 <bcafarel> Exception: Invalid directories: /usr/local/share/ovn, /usr/share/ovn, /usr/local/share/openvswitch, /usr/share/openvswitch, None
15:22:52 <bcafarel> (some other jobs failing too, but I did not have time to dig further)
15:23:16 <ralonsoh> maciejjozefczyk, but apart from your patch, FTs/fullstack are failing because in Focal python-openvswitch does not exist
15:23:17 <ralonsoh> E: Unable to locate package python-openvswitch
15:23:31 <slaweq> :/
15:23:45 <slaweq> should we ask someone from Ubuntu team for help with that?
15:24:04 <ralonsoh> we should, yes
15:24:07 <maciejjozefczyk> ahh
15:24:14 <maciejjozefczyk> python3-?
15:24:28 <ralonsoh> maciejjozefczyk, I'll check that today
15:24:37 <bcafarel> ralonsoh: I have a quick hack to pull python3- for test in 738163 (hardcoded just for testing)
15:26:12 <slaweq> ok, so package name was renamed to python3-openvswitch in Focal? Is that correct?
15:26:31 <ralonsoh> I need to check that
15:26:40 <bcafarel> yup
15:26:46 <slaweq> ok, thx ralonsoh and bcafarel for working on this
15:27:26 <slaweq> I think we can move on to the next topic
15:27:29 <slaweq> #topic Stable branches
15:27:35 <slaweq> Ussuri dashboard: http://grafana.openstack.org/d/pM54U-Kiz/neutron-failure-rate-previous-stable-release?orgId=1
15:27:37 <slaweq> Train dashboard: http://grafana.openstack.org/d/dCFVU-Kik/neutron-failure-rate-older-stable-release?orgId=1
15:27:51 <bcafarel> stable branches maybe, stable ci not these days
15:28:06 <slaweq> LOL
15:28:21 <maciejjozefczyk> ;/
15:28:30 <slaweq> bcafarel: so we can say (not so)stable branches
15:28:32 <slaweq> ok?
15:28:34 <slaweq> :D
15:28:41 <bcafarel> :) approved!
15:28:47 <bcafarel> for the previous issues we are almost good now finally (pike https://review.opendev.org/#/c/739456/ waiting on second +2)
15:29:55 <bcafarel> but new pep8 error (isort) needs a requirements backport which failed CI https://review.opendev.org/#/c/739912
15:29:59 <slaweq> bcafarel: +W
15:30:32 <slaweq> and we need same backport all the way to stein, right?
15:30:50 <bcafarel> apparently yes
15:30:56 <slaweq> :/
15:31:37 <slaweq> at least we are good with pep8 issue in master now :)
15:31:49 <ralonsoh> what are those errors in py36 tests?
15:31:58 <ralonsoh> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_513/739912/1/check/openstack-tox-py36/51376c8/testr_results.html
15:32:39 <bcafarel> yes I am not sure where they come from (but appear in both py36/py37 and all other backports)
15:33:53 <ralonsoh> please, keep an eye on https://github.com/PyCQA/pylint/issues/3722
15:34:14 <slaweq> maybe yet another ci blocker issue (because we still have not enough) ;)
15:35:16 <slaweq> ok, lets move on
15:35:17 <slaweq> #topic Grafana
15:35:24 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:36:44 <slaweq> I see one thing which worries me a bit
15:36:53 <slaweq> neutron-ovn-tempest-full-multinode-ovs-master is failing 100% of the time for some time now
15:37:04 <slaweq> I opened LP for that https://bugs.launchpad.net/neutron/+bug/1886807
15:37:04 <openstack> Launchpad bug 1886807 in neutron "neutron-ovn-tempest-full-multinode-ovs-master job is failing 100% times" [High,Confirmed]
15:37:41 <slaweq> idk if that is job configuration or something like that but seems that in many tests it's failing due to ssh issues
15:38:50 <slaweq> other than that, I think that things are more or less ok in master branch now
15:38:56 <maciejjozefczyk> yes that's strange because I don't remember anything that could break this... We also didn't change the ovn branch
15:38:58 <maciejjozefczyk> yes
15:39:16 <maciejjozefczyk> I'm gonna take a look tomorrow what was pushed to ovn master recently
15:39:28 <slaweq> maciejjozefczyk: at first glance I suspect some issue with infra, it's multinode job and maybe ssh to vms which land on the second node doesn't work
15:39:41 <slaweq> that's something I would try to check first
15:39:45 <maciejjozefczyk> yep
15:40:09 <slaweq> btw https://review.opendev.org/737984 just merged :)
15:40:25 <slaweq> #action maciejjozefczyk to check failing neutron-ovn-tempest-full-multinode-ovs-master job
15:40:34 <slaweq> thx maciejjozefczyk for looking into that
15:40:38 <maciejjozefczyk> ack
15:40:39 <maciejjozefczyk> ;)
15:41:17 <slaweq> ok, lets move on
15:41:19 <slaweq> #topic fullstack/functional
15:41:29 <slaweq> I found one new (for me at least) issue
15:41:34 <slaweq> neutron.tests.functional.test_server.TestWsgiServer.test_restart_wsgi_on_sighup_multiple_workers
15:41:38 <slaweq> https://f7a63aeb9edd557a2176-4740624f0848c8c3257f704064a4516f.ssl.cf2.rackcdn.com/736026/4/gate/neutron-functional/d7d5c47/testr_results.html
15:41:50 <slaweq> did You see something like that recently?
15:43:20 <bcafarel> not that I remember (and that code was not touched recently)
15:43:39 <ralonsoh> the worker didn't restart, isn't it?
15:43:42 <slaweq> bcafarel: yes, but I saw it I think twice this week
15:44:32 <slaweq> hmm
15:44:39 <slaweq> I see such error in the test logs: https://f7a63aeb9edd557a2176-4740624f0848c8c3257f704064a4516f.ssl.cf2.rackcdn.com/736026/4/gate/neutron-functional/d7d5c47/controller/logs/dsvm-functional-logs/neutron.tests.functional.test_server.TestWsgiServer.test_restart_wsgi_on_sighup_multiple_workers.txt
15:44:56 <slaweq> do You think it may be related?
15:46:39 <slaweq> actually I think I saw something similar in the past
15:46:51 <slaweq> when You look at the test's code: https://github.com/openstack/neutron/blob/master/neutron/tests/functional/test_server.py#L163
15:47:13 <slaweq> it failed on the condition which checks if some specific file was created and had the expected size
15:47:33 <slaweq> so it could be that the size was wrong or the file was not there at all
15:47:46 <slaweq> I will change that condition to add some logging to it
15:47:55 <slaweq> so it will be easier to check what happened there
15:49:34 <njohnston> +1
15:49:36 <slaweq> #action slaweq to change condition in the TestNeutronServer to have better logging
15:50:09 <slaweq> ok, we can get back to this issue when we know more about what happened there
15:50:18 <bcafarel> +1
15:50:26 <slaweq> lets move on
15:50:28 <slaweq> #topic Tempest/Scenario
15:50:45 <slaweq> I had only this issue related to ovn job but we talked about it already
15:50:54 <slaweq> so just 2 short info
15:51:00 <slaweq> and ask for reviews :)
15:51:40 <slaweq> Please review https://review.opendev.org/#/c/736186/
15:51:49 <slaweq> I had to rebase it to resolve some conflict there
15:52:01 <slaweq> and also I increased timeouts in singlenode tempest jobs: https://review.opendev.org/739955
15:52:06 <slaweq> please take a look at this one too
15:52:34 <slaweq> and that's all from me about scenario jobs
15:52:44 <slaweq> #topic On demand agenda
15:53:00 <slaweq> do You have anything else You want to talk about today?
15:53:18 <ralonsoh> no thanks
15:53:40 <slaweq> I have one quick item
15:53:52 <slaweq> as You know next 2 weeks I will be on PTO
15:54:01 <maciejjozefczyk> slaweq, enjoy :)
15:54:06 <slaweq> do You want me to cancel this ci meeting or ralonsoh can You chair it?
15:54:17 <ralonsoh> I can (probably)
15:54:32 <slaweq> I'm asking You because You are listed as co-chair of this meeting already :)
15:54:43 <slaweq> ralonsoh: thx a lot
15:55:05 <slaweq> so I will not cancel it, please run it if You have time or cancel if You want
15:55:15 <slaweq> that's all from my side
15:55:24 <slaweq> thx for attending today
15:55:30 <slaweq> and have a great rest of the week
15:55:33 <ralonsoh> bye
15:55:34 <slaweq> #endmeeting
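For context on the TestNeutronServer action item discussed at 15:47: the failing check polls until a temp file written by the spawned workers exists and reaches an expected size, and the plan is to log what was actually observed whenever that check returns False. A minimal sketch of what such a polled condition with logging could look like, using hypothetical names (check_workers_active, temp_file, expected_size) rather than the actual code in neutron/tests/functional/test_server.py:

```python
# Illustrative sketch only -- not the real neutron test code.
import logging
import os

LOG = logging.getLogger(__name__)


def check_workers_active(temp_file, expected_size):
    """Return True once temp_file exists and has grown to expected_size.

    Instead of silently returning False (which only surfaces later as a
    wait_until_true timeout), log what was actually observed so the CI
    logs show whether the file was missing or just had the wrong size.
    """
    if not os.path.isfile(temp_file):
        LOG.debug("Marker file %s does not exist yet", temp_file)
        return False
    size = os.path.getsize(temp_file)
    if size != expected_size:
        LOG.debug("Marker file %s has size %d, expected %d",
                  temp_file, size, expected_size)
        return False
    return True
```

With a debug line in the polled condition like this, a timed-out wait would show in the functional test logs whether the file was never created or merely too short, which is exactly the ambiguity slaweq describes above.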