15:00:38 <slaweq> #startmeeting neutron_ci
15:00:39 <openstack> Meeting started Tue Mar 16 15:00:38 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:42 <openstack> The meeting name has been set to 'neutron_ci'
15:00:43 <slaweq> hi
15:01:57 <slaweq> ralonsoh: lajoskatona: are You going to attend ci meeting?
15:01:58 <ralonsoh> hi
15:02:26 <lajoskatona> Hi, yes, I am here, just left in some half downstream stuff :-)
15:02:32 <slaweq> great
15:02:35 <slaweq> so let's start
15:02:43 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:51 <slaweq> #topic Actions from previous meetings
15:02:56 <slaweq> ralonsoh to try to check how to limit number of logged lines in FT output
15:03:06 <ralonsoh> I'm on it (again)
15:03:24 <ralonsoh> I have two places: the DB and the engine facade
15:03:37 <ralonsoh> we can remove the n-lib condition about .session and ._session
15:03:43 <ralonsoh> about the DB... still checking
15:03:48 <slaweq> by engine facade You mean that "deprecation" warning?
15:03:52 <ralonsoh> I'll push both patches
15:03:54 <ralonsoh> yes
15:04:10 <slaweq> https://review.opendev.org/c/openstack/neutron-lib/+/779720
15:04:16 <slaweq> it's there already
15:04:31 <ralonsoh> and I did that before you, some months ago
15:04:36 <slaweq> but we decided to wait with it after release of wallaby
15:04:45 <ralonsoh> but it wasn't working fine
15:04:46 <slaweq> I know, I missed Your patch
15:04:56 <slaweq> https://review.opendev.org/c/openstack/neutron/+/779721
15:05:04 <slaweq> I needed also that ^^ patch in neutron
15:05:53 <ralonsoh> ok, so for X then
15:06:29 <slaweq> yes
15:07:00 <slaweq> but I think that this db issue logged many times is more problematic there
15:07:13 <ralonsoh> yes and I'm still checking this
15:07:43 <slaweq> thx
15:07:51 <slaweq> #action ralonsoh to try to check how to limit number of logged lines in FT output
15:08:00 <slaweq> just a reminder for next week :)
15:10:03 <slaweq> ok, next one
15:10:09 <slaweq> jlibosva to fix collecting ovn logs in functional jobs
15:10:48 <slaweq> I just pinged him, maybe he will join
15:11:24 <jlibosva> o/
15:11:48 <slaweq> hi jlibosva
15:11:55 <slaweq> we are checking actions from last week
15:11:59 <slaweq> jlibosva to fix collecting ovn logs in functional jobs
15:12:14 <jlibosva> yeah, so I found out that functional job doesn't run ovn-controller, we have fake chassis there
15:12:36 <jlibosva> as for the LP bug - I still need to investigate, I didn't get to it. My apologies
15:13:06 <slaweq> ok, will You try to check it this week?
15:13:26 <jlibosva> maybe early next week, I'll try before the next ci mtg
15:13:46 <slaweq> ok, thx
15:14:28 <slaweq> jlibosva: just to be sure, it was related to https://bugs.launchpad.net/neutron/+bug/1918266, right?
15:14:29 <openstack> Launchpad bug 1918266 in neutron "Functional test test_gateway_chassis_rebalance failing due to "failed to bind logical router"" [High,Confirmed]
15:14:36 <jlibosva> slaweq: yes
15:14:44 <slaweq> ok
15:14:52 <slaweq> #action jlibosva to check LP https://bugs.launchpad.net/neutron/+bug/1918266
15:15:04 <slaweq> just as a reminder for next meeting :)
15:15:13 <slaweq> and with that I think we can move on
15:15:22 <slaweq> #topic Stadium projects
15:15:32 <slaweq> lajoskatona: anything new regarding ci of stadium?
15:15:44 <lajoskatona> nothing
15:15:54 <lajoskatona> all is quiet
15:15:59 <slaweq> which is good news :)
15:16:02 <slaweq> thx lajoskatona
15:16:12 <slaweq> #topic Stable branches
15:16:55 <slaweq> I didn't have time today to check stable branches ci
15:17:03 <slaweq> do You have anything related to them?
15:17:09 <ralonsoh> no
15:18:08 <slaweq> ok, so next topic
15:18:11 <slaweq> #topic Grafana
15:18:17 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:20:49 <slaweq> I don't see anything going very bad there
15:21:02 <slaweq> still functional tests are most problematic
15:21:16 <ralonsoh> yes but better than 4 weeks ago
15:21:28 <ralonsoh> we still have some timeout problems
15:21:34 <ralonsoh> but fewer
15:21:37 <slaweq> true
15:21:54 <ralonsoh> (and, I promise, I'm still working on this)
15:22:09 <ralonsoh> but it is quite difficult to reproduce it locally
15:22:31 <slaweq> the issue with sql_connection parameter in functional tests?
15:22:37 <ralonsoh> no no
15:22:41 <ralonsoh> the FT timeouts
15:22:45 <slaweq> ahh, true
15:23:00 <ralonsoh> sql problem is almost trivial
15:23:02 <slaweq> that will be hard to reproduce
15:23:53 <slaweq> IMO You can first fix the issue with sql, and it is very likely that timeouts will be gone
15:25:16 <slaweq> and speaking about functional tests
15:25:25 <slaweq> #topic fullstack/functional
15:25:33 <slaweq> I found a couple of failures last week
15:25:43 <slaweq> first one
15:25:54 <slaweq> timeout while spawning metadata proxy:
15:25:55 <slaweq> https://b5d154aeef4479021ab9-b7d34d92d3c0fa84052da2d7ba8871be.ssl.cf1.rackcdn.com/777535/1/check/neutron-functional-with-uwsgi/bb88e2f/testr_results.html
15:26:00 <slaweq> https://175b7ab9447c4ad3cb55-7660c6eb3fc520d2639a30e0db226704.ssl.cf2.rackcdn.com/765846/7/gate/neutron-functional-with-uwsgi/0099700/testr_results.html
15:26:03 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_334/773283/15/check/neutron-functional-with-uwsgi/334549a/testr_results.html
15:26:06 <slaweq> at least 3 times
15:26:11 <slaweq> did You see it maybe?
15:26:30 <ralonsoh> no
15:26:34 <ralonsoh> but I can check it
15:26:45 <slaweq> thx ralonsoh
15:26:56 <slaweq> do You want me to report LP for that?
15:27:05 <ralonsoh> perfect for me
15:27:07 <slaweq> ok
15:27:14 <slaweq> I will do it just after the meeting
15:27:34 <slaweq> #action ralonsoh to check timeout while spawning metadata proxy in functional tests
15:27:49 <slaweq> next one is timeout while disabling processes:
15:27:55 <slaweq> https://cb02acb070cff5b90918-a2819ce1e1d01d4ba47cf0d26d5585a5.ssl.cf5.rackcdn.com/778993/6/gate/neutron-functional-with-uwsgi/ebebe12/testr_results.html
15:27:57 <slaweq> https://0ee130e4a5a2946be1c4-63ec39ca44f5300255d15df097f93e08.ssl.cf1.rackcdn.com/780054/2/check/neutron-functional-with-uwsgi/bc96480/testr_results.html
15:27:59 <slaweq> https://2593fe2c1676d625fafc-b5f24c1063ad83372f17d0890078a578.ssl.cf5.rackcdn.com/765846/7/check/neutron-functional-with-uwsgi/29b9cdc/testr_results.html
15:28:01 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_b83/763231/16/check/neutron-functional-with-uwsgi/b836250/testr_results.html
15:28:19 <slaweq> ralonsoh: isn't that "second part" of one of the LPs which You were working on?
15:28:36 <ralonsoh> not really
15:28:42 <ralonsoh> that was related to the ns creation
15:29:05 <slaweq> ok
15:29:09 <slaweq> so that is something new
15:29:10 <ralonsoh> but I've seen this message last week many times
15:29:27 <ralonsoh> what I don't understand is how we can timeout killing a process
15:29:38 <ralonsoh> no process will catch this signal
15:30:13 <slaweq> it is calling privsep
15:30:25 <slaweq> so maybe it's some issue in privsep or how we call that lib
15:30:34 <ralonsoh> yes and uses "kill -9"
15:30:48 <ralonsoh> I tried some time ago to use os.kill
15:30:56 <ralonsoh> but it was a disaster
15:31:51 <ralonsoh> ok, now we have moved almost everything to privsep, I'll try again this patch
15:32:01 <slaweq> IMO it is timing out in privsep while waiting for a response
15:32:12 <ralonsoh> should work the same as cmd "kill" but natively
15:32:19 <ralonsoh> yes
15:32:20 <slaweq> would be good to try
15:32:22 <slaweq> thx
15:32:26 <ralonsoh> perfect then
15:32:32 <ralonsoh> will you open a LP?
15:32:35 <slaweq> #action ralonsoh to try to move to os.kill
15:32:38 <slaweq> yes, I will
15:32:40 <ralonsoh> perfect
15:32:47 <slaweq> thank You
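A minimal sketch of the idea discussed above: sending SIGKILL natively with os.kill instead of spawning the "kill -9" command. This is only an illustration, not the neutron patch; the helper name kill_process is hypothetical, a POSIX system is assumed, and the privsep wrapping used in neutron is left out.

```python
import os
import signal
import subprocess
import sys


def kill_process(pid: int, sig: int = signal.SIGKILL) -> bool:
    """Send `sig` to `pid` via os.kill (hypothetical helper).

    Returns False if no such process exists, True if the signal was sent.
    """
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        # The process is already gone, so there is nothing left to kill.
        return False
    return True


if __name__ == "__main__":
    # Start a throwaway child process that would otherwise sleep for a minute.
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    kill_process(child.pid)
    # Reap the child so it does not linger as a zombie.
    child.wait(timeout=10)
```

Because no external "kill" binary is executed, the call returns as soon as the kernel accepts the signal; whether that removes the timeout seen in the functional tests is exactly what the action item is meant to find out.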
15:33:02 <slaweq> next one
15:33:06 <slaweq> failure in neutron.tests.functional.agent.l3.test_dvr_router.TestDvrRouter.test_dvr_ha_router_failover_without_g
15:33:18 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_415/779149/2/check/neutron-functional-with-uwsgi/4158175/testr_results.html
15:33:20 <slaweq> https://a1a9878be58cf3e65212-2851babb3178649b1b47ad8e0ca050d1.ssl.cf1.rackcdn.com/780054/2/check/neutron-functional-with-uwsgi/4a7ea78/testr_results.html
15:33:25 <slaweq> I saw it twice
15:33:55 <slaweq> in that case it's waiting for router to switch to "backup"
15:34:11 <slaweq> I can take a look at it
15:34:18 <ralonsoh> thanks
15:34:33 <slaweq> #action slaweq to check failing neutron.tests.functional.agent.l3.test_dvr_router.TestDvrRouter.test_dvr_ha_router_failover_without_gw
15:35:18 <slaweq> other failures which I found were single failures so I listed them in https://etherpad.opendev.org/p/neutron-ci-meetings but I don't think we need to check each of them now
15:35:27 <slaweq> let's see if they will happen more often first
15:36:53 <slaweq> ok, I think we can move on
15:37:02 <slaweq> #topic periodic
15:37:12 <slaweq> neutron-lib from master is causing UT failures: https://bugs.launchpad.net/neutron/+bug/1919280
15:37:15 <openstack> Launchpad bug 1919280 in neutron "UT are failing when runs with neutron-lib master" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
15:37:15 <slaweq> this is a new issue
15:37:20 <slaweq> and ralonsoh is already on it
15:37:25 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/780802
15:37:43 <ralonsoh> it has a depends-on because neutron-lib 2.10.0 is not in openstack/requirements
15:37:47 <ralonsoh> but should be the same
15:38:15 <ralonsoh> https://zuul.opendev.org/t/openstack/status#780802
15:38:21 <slaweq> depends-on will not work here
15:38:25 <ralonsoh> I still need to check some UTs
15:38:30 <ralonsoh> why?
15:38:39 <slaweq> in UT jobs neutron-lib isn't installed from source
15:38:43 <slaweq> but from pypi
15:38:48 <ralonsoh> upssss
15:38:51 <ralonsoh> ok then
15:39:02 <ralonsoh> now I understand those 4 UTs failing
15:39:13 <ralonsoh> same as when I executed py38 locally
15:39:17 <slaweq> so You need to bump neutron-lib version in requirements repo and in neutron
15:39:18 <ralonsoh> without patching n-lib
15:39:27 <slaweq> and then it will install new version of neutron-lib
15:39:49 <ralonsoh> I'll push the patch for requirements now
15:39:53 <slaweq> thx
15:40:11 <slaweq> please give me link, I will +1 it asap
15:40:22 <ralonsoh> https://review.opendev.org/c/openstack/requirements/+/780860
15:40:23 <lajoskatona> The releases patch is merged, and the u-c change is under review: https://review.opendev.org/c/openstack/requirements/+/780860
15:40:25 <ralonsoh> already there...
15:41:03 <slaweq> ralonsoh: so please update Your patch to bump neutron-lib version in requirements.txt and lower-constraints.txt
15:41:11 <ralonsoh> sure
15:41:18 <slaweq> and You can make Your patch depend-on https://review.opendev.org/c/openstack/requirements/+/780860
15:41:22 <slaweq> then it should work
15:42:19 <slaweq> thx ralonsoh for taking care of it
15:42:37 <slaweq> and that was last thing from me for today
15:42:44 <slaweq> do You have anything else regarding ci?
15:42:58 <ralonsoh> no
15:43:26 <slaweq> so I will give You some minutes back :)
15:43:30 <slaweq> thx for attending the meeting
15:43:33 <slaweq> o/
15:43:34 <slaweq> #endmeeting