15:00:38 <slaweq> #startmeeting neutron_ci
15:00:39 <openstack> Meeting started Tue Mar 16 15:00:38 2021 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:40 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:42 <openstack> The meeting name has been set to 'neutron_ci'
15:00:43 <slaweq> hi
15:01:57 <slaweq> ralonsoh: lajoskatona: are You going to attend ci meeting?
15:01:58 <ralonsoh> hi
15:02:26 <lajoskatona> Hi, yes, I am here, just left in some half downstream stuff :-)
15:02:32 <slaweq> great
15:02:35 <slaweq> so let's start
15:02:43 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:02:51 <slaweq> #topic Actions from previous meetings
15:02:56 <slaweq> ralonsoh to try to check how to limit number of logged lines in FT output
15:03:06 <ralonsoh> I'm on it (again)
15:03:24 <ralonsoh> I have two places: the DB and the engine facade
15:03:37 <ralonsoh> we can remove the n-lib condition about .session and ._session
15:03:43 <ralonsoh> about the DB... still checking
15:03:48 <slaweq> by engine facade You mean that "deprecation" warning?
15:03:52 <ralonsoh> I'll push both patches
15:03:54 <ralonsoh> yes
15:04:10 <slaweq> https://review.opendev.org/c/openstack/neutron-lib/+/779720
15:04:16 <slaweq> it's there already
15:04:31 <ralonsoh> and I did that before you, some months ago
15:04:36 <slaweq> but we decided to wait with it after release of wallaby
15:04:45 <ralonsoh> but it wasn't working fine
15:04:46 <slaweq> I know, I missed Your patch
15:04:56 <slaweq> https://review.opendev.org/c/openstack/neutron/+/779721
15:05:04 <slaweq> I needed also that ^^ patch in neutron
15:05:53 <ralonsoh> ok, so for X then
15:06:29 <slaweq> yes
15:07:00 <slaweq> but I think that this db issue logged many times is more problematic there
15:07:13 <ralonsoh> yes and I'm still checking this
15:07:43 <slaweq> thx
15:07:51 <slaweq> #action ralonsoh to try to check how to limit number of logged lines in FT output
15:08:00 <slaweq> just a reminder for next week :)
15:10:03 <slaweq> ok, next one
15:10:09 <slaweq> jlibosva to fix collecting ovn logs in functional jobs
15:10:48 <slaweq> I just pinged him, maybe he will join
15:11:24 <jlibosva> o/
15:11:48 <slaweq> hi jlibosva
15:11:55 <slaweq> we are checking actions from last week
15:11:59 <slaweq> jlibosva to fix collecting ovn logs in functional jobs
15:12:14 <jlibosva> yeah, so I found out that functional job doesn't run ovn-controller, we have fake chassis there
15:12:36 <jlibosva> as for the LP bug - I still need to investigate, I didn't get to it. My apologies
15:13:06 <slaweq> ok, will You try to check it this week?
15:13:26 <jlibosva> maybe early next week, I'll try before the next ci mtg
15:13:46 <slaweq> ok, thx
15:14:28 <slaweq> jlibosva: just to be sure, it was related to https://bugs.launchpad.net/neutron/+bug/1918266, right?
15:14:29 <openstack> Launchpad bug 1918266 in neutron "Functional test test_gateway_chassis_rebalance failing due to "failed to bind logical router"" [High,Confirmed]
15:14:36 <jlibosva> slaweq: yes
15:14:44 <slaweq> ok
15:14:52 <slaweq> #action jlibosva to check LP https://bugs.launchpad.net/neutron/+bug/1918266
15:15:04 <slaweq> just as a reminder for next meeting :)
15:15:13 <slaweq> and with that I think we can move on
15:15:22 <slaweq> #topic Stadium projects
15:15:32 <slaweq> lajoskatona: anything new regarding ci of stadium?
15:15:44 <lajoskatona> nothing
15:15:54 <lajoskatona> all is quiet
15:15:59 <slaweq> which is good news :)
15:16:02 <slaweq> thx lajoskatona
15:16:12 <slaweq> #topic Stable branches
15:16:55 <slaweq> I didn't have time today to check stable branches ci
15:17:03 <slaweq> do You have anything related to them?
15:17:09 <ralonsoh> no
15:18:08 <slaweq> ok, so next topic
15:18:11 <slaweq> #topic Grafana
15:18:17 <slaweq> #link http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:20:49 <slaweq> I don't see anything going very bad there
15:21:02 <slaweq> still functional tests are most problematic
15:21:16 <ralonsoh> yes but better than 4 weeks ago
15:21:28 <ralonsoh> we still have some timeout problems
15:21:34 <ralonsoh> but fewer
15:21:37 <slaweq> true
15:21:54 <ralonsoh> (and, I promise, I'm still working on this)
15:22:09 <ralonsoh> but it is quite difficult to reproduce it locally
15:22:31 <slaweq> the issue with sql_connection parameter in functional tests?
15:22:37 <ralonsoh> no no
15:22:41 <ralonsoh> the FT timeouts
15:22:45 <slaweq> ahh, true
15:23:00 <ralonsoh> sql problem is almost trivial
15:23:02 <slaweq> that will be hard to reproduce
15:23:53 <slaweq> IMO You can first fix the issue with sql, and it is very likely that timeouts will be gone
15:25:16 <slaweq> and speaking about functional tests
15:25:25 <slaweq> #topic fullstack/functional
15:25:33 <slaweq> I found a couple of failures last week
15:25:43 <slaweq> first one
15:25:54 <slaweq> timeout while spawning metadata proxy:
15:25:55 <slaweq> https://b5d154aeef4479021ab9-b7d34d92d3c0fa84052da2d7ba8871be.ssl.cf1.rackcdn.com/777535/1/check/neutron-functional-with-uwsgi/bb88e2f/testr_results.html
15:26:00 <slaweq> https://175b7ab9447c4ad3cb55-7660c6eb3fc520d2639a30e0db226704.ssl.cf2.rackcdn.com/765846/7/gate/neutron-functional-with-uwsgi/0099700/testr_results.html
15:26:03 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_334/773283/15/check/neutron-functional-with-uwsgi/334549a/testr_results.html
15:26:06 <slaweq> at least 3 times
15:26:11 <slaweq> did You see it maybe?
15:26:30 <ralonsoh> no
15:26:34 <ralonsoh> but I can check it
15:26:45 <slaweq> thx ralonsoh
15:26:56 <slaweq> do You want me to report LP for that?
15:27:05 <ralonsoh> perfect for me
15:27:07 <slaweq> ok
15:27:14 <slaweq> I will do it just after the meeting
15:27:34 <slaweq> #action ralonsoh to check timeout while spawning metadata proxy in functional tests
15:27:49 <slaweq> next one is timeout while disabling processes:
15:27:55 <slaweq> https://cb02acb070cff5b90918-a2819ce1e1d01d4ba47cf0d26d5585a5.ssl.cf5.rackcdn.com/778993/6/gate/neutron-functional-with-uwsgi/ebebe12/testr_results.html
15:27:57 <slaweq> https://0ee130e4a5a2946be1c4-63ec39ca44f5300255d15df097f93e08.ssl.cf1.rackcdn.com/780054/2/check/neutron-functional-with-uwsgi/bc96480/testr_results.html
15:27:59 <slaweq> https://2593fe2c1676d625fafc-b5f24c1063ad83372f17d0890078a578.ssl.cf5.rackcdn.com/765846/7/check/neutron-functional-with-uwsgi/29b9cdc/testr_results.html
15:28:01 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_b83/763231/16/check/neutron-functional-with-uwsgi/b836250/testr_results.html
15:28:19 <slaweq> ralonsoh: isn't that "second part" of one of the LPs which You were working on?
15:28:36 <ralonsoh> not really
15:28:42 <ralonsoh> that was related to the ns creation
15:29:05 <slaweq> ok
15:29:09 <slaweq> so that is something new
15:29:10 <ralonsoh> but I've seen this message last week many times
15:29:27 <ralonsoh> what I don't understand is how we can timeout killing a process
15:29:38 <ralonsoh> no process will catch this signal
15:30:13 <slaweq> it is calling privsep
15:30:25 <slaweq> so maybe it's some issue in privsep or how we call that lib
15:30:34 <ralonsoh> yes and uses "kill -9"
15:30:48 <ralonsoh> I tried some time ago to use os.kill
15:30:56 <ralonsoh> but it was a disaster
15:31:51 <ralonsoh> ok, now we have moved almost everything to privsep, I'll try again this patch
15:32:01 <slaweq> IMO it is timing out in privsep while waiting for response
15:32:12 <ralonsoh> should work the same as cmd "kill" but natively
15:32:19 <ralonsoh> yes
15:32:20 <slaweq> would be good to try
15:32:22 <slaweq> thx
15:32:26 <ralonsoh> perfect then
15:32:32 <ralonsoh> will you open a LP?
15:32:35 <slaweq> #action ralonsoh to try to move to os.kill
15:32:38 <slaweq> yes, I will
15:32:40 <ralonsoh> perfect
15:32:47 <slaweq> thank You
15:33:02 <slaweq> next one
15:33:06 <slaweq> failure in neutron.tests.functional.agent.l3.test_dvr_router.TestDvrRouter.test_dvr_ha_router_failover_without_g
15:33:18 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_415/779149/2/check/neutron-functional-with-uwsgi/4158175/testr_results.html
15:33:20 <slaweq> https://a1a9878be58cf3e65212-2851babb3178649b1b47ad8e0ca050d1.ssl.cf1.rackcdn.com/780054/2/check/neutron-functional-with-uwsgi/4a7ea78/testr_results.html
15:33:25 <slaweq> I saw it twice
15:33:55 <slaweq> in that case it's waiting for router to switch to "backup"
15:34:11 <slaweq> I can take a look at it
15:34:18 <ralonsoh> thanks
15:34:33 <slaweq> #action slaweq to check failing neutron.tests.functional.agent.l3.test_dvr_router.TestDvrRouter.test_dvr_ha_router_failover_without_gw
15:35:18 <slaweq> other failures which I found were single failures so I listed them in https://etherpad.opendev.org/p/neutron-ci-meetings but I don't think we need to check each of them now
15:35:27 <slaweq> let's see if they will happen more often first
15:36:53 <slaweq> ok, I think we can move on
15:37:02 <slaweq> #topic periodic
15:37:12 <slaweq> neutron-lib from master is causing UT failures: https://bugs.launchpad.net/neutron/+bug/1919280
15:37:15 <openstack> Launchpad bug 1919280 in neutron "UT are failing when runs with neutron-lib master" [Critical,Confirmed] - Assigned to Slawek Kaplonski (slaweq)
15:37:15 <slaweq> this is new issue
15:37:20 <slaweq> and ralonsoh is already on it
15:37:25 <ralonsoh> https://review.opendev.org/c/openstack/neutron/+/780802
15:37:43 <ralonsoh> it has a depends-on because neutron-lib 2.10.0 is not in openstack/requirements
15:37:47 <ralonsoh> but should be the same
15:38:15 <ralonsoh> https://zuul.opendev.org/t/openstack/status#780802
15:38:21 <slaweq> depends-on will not work here
15:38:25 <ralonsoh> I still need to check some UTs
15:38:30 <ralonsoh> why?
15:38:39 <slaweq> in UT jobs neutron-lib isn't installed from source
15:38:43 <slaweq> but from pypi
15:38:48 <ralonsoh> upssss
15:38:51 <ralonsoh> ok then
15:39:02 <ralonsoh> now I understand those 4 UTs failing
15:39:13 <ralonsoh> same as when I executed py38 locally
15:39:17 <slaweq> so You need to bump neutron-lib version in requirements repo and in neutron
15:39:18 <ralonsoh> without patching n-lib
15:39:27 <slaweq> and then it will install new version of neutron-lib
15:39:49 <ralonsoh> I'll push the patch for requirements now
15:39:53 <slaweq> thx
15:40:11 <slaweq> please give me link, I will +1 it asap
15:40:22 <ralonsoh> https://review.opendev.org/c/openstack/requirements/+/780860
15:40:23 <lajoskatona> The releases patch is merged, and the u-c change is under review: https://review.opendev.org/c/openstack/requirements/+/780860
15:40:25 <ralonsoh> already there...
15:41:03 <slaweq> ralonsoh: so please update Your patch to bump neutron-lib version in requirements.txt and lower-constraints.txt
15:41:11 <ralonsoh> sure
15:41:18 <slaweq> and You can make Your patch depend-on https://review.opendev.org/c/openstack/requirements/+/780860
15:41:22 <slaweq> then it should work
15:42:19 <slaweq> thx ralonsoh for taking care of it
15:42:37 <slaweq> and that was last thing from me for today
15:42:44 <slaweq> do You have anything else regarding ci?
15:42:58 <ralonsoh> no
15:43:26 <slaweq> so I will give You some minutes back :)
15:43:30 <slaweq> thx for attending the meeting
15:43:33 <slaweq> o/
15:43:34 <slaweq> #endmeeting
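
Note on the 15:30 discussion of replacing the "kill -9" command with os.kill: the following is only a minimal sketch of what a native kill could look like, not the actual neutron patch. The helper name kill_process is hypothetical, and the privsep wrapping that neutron would need so the call runs with elevated privileges is deliberately left out. It assumes a POSIX host, since signal.SIGKILL is not available on Windows.

```python
import os
import signal


def kill_process(pid, sig=signal.SIGKILL):
    """Send `sig` to `pid` with os.kill instead of spawning the "kill" command.

    Hypothetical helper for illustration only. Returns True if the signal
    was delivered and False if no process with that pid exists any more.
    """
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        # The process already exited; there is nothing left to kill.
        return False
    return True
```

Because os.kill is a direct system call, no child process has to be spawned and waited for. Whether that removes the timeouts seen in the functional jobs is exactly what the action item is meant to find out.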