15:00:58 <slaweq> #startmeeting neutron_ci
15:00:58 <opendevmeet> Meeting started Tue Aug 10 15:00:58 2021 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:58 <opendevmeet> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:58 <opendevmeet> The meeting name has been set to 'neutron_ci'
15:02:39 <obondarev> hi
15:02:47 <slaweq> hi
15:02:59 <slaweq> let's wait a few more minutes for lajoskatona
15:03:08 <slaweq> I know ralonsoh_ will be late for the meeting today
15:03:12 <lajoskatona> Hi
15:03:13 <slaweq> and bcafarel is on pto
15:03:15 <slaweq> hi lajoskatona
15:03:21 <slaweq> so I think we can start now
15:03:31 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:03:39 <slaweq> #topic Actions from previous meetings
15:03:44 <slaweq> slaweq to check networking-bgpvpn stable branches failures
15:03:51 <slaweq> I checked those failures
15:04:12 <slaweq> and the problem is that horizon is already EOL in queens and pike
15:04:36 <lajoskatona> as I remember your patch using the tag for horizon was not enough?
15:04:36 <slaweq> I tried to use the last eol tag from horizon in the networking-bgpvpn ci jobs
15:04:43 <slaweq> but there were some other issues as well
15:04:50 <slaweq> so I proposed http://lists.openstack.org/pipermail/openstack-discuss/2021-August/024014.html
15:06:29 <lajoskatona> We have to wait for answers if somebody still needs the branch?
15:07:39 <slaweq> lajoskatona: yes, that's the official procedure
15:07:46 <slaweq> so I wanted to wait until the end of this week
15:07:49 <lajoskatona> ok
15:09:08 <slaweq> ok, next one
15:09:23 <slaweq> bcafarel to check vpnaas failures in stable branches
15:09:32 <slaweq> He started it, and progress is tracked in https://etherpad.opendev.org/p/neutron-periodic-stable-202107
15:10:02 <slaweq> and those were all actions from last week
15:10:06 <slaweq> next topic
15:10:11 <slaweq> #topic Stadium projects
15:10:15 <slaweq> lajoskatona: any updates?
15:10:51 <lajoskatona> yeah I collected a few patches which wait for review
15:11:02 <lajoskatona> https://paste.opendev.org/show/807983/
15:11:30 <lajoskatona> I ran through them, mostly simple ones on master or some on stable branches
15:11:56 <slaweq> thx
15:12:05 <lajoskatona> I tried to get them into shape (rebase, whatever....), so please check them if you have some spare time
15:12:05 <slaweq> I will go through them tomorrow morning
15:12:15 <slaweq> for sure
15:16:09 <slaweq> I think we can move on then, right?
15:16:31 <slaweq> I will skip stable branches today as we don't have bcafarel here
15:16:59 <lajoskatona> ack
15:18:06 <slaweq> #topic Grafana
15:18:39 <slaweq> more or less it looks fine to me
15:19:10 <slaweq> do You see anything critical there?
15:20:45 <slaweq> let's move on
15:20:52 <slaweq> #topic fullstack/functional
15:21:18 <slaweq> in most cases I saw failures related to the ovn bug which obondarev already mentioned earlier today
15:21:31 <slaweq> and for which jlibosva proposed some extra logs already
15:21:39 <jlibosva> o/
15:21:47 <lajoskatona> https://bugs.launchpad.net/neutron/+bug/1938766
15:22:03 <slaweq> yes, that one
15:22:39 <jlibosva> the extra logging patch is here: https://review.opendev.org/c/openstack/neutron/+/803936
15:22:42 <slaweq> jlibosva: maybe You can try to recheck it a couple of times before we merge it?
15:22:47 <jlibosva> slaweq: yeah, I can
15:23:02 <ralonsoh> hi
15:23:06 <slaweq> or do You think it would be useful to have those logs there for the future?
15:23:12 <slaweq> hi ralonsoh :)
15:23:34 <jlibosva> I don't think it will harm to have it, it's a log message that happens once for each test
15:23:52 <slaweq> good for me
15:24:07 <slaweq> so let's merge it and we will for sure get some failure soon :)
15:26:03 <slaweq> from other issues in the functional job, I found one failure of neutron.tests.functional.agent.l3.test_ha_router.L3HATestCase.test_ipv6_router_advts_and_fwd_after_router_state_change_backup
15:26:10 <slaweq> https://a1fab4006c6a1daf82f2-bd8cbc347d913753596edf9ef5797d55.ssl.cf1.rackcdn.com/786478/17/check/neutron-functional-with-uwsgi/7250dcf/testr_results.html
15:26:27 <slaweq> and TBH I think I saw similar issues in the past already
15:26:31 <slaweq> so I will report an LP bug for that
15:26:41 <slaweq> and I will try to investigate it if I have some time
15:27:02 <slaweq> #action slaweq to report failure in test_ipv6_router_advts_and_fwd_after_router_state_change_backup functional test
15:27:15 <slaweq> or maybe do You already know that issue? :)
15:27:34 <ralonsoh> I don't, sorry
15:27:54 <obondarev> me neither
15:28:13 <slaweq> ok, I will report it and we will see :)
15:28:18 <slaweq> next topic :)
15:28:22 <slaweq> #topic Tempest/Scenario
15:28:30 <slaweq> here I just have one new issue
15:28:36 <slaweq> https://storage.gra.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_757/803462/2/check/neutron-tempest-plugin-scenario-openvswitch/7575466/controller/logs/screen-n-api.txt
15:28:50 <slaweq> it seems that the mysql server was killed by the oom-killer
15:29:05 <ralonsoh> same as last week in fullstack
15:29:09 <slaweq> so something similar to what we see in fullstack jobs recently
15:29:14 <ralonsoh> yeah
15:29:32 <slaweq> in fullstack I thought it's because we are spawning many neutron processes at once
15:29:33 <ralonsoh> (we need bigger VMs...)
15:29:47 <slaweq> each test has its own neutron-server, agents etc
15:30:19 <lajoskatona> for tempest even decreasing concurrency will not help, or am I wrong?
15:30:30 <slaweq> but now, as I saw a similar issue in the scenario job, my question is: should we have bigger VMs or do we maybe have issues with memory consumption by our processes?
15:30:42 <slaweq> lajoskatona: yes, it won't help for sure for tempest
15:30:53 <slaweq> for now I saw it only once
15:31:09 <ralonsoh> we are always on the limit
15:31:11 <slaweq> but I wanted to raise it here so all of You will be aware of it
15:31:37 <ralonsoh> I think we have increased the memory consumption when I pushed the patches for new privsep contexts
15:31:45 <ralonsoh> that creates new daemons per context
15:31:56 <ralonsoh> that's something needed to segregate the permissions
15:32:06 <ralonsoh> but the memory consumption is the drawback
15:32:30 <slaweq> maybe we should raise this topic on the ML for a wider audience?
15:32:52 <slaweq> maybe we should try to use slightly bigger vms now?
15:33:03 <ralonsoh> that could help
15:33:11 <ralonsoh> the vms have 8GB, right?
15:33:15 <slaweq> I think so
15:33:27 <ralonsoh> 10-12 could be perfect for us
15:33:41 <ralonsoh> I'll write the mail today
15:33:48 <slaweq> I know it would help - but I don't want to raise it and get an answer like "you should optimize Your software and not always request more memory" :)
15:33:51 <ralonsoh> pointing to this conversation
15:33:56 <slaweq> You know what I mean I hope
15:33:58 <opendevreview> Mamatisa Nurmatov proposed openstack/neutron master: use payloads for FLOATING_IP https://review.opendev.org/c/openstack/neutron/+/801874
15:34:08 <slaweq> that's why I raised it here also
15:34:34 <slaweq> so maybe we should first prepare some analysis or explanation of why we think the vms should be bigger :)
15:34:40 <lajoskatona> If it is related to privsep it can be a common problem as more and more projects change to it
15:34:49 <slaweq> yes, that's true
15:35:10 <slaweq> and that is IMO a good reason why we would want to do such a change
15:35:22 <slaweq> thx ralonsoh for volunteering to write the email about it
15:35:27 <ralonsoh> yw
15:35:43 <slaweq> #action ralonsoh to send email about memory in CI vms
15:36:22 <slaweq> ok, last topic for today
15:36:28 <slaweq> #topic Periodic
15:36:46 <slaweq> here I have only one thing
15:36:55 <slaweq> our neutron-ovn-tempest-ovs-master-fedora is broken again
15:37:02 <ralonsoh> pffff
15:37:13 <slaweq> and it seems that nodes were updated to fedora 34 which isn't supported by devstack yet
15:37:19 <slaweq> https://zuul.openstack.org/build/d974a9b6e1854d21a30e0c541ff56cc4
15:37:31 <slaweq> I can check how to fix that
15:37:56 <slaweq> unless anyone else wants to work on it :)
15:38:03 <ralonsoh> I can
15:38:14 <slaweq> if You have time :)
15:38:15 <ralonsoh> I'll create a VM with f34
15:38:27 <slaweq> thx ralonsoh
15:38:28 <ralonsoh> well, I'll try to stack with a f34 vm
15:38:46 <slaweq> #action ralonsoh to check neutron-ovn-tempest-ovs-master-fedora job failures
15:38:48 <ralonsoh> is there an LP bug?
15:38:51 <ralonsoh> just asking
15:39:01 <slaweq> ralonsoh: no, there's no bug reported for that
15:39:08 <ralonsoh> perfect, I'll do it
15:39:12 <slaweq> ++
15:39:14 <slaweq> thx
15:39:36 <slaweq> that was my last topic for today
15:39:45 <slaweq> do You have anything else You want to discuss?
15:39:52 <slaweq> or if not, we can finish earlier today
15:39:54 <obondarev> for the dvr-ha check job - it's been pretty stable since last week's DVR fix - may we consider making it voting?
15:40:06 <obondarev> just to prevent further DVR-HA regressions
15:40:13 <ralonsoh> +1
15:40:41 <slaweq> https://grafana.opendev.org/d/BmiopeEMz/neutron-failure-rate?viewPanel=18&orgId=1
15:40:48 <slaweq> indeed it seems like it is more stable now
15:40:58 <slaweq> we can try that
15:41:06 <slaweq> worst case, we can always revert it :)
15:41:15 <slaweq> obondarev: will You propose that?
15:41:30 <obondarev> sounds good :) yeah I will
15:41:40 <slaweq> thx a lot
15:41:45 <obondarev> sure
15:41:52 <slaweq> #action obondarev to promote dvr-ha job to be voting
15:43:15 <slaweq> ok, so if that's all for today, let's finish it earlier
15:43:21 <slaweq> thx for attending the meeting
15:43:25 <slaweq> #endmeeting