15:00:29 <slaweq> #startmeeting neutron_ci
15:00:30 <openstack> Meeting started Tue Feb 9 15:00:29 2021 UTC and is due to finish in 60 minutes. The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:33 <slaweq> hi
15:00:35 <openstack> The meeting name has been set to 'neutron_ci'
15:00:54 <bcafarel> \o
15:01:30 <lajoskatona> o/
15:01:41 <jlibosva> o/
15:02:53 <slaweq> ok, let's start
15:03:01 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:03:41 <slaweq> First of all, I have one short announcement
15:03:56 <slaweq> recently I pushed a couple of patches related to our CI jobs:
15:04:00 <slaweq> https://review.opendev.org/q/topic:%22improve-neutron-ci%22+(status:open%20OR%20status:merged)
15:04:13 <slaweq> please check the open ones
15:04:36 <slaweq> I hope it will reduce a bit the load we put on infra and improve our CI failure rate a bit
15:05:24 <lajoskatona> good to have a topic for it, it's easier to check all of these
15:05:32 <lajoskatona> I just realised it
15:05:43 <slaweq> :)
15:06:05 <slaweq> ok, now we can move to the regular topics
15:06:08 <slaweq> #topic Actions from previous meetings
15:06:18 <slaweq> there was a bunch assigned to ralonsoh
15:06:30 <slaweq> he is not here today, so I will reassign them to him for next week also
15:06:40 <slaweq> just to not forget
15:06:44 <slaweq> #action ralonsoh to check problem with (not) deleted pid files in functional tests
15:06:51 <slaweq> #action alonsoh to check failing periodic task in functional test
15:06:54 <slaweq> #undo
15:06:55 <openstack> Removing item from minutes: #action alonsoh to check failing periodic task in functional test
15:06:57 <slaweq> #action ralonsoh to check failing periodic task in functional test
15:07:09 <slaweq> ok, and now others :)
15:07:13 <slaweq> slaweq to update grafana dashboard
15:07:19 <slaweq> Patch https://review.opendev.org/c/openstack/project-config/+/772643
15:07:33 <slaweq> it's now merged
15:07:41 <slaweq> next one
15:07:43 <slaweq> slaweq to report functional tests timeout in LP
15:07:50 <slaweq> Bug reported: https://bugs.launchpad.net/neutron/+bug/1913401
15:07:51 <openstack> Launchpad bug 1913401 in neutron "Timeout during creation of interface in functional tests" [High,Confirmed]
15:08:09 <slaweq> for now it's not assigned to anyone
15:08:46 <slaweq> next one
15:08:48 <slaweq> lajoskatona to check with rubasov test_metadata_routed failure
15:10:48 <slaweq> lajoskatona: any updates about it?
15:12:00 <lajoskatona> good question, I have to check it
15:12:07 <slaweq> :)
15:12:10 <slaweq> ok
15:12:18 <slaweq> let's move on for now
15:12:21 <lajoskatona> https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/772650
15:12:33 <lajoskatona> I added console logs, but forgot about it, to tell the truth
15:12:48 <slaweq> ahh, I remember that patch now ;)
15:13:49 <slaweq> I will try to ping people to review it
15:14:15 <lajoskatona> I can ask around as well, thanks
15:14:22 <slaweq> thx a lot
15:14:41 <lajoskatona> I just tend to forget these things :-(
15:14:59 <slaweq> sure, I totally understand
15:15:31 <slaweq> ok, next one
15:15:34 <slaweq> slaweq to check why console output wasn't checked in the failed tests from tempest.api.compute.servers.test_create_server.ServersTestJSON
15:15:42 <slaweq> I was checking that log but I don't know why it was like that, IMO all the needed code is there in tempest so console output should be checked.
15:15:44 <slaweq> If that happens more often I will get back to this...
15:17:19 <slaweq> ok, let's move on to the next topic now
15:17:23 <slaweq> #topic Stadium projects
15:17:28 <slaweq> any updates/problems?
15:18:19 <lajoskatona> I still have the same odl and bgpvpn patches: https://review.opendev.org/c/openstack/networking-bgpvpn/+/771219 & https://review.opendev.org/c/openstack/networking-odl/+/769877
15:18:57 <lajoskatona> it's my fault, I forgot to delete the l-c job from the gate queue
15:19:08 <bcafarel> and a few stable ones to go along in https://review.opendev.org/q/topic:%2522oslo_lc_drop%2522+status:open+label:Verified%252B1 (networking-*)
15:19:23 <bcafarel> let's wrap up showing the way out to l-c :)
15:20:02 <slaweq> lajoskatona: I just commented on https://review.opendev.org/c/openstack/networking-bgpvpn/+/771219
15:20:14 <slaweq> please remove it completely from the gate queue
15:20:22 <slaweq> we don't have non-voting jobs there
15:20:45 <lajoskatona> slaweq: ok, I will change that
15:22:05 <slaweq> thx
15:22:35 <slaweq> bcafarel: ok, I will review them ASAP
15:23:54 <bcafarel> slaweq: thanks, we may need another stable core reviewer for some as I remember you already +2'ed some of them
15:24:23 <slaweq> we can ping ralonsoh tomorrow :)
15:24:50 <bcafarel> indeed :)
15:25:23 <slaweq> ok, so let's move on
15:25:27 <slaweq> #topic Stable branches
15:25:31 <slaweq> Victoria dashboard: https://grafana.opendev.org/d/HUCHup2Gz/neutron-failure-rate-previous-stable-release?orgId=1
15:25:33 <slaweq> Ussuri dashboard: https://grafana.opendev.org/d/smqHXphMk/neutron-failure-rate-older-stable-release?orgId=1
15:25:51 <bcafarel> from recent backports, it looked quite good in stable branches
15:26:05 <slaweq> yes, that's also my impression :)
15:26:06 <bcafarel> I saw a series up to queens with Zuul +1 on first try for all backports, that was nice to see!
15:26:12 <slaweq> :)
15:27:16 <bcafarel> I had a question (not directly CI related) in the https://review.opendev.org/c/openstack/neutron/+/772098 backport
15:27:52 <bcafarel> as it bumps the requirement on ovsdbapp, though the new version is still in victoria constraints (just saw you had a similar question in the ussuri one)
15:28:58 <slaweq> hmm
15:29:14 <slaweq> this patch bumps requirements but not lower-constraints
15:29:28 <slaweq> so I wonder how this will work e.g. with ovsdbapp 1.3.0
15:29:38 <bcafarel> well we dropped l-c in stable :)
15:29:39 <slaweq> which is the lowest possible version
15:29:48 <slaweq> ah, yes
15:30:14 <bcafarel> and yes I had a similar question; I guess if otherwiseguy says it is safe with older versions (just having the race condition still around) it would be good to go
15:30:21 <slaweq> but didn't we do that because we are not changing requirements in stable?
15:30:32 * otherwiseguy catches up
15:32:44 * otherwiseguy is checking ovsdbapp code
15:33:03 <slaweq> I think that this is also a question for the stable-maint team
15:33:43 <otherwiseguy> I think it should be fine, the ovsdbapp.backend.ovs_idl.event.RowEvent has been there since 2017.
15:33:53 <slaweq> IIRC some time ago ralonsoh wanted to bump the pyroute2 version in stable branches to fix some memory leak and it wasn't allowed
15:34:25 <slaweq> otherwiseguy: so would it be possible to backport that patch without the change in requirements?
15:34:27 <otherwiseguy> oh, wait.
15:34:34 <otherwiseguy> The RowEventHandler was new though.
15:34:46 <otherwiseguy> So yeah, it won't work with older.
15:35:01 <otherwiseguy> Yeah, it's not that hard to do w/o the requirement.
15:35:07 <bcafarel> ah, that was pyroute2 I was trying to remember
15:35:15 <bcafarel> oh, a fix without the bump would be nice
15:35:33 <otherwiseguy> In fact a newer related patch does what this would have needed to do anyway.
15:35:33 <slaweq> a fix without bumping requirements would be ok, as it is now I don't think we can accept it
15:35:44 <otherwiseguy> I'll get it squared away today.
15:35:49 <slaweq> thx
15:36:11 <slaweq> can we move on now?
15:36:15 <otherwiseguy> ++
15:36:57 <bcafarel> sure, thanks :)
15:37:00 <slaweq> ok, next topic
15:37:03 <slaweq> #topic Grafana
15:37:07 <slaweq> http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:37:22 <slaweq> looking at the graphs, I think it is still in pretty good shape
15:37:47 <slaweq> there are some failures which we should probably check, but nothing really critical IMHO
15:38:49 <slaweq> I think there is nothing else to discuss about grafana today
15:39:03 <slaweq> and we can move on to the specific jobs' failures
15:39:07 <slaweq> are You ok with that?
15:39:47 <bcafarel> +1, those graphs looked OK in fast scroll
15:39:54 <slaweq> #topic fullstack/functional
15:40:04 <slaweq> First functional tests
15:40:07 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_24e/774245/1/check/neutron-functional-with-uwsgi/24e5ec7/testr_results.html
15:41:04 <slaweq> there seems to be a problem with bridge creation
15:41:11 <slaweq> anyone want to check that one?
15:42:25 <slaweq> strange that the log file is empty: http://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_24e/774245/1/check/neutron-functional-with-uwsgi/24e5ec7/controller/logs/dsvm-functional-logs/neutron.tests.functional.agent.common.test_ovs_lib.BaseOVSTestCase.test_get_egress_min_bw_for_port.txt
15:43:27 <slaweq> I saw it only once for now
15:43:35 <slaweq> so maybe just some general node slowdown
15:43:47 <slaweq> let's maybe keep an eye on it and see if it repeats
15:44:06 <bcafarel> or maybe some system issue there
15:44:12 <slaweq> maybe
15:44:16 <slaweq> :)
15:44:19 <slaweq> ok, next one:
15:44:21 <slaweq> https://868396e386f8d4e621fb-ed8719f87969200ae690156046a9dd5f.ssl.cf1.rackcdn.com/774621/1/check/neutron-functional-with-uwsgi/e6c177b/testr_results.html
15:45:25 <slaweq> strange
15:45:31 <slaweq> this time the log is again empty
15:46:06 <slaweq> anyone want to check why those functional test logs are empty?
15:47:22 <bcafarel> I can dig a bit (at both actually)
15:47:30 <slaweq> thx
15:47:46 <slaweq> #action bcafarel to check why functional tests logs files are empty
15:48:32 <slaweq> ok, now fullstack
15:48:57 <slaweq> I saw errors with IP allocation in a subnet at least twice this week, like:
15:48:59 <slaweq> https://d0a5505ffe75322ff4b4-851002f2a2f4fd257d2d88d1bd1bf1ab.ssl.cf1.rackcdn.com/765299/4/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/cb7f73c/testr_results.html
15:49:02 <slaweq> and
15:49:10 <slaweq> https://e2787c106aae2566c67a-4ba9660b2d50a57315db5ad5fefc4cbb.ssl.cf1.rackcdn.com/761420/5/check/neutron-fullstack-with-uwsgi/53ca42e/testr_results.html
15:49:29 <slaweq> probably it's just a slowdown of the node and some timeout being hit
15:49:33 <slaweq> but we should check that
15:49:37 <slaweq> any volunteer?
15:50:56 <slaweq> I can check that
15:51:14 <slaweq> #action slaweq to check ip allocation failures in fullstack tests
15:51:31 <slaweq> ok, and one more issue
15:51:37 <slaweq> this time with a scenario job
15:51:39 <slaweq> #topic Tempest/Scenario
15:51:43 <slaweq> Unable to finish operation on subnet...
15:51:45 <slaweq> https://d0a5505ffe75322ff4b4-851002f2a2f4fd257d2d88d1bd1bf1ab.ssl.cf1.rackcdn.com/765299/4/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/cb7f73c/testr_results.html
15:51:52 <slaweq> I saw similar issues a couple of times recently
15:52:05 <slaweq> it's some issue during the cleanup
15:52:34 <slaweq> anyone want to check why there was such an error there?
15:53:37 <lajoskatona> I can check
15:53:42 <slaweq> thx lajoskatona
15:53:57 <slaweq> #action lajoskatona to check subnet in use errors in scenario jobs
15:54:09 <slaweq> and that's all I have for today
15:54:16 <slaweq> do You have anything else You want to discuss?
15:54:40 <bcafarel> nothing from me
15:54:40 <slaweq> if not, I will give You back a few minutes today
15:55:35 <slaweq> ok, so thx for attending
15:55:54 <slaweq> have a great week and only +1 and +2 from Zuul ;)
15:55:58 <slaweq> #endmeeting