15:00:29 <slaweq> #startmeeting neutron_ci
15:00:30 <openstack> Meeting started Tue Feb  9 15:00:29 2021 UTC and is due to finish in 60 minutes.  The chair is slaweq. Information about MeetBot at http://wiki.debian.org/MeetBot.
15:00:31 <openstack> Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
15:00:33 <slaweq> hi
15:00:35 <openstack> The meeting name has been set to 'neutron_ci'
15:00:54 <bcafarel> \o
15:01:30 <lajoskatona> o/
15:01:41 <jlibosva> o/
15:02:53 <slaweq> ok, let's start
15:03:01 <slaweq> Grafana dashboard: http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:03:41 <slaweq> First of all, I have one short announcement
15:03:56 <slaweq> recently I pushed a couple of patches related to our CI jobs:
15:04:00 <slaweq> https://review.opendev.org/q/topic:%22improve-neutron-ci%22+(status:open%20OR%20status:merged)
15:04:13 <slaweq> please check the open ones
15:04:36 <slaweq> I hope they will reduce the load we put on infra a bit and improve our CI failure rate
15:05:24 <lajoskatona> good to have a topic for it, it's easier to check all of these
15:05:32 <lajoskatona> I just realised it
15:05:43 <slaweq> :)
15:06:05 <slaweq> ok, now we can move to the regular topics
15:06:08 <slaweq> #topic Actions from previous meetings
15:06:18 <slaweq> there was a bunch assigned to ralonsoh
15:06:30 <slaweq> he is not here today, so I will reassign them to him for next week as well
15:06:40 <slaweq> just to not forget
15:06:44 <slaweq> #action ralonsoh to check problem with (not) deleted pid files in functional tests
15:06:51 <slaweq> #action alonsoh to check failing periodic task in functional test
15:06:54 <slaweq> #undo
15:06:55 <openstack> Removing item from minutes: #action alonsoh to check failing periodic task in functional test
15:06:57 <slaweq> #action ralonsoh to check failing periodic task in functional test
15:07:09 <slaweq> ok, and now others :)
15:07:13 <slaweq> slaweq to update grafana dashboard
15:07:19 <slaweq> Patch https://review.opendev.org/c/openstack/project-config/+/772643
15:07:33 <slaweq> it's now merged
15:07:41 <slaweq> next one
15:07:43 <slaweq> slaweq to report functional tests timeout in LP
15:07:50 <slaweq> Bug reported: https://bugs.launchpad.net/neutron/+bug/1913401
15:07:51 <openstack> Launchpad bug 1913401 in neutron "Timeout during creation of interface in functional tests" [High,Confirmed]
15:08:09 <slaweq> for now it's not assigned to anyone
15:08:46 <slaweq> next one
15:08:48 <slaweq> lajoskatona to check with rubasov test_metadata_routed failure
15:10:48 <slaweq> lajoskatona: any updates about it?
15:12:00 <lajoskatona> good question, I have to check it
15:12:07 <slaweq> :)
15:12:10 <slaweq> ok
15:12:18 <slaweq> let's move on for now
15:12:21 <lajoskatona> https://review.opendev.org/c/openstack/neutron-tempest-plugin/+/772650
15:12:33 <lajoskatona> I added console logs, but forgot about it, to tell the truth
15:12:48 <slaweq> ahh, I remember that patch now ;)
15:13:49 <slaweq> I will try to ping people to review it
15:14:15 <lajoskatona> I can ask around as well, thanks
15:14:22 <slaweq> thx a lot
15:14:41 <lajoskatona> I just tend to forget these things :-(
15:14:59 <slaweq> sure, I totally understand
15:15:31 <slaweq> ok, next one
15:15:34 <slaweq> slaweq to check why console output wasn't checked in the failed tests from tempest.api.compute.servers.test_create_server.ServersTestJSON
15:15:42 <slaweq> I was checking that log but I don't know why it happened; IMO all the needed code is there in tempest, so the console output should have been checked.
15:15:44 <slaweq> If that happens more often I will get back to it...
15:17:19 <slaweq> ok, let's move on to the next topic now
15:17:23 <slaweq> #topic Stadium projects
15:17:28 <slaweq> any updates/problems?
15:18:19 <lajoskatona> I still have the same odl and bgpvpn patches: https://review.opendev.org/c/openstack/networking-bgpvpn/+/771219 & https://review.opendev.org/c/openstack/networking-odl/+/769877
15:18:57 <lajoskatona> it's my fault, I missed deleting the l-c job from the gate queue
15:19:08 <bcafarel> and a few stable ones to go along in https://review.opendev.org/q/topic:%2522oslo_lc_drop%2522+status:open+label:Verified%252B1 (networking-*)
15:19:23 <bcafarel> let's wrap up showing l-c the way out :)
15:20:02 <slaweq> lajoskatona: I just commented https://review.opendev.org/c/openstack/networking-bgpvpn/+/771219
15:20:14 <slaweq> please remove it completely from the gate queue
15:20:22 <slaweq> we don't have non-voting jobs there
15:20:45 <lajoskatona> slaweq: ok, I will change that
15:22:05 <slaweq> thx
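(For context: removing the lower-constraints job from a project's gate queue is a small .zuul.yaml change. A minimal sketch, assuming the standard openstack-tox-lower-constraints job name; the project layout below is illustrative, not taken from the patches above:

    - project:
        check:
          jobs:
            - openstack-tox-lower-constraints   # may also be dropped as part of the l-c removal
        gate:
          jobs:
            # the fix: the l-c job is simply deleted from this list,
            # since the gate queue only carries voting jobs
            - openstack-tox-py38

openstack-tox-py38 stands in for whatever voting jobs the project keeps in gate.)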
15:22:35 <slaweq> bcafarel: ok, I will review them ASAP
15:23:54 <bcafarel> slaweq: thanks, may need another stable core reviewer for some, as I remember you already +2'ed some of them
15:24:23 <slaweq> we can ping ralonsoh tomorrow :)
15:24:50 <bcafarel> indeed :)
15:25:23 <slaweq> ok, so let's move on
15:25:27 <slaweq> #topic Stable branches
15:25:31 <slaweq> Victoria dashboard: https://grafana.opendev.org/d/HUCHup2Gz/neutron-failure-rate-previous-stable-release?orgId=1
15:25:33 <slaweq> Ussuri dashboard: https://grafana.opendev.org/d/smqHXphMk/neutron-failure-rate-older-stable-release?orgId=1
15:25:51 <bcafarel> from the recent backports, things looked quite good in the stable branches
15:26:05 <slaweq> yes, that's also my impression :)
15:26:06 <bcafarel> I saw a series up to queens with Zuul +1 on the first try for all backports, that was nice to see!
15:26:12 <slaweq> :)
15:27:16 <bcafarel> I had a question, not directly CI-related, about the https://review.opendev.org/c/openstack/neutron/+/772098 backport
15:27:52 <bcafarel> as it bumps the requirement on ovsdbapp, though the new version is still in the victoria constraints (I just saw you had a similar question on the ussuri one)
15:28:58 <slaweq> hmm
15:29:14 <slaweq> this patch bumps requirements but not lower-constraints
15:29:28 <slaweq> so I wonder how this will work e.g. with ovsdbapp 1.3.0
15:29:38 <bcafarel> well we dropped l-c in stable :)
15:29:48 <slaweq> which is the lowest possible version
15:29:48 <slaweq> ah, yes
15:30:14 <bcafarel> and yes, I had a similar question; if otherwiseguy says it is safe with older versions (just with the race condition still around), I guess it would be good to go
15:30:21 <slaweq> but didn't we do that because we are not changing requirements in stable?
15:30:32 * otherwiseguy catches up
15:32:44 * otherwiseguy is checking ovsdbapp code
15:33:03 <slaweq> I think that this is also question to the stable-maint team
15:33:43 <otherwiseguy> I think it should be fine, ovsdbapp.backend.ovs_idl.event.RowEvent has been there since 2017.
15:33:53 <slaweq> IIRC some time ago ralonsoh wanted to bump pyroute2 version in stable branches to fix some memory leak and it wasn't allowed
15:34:25 <slaweq> otherwiseguy: so would it be possible to backport that patch without the requirements change?
15:34:27 <otherwiseguy> oh, wait.
15:34:34 <otherwiseguy> The RowEventHandler was new though.
15:34:46 <otherwiseguy> So yeah, it won't work with older versions.
15:35:01 <otherwiseguy> Yeah, it's not that hard to do w/o the requirement.
15:35:07 <bcafarel> ah that was pyroute2 I was trying to remember
15:35:15 <bcafarel> oh, a fix without a bump would be nice
15:35:33 <otherwiseguy> In fact a newer related patch does what this would have needed to do anyway.
15:35:33 <slaweq> a fix without bumping requirements would be ok; as it is now, I don't think we can accept it
15:35:44 <otherwiseguy> I'll get it squared away today.
15:35:49 <slaweq> thx
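(For context, the compatibility point above in code form: RowEvent has been in ovsdbapp since 2017, while the RowEventHandler helper is newer, so a backport that must run against old ovsdbapp releases can only rely on the former. A minimal sketch; the event class, table, and handler body are illustrative, not the actual neutron patch:

    from ovsdbapp.backend.ovs_idl import event as row_event

    class InterfaceCreateEvent(row_event.RowEvent):
        """Fires when a row is added to the Interface table."""

        def __init__(self):
            # ROW_CREATE and this constructor signature are part of the
            # long-standing RowEvent API, so old ovsdbapp releases work
            super(InterfaceCreateEvent, self).__init__(
                (self.ROW_CREATE,), 'Interface', None)

        def run(self, event, row, old):
            # react to the new row here
            pass

The event instance is then registered with whatever notify handler the agent already uses, avoiding the newer RowEventHandler entirely.)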
15:36:11 <slaweq> can we move on now?
15:36:15 <otherwiseguy> ++
15:36:57 <bcafarel> sure, thanks :)
15:37:00 <slaweq> ok, next topic
15:37:03 <slaweq> #topic Grafana
15:37:07 <slaweq> http://grafana.openstack.org/dashboard/db/neutron-failure-rate
15:37:22 <slaweq> looking at the graphs, I think it is still in pretty good shape
15:37:47 <slaweq> there are some failures which we should probably check, but nothing really critical IMHO
15:38:49 <slaweq> I think there is nothing else to discuss about grafana today
15:39:03 <slaweq> and we can move on to the specific jobs' failures
15:39:07 <slaweq> are you ok with that?
15:39:47 <bcafarel> +1, those graphs looked OK on a fast scroll
15:39:54 <slaweq> #topic fullstack/functional
15:40:04 <slaweq> First functional tests
15:40:07 <slaweq> https://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_24e/774245/1/check/neutron-functional-with-uwsgi/24e5ec7/testr_results.html
15:41:04 <slaweq> there seems to be a problem with bridge creation
15:41:11 <slaweq> anyone want to check that one?
15:42:25 <slaweq> strange that the log file is empty: http://storage.bhs.cloud.ovh.net/v1/AUTH_dcaab5e32b234d56b626f72581e3644c/zuul_opendev_logs_24e/774245/1/check/neutron-functional-with-uwsgi/24e5ec7/controller/logs/dsvm-functional-logs/neutron.tests.functional.agent.common.test_ovs_lib.BaseOVSTestCase.test_get_egress_min_bw_for_port.txt
15:43:27 <slaweq> I saw it only once for now
15:43:35 <slaweq> so maybe it was just some general node slowdown
15:43:47 <slaweq> let's keep an eye on it and see if it repeats
15:44:06 <bcafarel> or maybe some system issue there
15:44:12 <slaweq> maybe
15:44:16 <slaweq> :)
15:44:19 <slaweq> ok, next one:
15:44:21 <slaweq> https://868396e386f8d4e621fb-ed8719f87969200ae690156046a9dd5f.ssl.cf1.rackcdn.com/774621/1/check/neutron-functional-with-uwsgi/e6c177b/testr_results.html
15:45:25 <slaweq> strange
15:45:31 <slaweq> this time the log is empty again
15:46:06 <slaweq> does anyone want to check why those functional test logs are empty?
15:47:22 <bcafarel> I can dig a bit (at both actually)
15:47:30 <slaweq> thx
15:47:46 <slaweq> #action bcafarel to check why functional tests logs files are empty
15:48:32 <slaweq> ok, now fullstack
15:48:57 <slaweq> at least twice this week I saw errors with IP allocation in a subnet, like:
15:48:59 <slaweq> https://d0a5505ffe75322ff4b4-851002f2a2f4fd257d2d88d1bd1bf1ab.ssl.cf1.rackcdn.com/765299/4/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/cb7f73c/testr_results.html
15:49:02 <slaweq> and
15:49:10 <slaweq> https://e2787c106aae2566c67a-4ba9660b2d50a57315db5ad5fefc4cbb.ssl.cf1.rackcdn.com/761420/5/check/neutron-fullstack-with-uwsgi/53ca42e/testr_results.html
15:49:29 <slaweq> probably it's just a slowdown of the node and some timeout being hit
15:49:33 <slaweq> but we should check that
15:49:37 <slaweq> any volunteer?
15:50:56 <slaweq> I can check that
15:51:14 <slaweq> #action slaweq to check ip allocation failures in fullstack tests
15:51:31 <slaweq> ok, and one more issue
15:51:37 <slaweq> this time with scenario job
15:51:39 <slaweq> #topic Tempest/Scenario
15:51:43 <slaweq> Unable to finish operation on subnet...
15:51:45 <slaweq> https://d0a5505ffe75322ff4b4-851002f2a2f4fd257d2d88d1bd1bf1ab.ssl.cf1.rackcdn.com/765299/4/check/neutron-tempest-plugin-scenario-openvswitch-iptables_hybrid/cb7f73c/testr_results.html
15:51:52 <slaweq> I saw similar issues a couple of times recently
15:52:05 <slaweq> it's some issue during the cleanup
15:52:34 <slaweq> anyone want to check why there was such an error there?
15:53:37 <lajoskatona> I can check
15:53:42 <slaweq> thx lajoskatona
15:53:57 <slaweq> #action lajoskatona to check subnet in use errors in scenario jobs
15:54:09 <slaweq> and that's all I have for today
15:54:16 <slaweq> do you have anything else you want to discuss?
15:54:40 <bcafarel> nothing from me
15:54:40 <slaweq> if not, I will give you back a few minutes today
15:55:35 <slaweq> ok, so thx for attending
15:55:54 <slaweq> have a great week and only +1 and +2 from Zuul ;)
15:55:58 <slaweq> #endmeeting